Learned Feature Importance Scores for Automated Feature Engineering

Bibliographic Details
Title: Learned Feature Importance Scores for Automated Feature Engineering
Authors: Dong, Yihe; Arik, Sercan; Yoder, Nathanael; Pfister, Tomas
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Machine Learning
Description: Feature engineering has demonstrated substantial utility for many machine learning workflows, such as in the small data regime or when distribution shifts are severe. Thus, automating this capability can relieve much manual effort and improve model performance. To this end, we propose AutoMAN, or Automated Mask-based Feature Engineering, an automated feature engineering framework that achieves high accuracy and low latency, and that can be extended to heterogeneous and time-varying data. AutoMAN is based on effectively exploring the space of candidate transforms without explicitly manifesting transformed features. This is achieved by learning feature importance masks, which can be extended to support other modalities such as time series. AutoMAN learns feature transform importance end-to-end, incorporating a dataset's task target directly into feature engineering, resulting in state-of-the-art performance with significantly lower latency compared to alternatives.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2406.04153
Accession Number: edsarx.2406.04153
Database: arXiv
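
The description above mentions learning importance masks over candidate feature transforms end-to-end against the task target. The toy sketch below illustrates that general idea only; it is not the AutoMAN implementation, and the record gives no details of it. The transform set, the softmax parameterization of the mask, the function name `learn_mask`, and the squared-error loss are all assumptions made for illustration. Unlike the approach described, this sketch materializes every transformed feature explicitly to keep the example short.

```python
import math

# Hypothetical candidate transforms for one scalar feature (illustrative only).
TRANSFORMS = {"identity": lambda x: x, "square": lambda x: x * x, "tanh": math.tanh}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def learn_mask(xs, ys, steps=500, lr=0.1):
    """Learn softmax importance weights over TRANSFORMS by gradient descent.

    The prediction is the mask-weighted sum of the transformed feature, and
    the loss is the mean squared error against the task target ys.
    """
    names = list(TRANSFORMS)
    # Transformed values per sample (materialized here for simplicity).
    feats = [[TRANSFORMS[n](x) for n in names] for x in xs]
    logits = [0.0] * len(names)
    for _ in range(steps):
        mask = softmax(logits)
        # Gradient of the mean squared error with respect to each mask entry.
        grad_mask = [0.0] * len(names)
        for row, y in zip(feats, ys):
            residual = sum(m * f for m, f in zip(mask, row)) - y
            for k, f in enumerate(row):
                grad_mask[k] += 2.0 * residual * f / len(xs)
        # Chain rule through the softmax: dL/dz_j = m_j * (g_j - sum_k m_k g_k).
        avg = sum(m * g for m, g in zip(mask, grad_mask))
        logits = [z - lr * m * (g - avg) for z, m, g in zip(logits, mask, grad_mask)]
    return dict(zip(names, softmax(logits)))


# Toy data: the target depends on the feature only through its square.
xs = [i * 0.5 for i in range(-4, 5)]
ys = [x * x for x in xs]
mask = learn_mask(xs, ys)
```

On this toy data the learned mask should place most of its weight on the `square` transform, since that is the only candidate that explains the target. A softmax mask is just one convenient choice here; other parameterizations (for example, sigmoid gates per transform) would serve the same illustrative purpose.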