FAMOSE: A ReAct Approach to Automated Feature Discovery
2026-02-19 • Machine Learning
Machine Learning, Artificial Intelligence
AI summary
The authors present FAMOSE, a tool that automatically creates and selects features for machine learning models, especially for tabular data. Unlike traditional methods that need substantial expert input, FAMOSE uses an LLM agent built on the ReAct framework to propose, test, and refine features, learning from which earlier attempts succeeded or failed. Their tests show FAMOSE at or near the state of the art on classification tasks and at the state of the art on regression tasks. The authors suggest that the agent approach helps the system come up with better and more creative features on its own.
Feature Engineering, Tabular Data, Machine Learning, ReAct Framework, Feature Selection, Classification, Regression, Large Language Models (LLMs), ROC-AUC, RMSE
Authors
Keith Burghardt, Jienan Liu, Sadman Sakib, Yuning Hao, Bo Li
Abstract
Feature engineering remains a critical yet challenging bottleneck in machine learning, particularly for tabular data, because identifying optimal features in an exponentially large feature space traditionally demands substantial domain expertise. To address this challenge, we introduce FAMOSE (Feature AugMentation and Optimal Selection agEnt), a novel framework that leverages the ReAct paradigm to autonomously explore, generate, and refine features while integrating feature selection and evaluation tools within an agent architecture. To our knowledge, FAMOSE is the first application of an agentic ReAct framework to automated feature engineering, and in particular the first to handle both regression and classification tasks. Extensive experiments demonstrate that FAMOSE is at or near the state of the art on classification tasks (especially tasks with more than 10K instances, where ROC-AUC increases by 0.23% on average) and achieves the state of the art on regression tasks, reducing RMSE by 2.0% on average, while remaining more robust to errors than other algorithms. We hypothesize that FAMOSE performs strongly because ReAct lets the LLM context window record, through iterative feature discovery and evaluation steps, which features did or did not work. This record acts much like a few-shot prompt and guides the LLM to invent better, more innovative features. Our work offers evidence that AI agents are remarkably effective at solving problems that require highly inventive solutions, such as feature engineering.
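The propose, evaluate, and record loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The LLM is replaced by a hypothetical `propose_feature` stub that walks a fixed candidate pool, the evaluation tool is replaced by absolute Pearson correlation with the target, and every name here (`CANDIDATES`, `propose_feature`, `evaluate_feature`, `discover_features`, `render_history`) is an assumption made for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

# Hypothetical candidate pool standing in for features an LLM might propose.
CANDIDATES = {
    "a_plus_b": lambda r: r["a"] + r["b"],
    "a_times_b": lambda r: r["a"] * r["b"],
    "a_minus_b": lambda r: r["a"] - r["b"],
    "a_squared": lambda r: r["a"] ** 2,
}

def propose_feature(history):
    """Stub for the 'reason' step: return the next untried candidate, or None."""
    tried = {entry["name"] for entry in history}
    for name in CANDIDATES:
        if name not in tried:
            return name
    return None

def evaluate_feature(name, rows, target):
    """Stub for the 'act' step: score a feature by |correlation| with the target."""
    values = [CANDIDATES[name](r) for r in rows]
    return abs(pearson(values, target))

def discover_features(rows, target, baseline, max_steps=10):
    """Run the loop, keeping a feature only if it beats the best score so far."""
    history, kept, best = [], [], baseline
    for _ in range(max_steps):
        name = propose_feature(history)
        if name is None:
            break
        score = evaluate_feature(name, rows, target)
        accepted = score > best
        # The history plays the role of the context-window record.
        history.append({"name": name, "score": score, "accepted": accepted})
        if accepted:
            kept.append(name)
            best = score
    return kept, history

def render_history(history):
    """Format the record as text that could be appended to an LLM prompt."""
    return "\n".join(
        f"{e['name']}: score={e['score']:.3f} -> {'kept' if e['accepted'] else 'rejected'}"
        for e in history
    )

# Synthetic data (an assumption): the target is the product of two raw columns.
random.seed(0)
rows = [{"a": random.uniform(1, 5), "b": random.uniform(1, 5)} for _ in range(200)]
target = [r["a"] * r["b"] for r in rows]
baseline = max(abs(pearson([r[c] for r in rows], target)) for c in ("a", "b"))

kept, history = discover_features(rows, target, baseline)
print(render_history(history))
```

In a real agent the text produced by `render_history` would be fed back to the model before each new proposal, which is the sense in which the record resembles a few-shot prompt. The acceptance rule and the scoring metric are placeholders; the paper reports ROC-AUC and RMSE.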