FeatureCuts: Feature Selection for Large Data by Optimizing the Cutoff
Hu, Andy, Prasad, Devika, Pizzato, Luiz, Foord, Nicholas, Abrahamyan, Arman, Leontjeva, Anna, Doyle, Cooper, Jermyn, Dan
arXiv.org Artificial Intelligence
In machine learning, feature selection is the process of finding a reduced subset of features that captures most of the information required to train an accurate and efficient model. This work presents FeatureCuts, a novel feature selection algorithm that adaptively selects the optimal feature cutoff after performing filter ranking. Evaluated on 14 publicly available datasets and one industry dataset, FeatureCuts achieved, on average, 15 percentage points more feature reduction and up to 99.6% less computation time than existing state-of-the-art methods while maintaining model performance. When the selected features are passed to a wrapper method such as Particle Swarm Optimization (PSO), the combination achieves 25 percentage points more feature reduction and requires 66% less computation time than PSO alone, again while maintaining model performance. The minimal overhead of FeatureCuts makes it scalable to the large datasets typically seen in enterprise applications. Traditional machine learning methods work best when their prediction signals come from data with a small but highly informative set of features.
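The abstract does not say how FeatureCuts optimizes the cutoff, so the following is only a minimal sketch of the general "filter ranking, then pick a cutoff" idea and not the authors' algorithm. It assumes absolute Pearson correlation as the filter score, an ordinary least-squares model scored by hold-out R^2, and a simple rule that keeps the smallest candidate cutoff whose score is within a tolerance of the best one. The names `filter_rank`, `holdout_r2`, `choose_cutoff`, and `tolerance` are hypothetical and chosen for illustration.

```python
import numpy as np


def filter_rank(X, y):
    """Rank features by absolute Pearson correlation with y (assumed filter score)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    scores = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    return np.argsort(-scores), scores  # feature indices, best first


def holdout_r2(X_train, y_train, X_test, y_test):
    """Fit least squares with an intercept and return R^2 on the held-out split."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    pred = np.column_stack([np.ones(len(X_test)), X_test]) @ coef
    ss_res = ((y_test - pred) ** 2).sum()
    ss_tot = ((y_test - y_test.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


def choose_cutoff(X, y, candidates, tolerance=0.01, seed=0):
    """Assumed rule: smallest cutoff k whose hold-out R^2 is within `tolerance` of the best."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    split = int(0.7 * len(y))
    train, test = idx[:split], idx[split:]

    # Rank on the training split only, so the test split stays unseen.
    order, _ = filter_rank(X[train], y[train])

    results = {}
    for k in candidates:
        cols = order[:k]
        results[k] = holdout_r2(X[train][:, cols], y[train],
                                X[test][:, cols], y[test])

    best = max(results.values())
    k_star = min(k for k, score in results.items() if score >= best - tolerance)
    return k_star, order[:k_star], results


# Synthetic data: 50 features, of which only the first three carry signal.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 50))
y = 3 * X[:, 0] - 2 * X[:, 1] + 1.5 * X[:, 2] + 0.1 * rng.normal(size=500)

k, selected, results = choose_cutoff(X, y, candidates=[1, 2, 3, 5, 10, 25, 50])
print("chosen cutoff:", k, "selected features:", sorted(selected.tolist()))
```

The grid of candidate cutoffs is the expensive part of this naive version, since each candidate requires a model fit. Reducing the number of such fits is the kind of cost an adaptive cutoff search would aim to cut.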
Aug-5-2025
- Country:
  - Asia > Singapore (0.04)
  - Europe
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
    - Netherlands > North Holland
      - Amsterdam (0.04)
  - North America > United States
    - Massachusetts > Suffolk County > Boston (0.04)
  - Oceania > Australia
    - New South Wales > Sydney (0.05)
    - South Australia > Adelaide (0.04)
    - Western Australia > Perth (0.04)
- Genre:
  - Research Report (1.00)
- Technology: