FeatureCuts: Feature Selection for Large Data by Optimizing the Cutoff

Hu, Andy, Prasad, Devika, Pizzato, Luiz, Foord, Nicholas, Abrahamyan, Arman, Leontjeva, Anna, Doyle, Cooper, Jermyn, Dan

arXiv.org Artificial Intelligence

In machine learning, feature selection seeks a reduced subset of features that captures most of the information required to train an accurate and efficient model. This work presents FeatureCuts, a novel feature selection algorithm that adaptively selects the optimal feature cutoff after performing filter ranking. Evaluated on 14 publicly available datasets and one industry dataset, FeatureCuts achieved, on average, 15 percentage points more feature reduction and up to 99.6% less computation time than existing state-of-the-art methods while maintaining model performance. When the selected features are fed into a wrapper method such as Particle Swarm Optimization (PSO), FeatureCuts enables 25 percentage points more feature reduction and requires 66% less computation time than PSO alone, again while maintaining model performance. The minimal overhead of FeatureCuts makes it scalable to the large datasets typically seen in enterprise applications. Traditional machine learning methods work best when their prediction signals come from data with a small but highly informative set of features.
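The rank-then-cutoff idea can be sketched as follows. This is a hypothetical illustration only: the abstract does not specify FeatureCuts' actual cutoff rule, so the sketch assumes a mutual-information filter ranking and places the cutoff at the largest drop ("elbow") in the sorted scores.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Hypothetical sketch: rank features with a filter score, then choose a
# cutoff adaptively at the largest gap in the sorted scores ("elbow").
# This is NOT the actual FeatureCuts rule, which the abstract does not give.
X, y = make_classification(n_samples=500, n_features=30,
                           n_informative=5, random_state=0)
scores = mutual_info_classif(X, y, random_state=0)

order = np.argsort(scores)[::-1]           # features ranked best-first
sorted_scores = scores[order]
gaps = sorted_scores[:-1] - sorted_scores[1:]
cutoff = int(np.argmax(gaps)) + 1          # keep features before largest drop
selected = order[:cutoff]
print(f"kept {cutoff} of {X.shape[1]} features:", sorted(selected.tolist()))
```

Because the filter scores are computed once and the cutoff is chosen from the sorted score profile, the overhead beyond the ranking itself is negligible, which matches the scalability claim in the abstract.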


Nonlinear Feature Aggregation: Two Algorithms driven by Theory

Bonetti, Paolo, Metelli, Alberto Maria, Restelli, Marcello

arXiv.org Artificial Intelligence

Many real-world machine learning applications involve a huge number of features, leading to computational and memory issues as well as the risk of overfitting. Ideally, only relevant and non-redundant features should be retained, preserving the information in the original data while limiting dimensionality. Dimensionality reduction and feature selection are common preprocessing techniques for dealing efficiently with high-dimensional data. Dimensionality reduction methods control the number of features in the dataset while preserving its structure and minimizing information loss. Feature selection aims to identify the most relevant features for a task, discarding the less informative ones. Previous works have proposed approaches that aggregate features depending on their correlation without discarding any of them, preserving interpretability through aggregation with the mean. A limitation of correlation-based methods is the assumption of linearity in the relationship between features and target. In this paper, we relax this assumption in two ways. First, we propose a bias-variance analysis for general models with additive Gaussian noise, leading to a dimensionality reduction algorithm (NonLinCFA) which aggregates non-linear transformations of features with a generic aggregation function. Then, we extend the approach assuming that a generalized linear model regulates the relationship between features and target. A deviance analysis leads to a second dimensionality reduction algorithm (GenLinCFA), applicable to a larger class of regression problems and to classification settings. Finally, we test the algorithms on synthetic and real-world datasets, performing regression and classification tasks and showing competitive performance.