AITopics | relevant variable

Collaborating Authors

relevant variable

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Feature selection in functional data classification with recursive maxima hunting

José L. Torrecilla, Alberto Suárez

Neural Information Processing SystemsMar-23-2026, 04:08:52 GMT

Dimensionality reduction is one of the key issues in the design of effective machine learning methods for automatic induction. In this work, we introduce recursive maxima hunting (RMH) for variable selection in classification problems with functional data. In this context, variable selection techniques are especially attractive because they reduce the dimensionality, facilitate the interpretation and can improve the accuracy of the predictive models. The method, which is a recursive extension of maxima hunting (MH), performs variable selection by identifying the maxima of a relevance function, which measures the strength of the correlation of the predictor functional variable with the class label. At each stage, the information associated with the selected variable is removed by subtracting the conditional expectation of the process. The results of an extensive empirical evaluation are used to illustrate that, in the problems investigated, RMH has comparable or higher predictive accuracy than standard dimensionality reduction techniques, such as PCA and PLS, and state-of-the-art feature selection methods for functional data, such as maxima hunting.

artificial intelligence, classification, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > Spain (0.29)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Estimation in high-dimensional linear regression: Post-Double-Autometrics as an alternative to Post-Double-Lasso

Hué, Sullivan, Laurent, Sébastien, Aiounou, Ulrich, Flachaire, Emmanuel

arXiv.org Machine LearningNov-27-2025

Post-Double-Lasso is becoming the most popular method for estimating linear regression models with many covariates when the purpose is to obtain an accurate estimate of a parameter of interest, such as an average treatment effect. However, this method can suffer from substantial omitted variable bias in finite sample. We propose a new method called Post-Double-Autometrics, which is based on Autometrics, and show that this method outperforms Post-Double-Lasso.

covariate, post-double-autometric, post-double-lasso, (14 more...)

arXiv.org Machine Learning

2511.21257

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
South America > Brazil (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government (0.68)
Education > Educational Setting > K-12 Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.68)

Add feedback

A Stable Lasso

Nouraie, Mahdi, Zhu, Houying, Muller, Samuel

arXiv.org Machine LearningNov-5-2025

The Lasso has been widely used as a method for variable selection, valued for its simplicity and empirical performance. However, Lasso's selection stability deteriorates in the presence of correlated predictors. Several approaches have been developed to mitigate this limitation. In this paper, we provide a brief review of existing approaches, highlighting their limitations. We then propose a simple technique to improve the selection stability of Lasso by integrating a weighting scheme into the Lasso penalty function, where the weights are defined as an increasing function of a correlation-adjusted ranking that reflects the predictive power of predictors. Empirical evaluations on both simulated and real-world datasets demonstrate the efficacy of the proposed method. Additional numerical results demonstrate the effectiveness of the proposed approach in stabilizing other regularization-based selection methods, indicating its potential as a general-purpose solution.

artificial intelligence, lasso, machine learning, (17 more...)

arXiv.org Machine Learning

2511.02306

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)

Add feedback

Variational Garrote for Statistical Physics-based Sparse and Robust Variable Selection

Soh, Hyungjoon, Lee, Dongha, Periwal, Vipul, Jo, Junghyo

arXiv.org Artificial IntelligenceSep-9-2025

Identifying relationships between variables is a fundamental task in science. Among various approaches, linear regression plays a central role in linking explanatory variables to dependent variables in statistical modeling [1, 2]. Linear regression is useful in physics [3, 4] for extracting equations of motion from time series data [5] and for predicting trends in dynamical systems [6], but its simplicity, interpretability, and predictive power make it a cornerstone of data analysis [7], forecasting [8], and decision-making [9] in many fields. Moreover, linear regression forms the foundation for many advanced statistical and machine learning models [10], including logistic regression [11], support vector machines [12], and generalized linear models [13]. Extensions of linear regression often aim to capture more complex relationships by introducing higher-order polynomial terms or additional nonlinear transformations. Modern developments in machine learning have enabled the training of deep and highly overparameterized models capable of modeling intricate patterns far beyond the reach of simple linear approaches. In particular, deep learning models can be interpreted as sophisticated forms of nonlinear regression [14], capable of approximating complex functions with high flexibility. Despite its utility, linear regression struggles with modern high-dimensional datasets where only a small subset of variables is truly informative.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.06383

Country: North America > United States (1.00)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback

Learning Juntas under Markov Random Fields

Chandrasekaran, Gautam, Klivans, Adam

arXiv.org Artificial IntelligenceJun-3-2025

We give an algorithm for learning $O(\log n)$ juntas in polynomial-time with respect to Markov Random Fields (MRFs) in a smoothed analysis framework where only the external field has been randomly perturbed. This is a broad generalization of the work of Kalai and Teng, who gave an algorithm that succeeded with respect to smoothed product distributions (i.e., MRFs whose dependency graph has no edges). Our algorithm has two phases: (1) an unsupervised structure learning phase and (2) a greedy supervised learning algorithm. This is the first example where algorithms for learning the structure of an undirected graphical model lead to provably efficient algorithms for supervised learning.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2506.00764

Country:

Europe (0.46)
North America > Canada (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Add feedback

A Unified Framework for Variable Selection in Model-Based Clustering with Missing Not at Random

Ho, Binh H., Chi, Long Nguyen, Nguyen, TrungTin, Nguyen, Binh T., Hoang, Van Ha, Drovandi, Christopher

arXiv.org Machine LearningMay-27-2025

Model-based clustering integrated with variable selection is a powerful tool for uncovering latent structures within complex data. However, its effectiveness is often hindered by challenges such as identifying relevant variables that define heterogeneous subgroups and handling data that are missing not at random, a prevalent issue in fields like transcriptomics. While several notable methods have been proposed to address these problems, they typically tackle each issue in isolation, thereby limiting their flexibility and adaptability. This paper introduces a unified framework designed to address these challenges simultaneously. Our approach incorporates a data-driven penalty matrix into penalized clustering to enable more flexible variable selection, along with a mechanism that explicitly models the relationship between missingness and latent class membership. We demonstrate that, under certain regularity conditions, the proposed framework achieves both asymptotic consistency and selection consistency, even in the presence of missing data. This unified strategy significantly enhances the capability and efficiency of model-based clustering, advancing methodologies for identifying informative variables that define homogeneous subgroups in the presence of complex missing data patterns. The performance of the framework, including its computational efficiency, is evaluated through simulations and demonstrated using both synthetic and real-world transcriptomic datasets.

artificial intelligence, machine learning, selection, (18 more...)

arXiv.org Machine Learning

2505.19093

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
Asia > Vietnam > Hồ Chí Minh City > Hồ Chí Minh City (0.04)
Oceania > Australia > Queensland (0.04)
Europe > France > Auvergne-Rhône-Alpes > Lyon > Lyon (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Speeding up approximate MAP by applying domain knowledge about relevant variables

Kwisthout, Johan, Schroeder, Andrew

arXiv.org Artificial IntelligenceDec-12-2024

The MAP problem in Bayesian networks is notoriously intractable, even when approximated. In an earlier paper we introduced the Most Frugal Explanation heuristic approach to solving MAP, by partitioning the set of intermediate variables (neither observed nor part of the MAP variables) into a set of relevant variables, which are marginalized out, and irrelevant variables, which will be assigned a sampled value from their domain. In this study we explore whether knowledge about which variables are relevant for a particular query (i.e., domain knowledge) speeds up computation sufficiently to beat both exact MAP as well as approximate MAP while giving reasonably accurate results. Our results are inconclusive, but also show that this probably depends on the specifics of the MAP query, most prominently the number of MAP variables.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2412.09264

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > Scotland > City of Edinburgh > Edinburgh (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)

Add feedback

Approximating the Number of Relevant Variables in a Parity Implies Proper Learning

Bshouty, Nader H., Haddad, George

arXiv.org Artificial IntelligenceJul-16-2024

Consider the model where we can access a parity function through random uniform labeled examples in the presence of random classification noise. In this paper, we show that approximating the number of relevant variables in the parity function is as hard as properly learning parities. More specifically, let $\gamma:{\mathbb R}^+\to {\mathbb R}^+$, where $\gamma(x) \ge x$, be any strictly increasing function. In our first result, we show that from any polynomial-time algorithm that returns a $\gamma$-approximation, $D$ (i.e., $\gamma^{-1}(d(f)) \leq D \leq \gamma(d(f))$), of the number of relevant variables~$d(f)$ for any parity $f$, we can, in polynomial time, construct a solution to the long-standing open problem of polynomial-time learning $k(n)$-sparse parities (parities with $k(n)\le n$ relevant variables), where $k(n) = \omega_n(1)$. In our second result, we show that from any $T(n)$-time algorithm that, for any parity $f$, returns a $\gamma$-approximation of the number of relevant variables $d(f)$ of $f$, we can, in polynomial time, construct a $poly(\Gamma(n))T(\Gamma(n)^2)$-time algorithm that properly learns parities, where $\Gamma(x)=\gamma(\gamma(x))$. If $T(\Gamma(n)^2)=\exp({o(n/\log n)})$, this would resolve another long-standing open problem of properly learning parities in the presence of random classification noise in time $\exp({o(n/\log n)})$.

algorithm, probability, random classification noise, (14 more...)

arXiv.org Artificial Intelligence

2407.11832

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > China > Heilongjiang Province > Daqing (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
(12 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Learning Causal Abstractions of Linear Structural Causal Models

Massidda, Riccardo, Magliacane, Sara, Bacciu, Davide

arXiv.org Artificial IntelligenceJun-1-2024

The need for modelling causal knowledge at different levels of granularity arises in several settings. Causal Abstraction provides a framework for formalizing this problem by relating two Structural Causal Models at different levels of detail. Despite increasing interest in applying causal abstraction, e.g. in the interpretability of large machine learning models, the graphical and parametrical conditions under which a causal model can abstract another are not known. Furthermore, learning causal abstractions from data is still an open problem. In this work, we tackle both issues for linear causal models with linear abstraction functions. First, we characterize how the low-level coefficients and the abstraction function determine the high-level coefficients and how the high-level model constrains the causal ordering of low-level variables. Then, we apply our theoretical results to learn high-level and low-level causal models and their abstraction function from observational data. In particular, we introduce Abs-LiNGAM, a method that leverages the constraints induced by the learned high-level model and the abstraction function to speedup the recovery of the larger low-level model, under the assumption of non-Gaussian noise terms. In simulated settings, we show the effectiveness of learning causal abstractions from data and the potential of our method in improving scalability of causal discovery.

abstract model, abstraction function, bootstrap, (14 more...)

arXiv.org Artificial Intelligence

2406.00394

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

e3796ae838835da0b6f6ea37bcf8bcb7-Reviews.html

Neural Information Processing SystemsMar-13-2024, 22:19:39 GMT

Variable importance measures are often used to highlight key predictor variables in tree-based ensemble models. However, there has generally been a lack of a theoretical understanding of these measures. In this paper, the authors study the theoretical properties of the Mean Decrease Impurity (MDI) importance measures (such as the Gini importance). Through most of the paper they use the Shanon entropy as the impurity measure but show that the theorems and results are applicable to any impurity measure. They begin with an asymptotic analysis of totally randomized fully-developed tree ensembles learned using an infinitely large ensemble.

ensemble, importance measure, importance score, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback