AITopics | best subset selection

Continuous Optimization for Offline Change Point Detection and Estimation

Reimann, Hans, Moka, Sarat, Sofronov, Georgy

arXiv.org Machine Learning, Jul-2-2024

Change point detection and estimation form an incredibly diverse and widely scattered field in applied and mathematical statistics, with a large variety of applications. To provide a high-level intuition, change point detection may be understood as a signal processing tool for identifying abrupt changes in the generative parameters of a data sequence. While a strong line of work in change point detection is well established, starting with Page's pioneering work (see Page [1954]) and rigorous results by Chernoff and Zacks [1964], Lorden [1971], and Sen and Srivastava [1975], many aspects of this problem remain open, and what counts as a good solution depends strongly on the problem at hand (see Niu et al. [2016], Truong et al. [2020], and Ma et al. [2020]). Among the open research questions, the simultaneous detection of multiple change points in large data sets is of major interest. Taking an approach motivated by machine learning and data science, in this paper we explore the applicability of recent advances in best subset selection of covariates in linear regression proposed by Moka et al. [2024]. This method, a continuous optimization approach for best subset selection, is claimed to run faster than existing exhaustive search methods while maintaining comparable accuracy.
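To make the problem statement above concrete, here is a minimal sketch of the simplest offline case: one abrupt shift in the mean, located by exhaustive search over every split position. It is not the continuous optimization method of Moka et al. [2024]. It is only a brute-force baseline of the kind such methods aim to outperform. The function name `best_single_change_point` and the toy `signal` are assumptions made for illustration.

```python
def best_single_change_point(xs):
    """Return (k, cost) for the split xs[:k] / xs[k:] with the lowest
    total within-segment squared error (a single mean-shift model)."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")

    def sse(segment):
        # Sum of squared deviations from the segment mean.
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    best_k, best_cost = None, float("inf")
    for k in range(1, n):  # both segments must be non-empty
        cost = sse(xs[:k]) + sse(xs[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k, best_cost


# Hypothetical toy data: the mean jumps from about 0 to about 5 at index 4.
signal = [0.1, -0.2, 0.0, 0.2, 5.1, 4.9, 5.0, 5.2]
k, cost = best_single_change_point(signal)
```

This search recomputes each segment error from scratch, so it costs O(n^2) for a single change point, and enumerating several change points at once grows combinatorially. That growth is what motivates faster alternatives to exhaustive search.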

change point, change point detection, detection, (9 more...)

2407.03383

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Cost-Sensitive Best Subset Selection for Logistic Regression: A Mixed-Integer Conic Optimization Perspective

Knauer, Ricardo, Rodner, Erik

arXiv.org Artificial Intelligence, Oct-9-2023

A key challenge in machine learning is to design interpretable models that can reduce their inputs to the best subset for making transparent predictions, especially in the clinical domain. In this work, we propose a certifiably optimal feature selection procedure for logistic regression from a mixed-integer conic optimization perspective that can take an auxiliary cost to obtain features into account. Based on an extensive review of the literature, we carefully create a synthetic dataset generator for clinical prognostic model research. This allows us to systematically evaluate different heuristic and optimal cardinality- and budget-constrained feature selection procedures. The analysis shows key limitations of the methods for the low-data regime and when confronted with label noise. Our paper not only provides empirical recommendations for suitable methods and dataset designs, but also paves the way for future research in the area of meta-learning.
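As a rough illustration of what "cardinality- and budget-constrained" selection means, the sketch below enumerates every feature subset that respects both a size limit and a total acquisition cost limit, and keeps the highest-scoring one. It is not the mixed-integer conic formulation proposed in the paper. The function name, the feature names, the costs, and the additive toy score are all hypothetical, and in practice the score would come from fitting and validating a logistic regression on each candidate subset.

```python
from itertools import combinations


def best_subset_under_constraints(costs, score, max_size, budget):
    """Exhaustively search feature subsets with at most `max_size` features
    and total cost at most `budget`; return (subset, score) with the top score."""
    features = sorted(costs)
    best_subset, best_score = (), score(())  # the empty model is the baseline
    for size in range(1, max_size + 1):
        for subset in combinations(features, size):
            if sum(costs[f] for f in subset) > budget:
                continue  # violates the budget constraint
            s = score(subset)
            if s > best_score:
                best_subset, best_score = subset, s
    return best_subset, best_score


# Hypothetical acquisition costs and a toy additive score, for illustration only.
costs = {"age": 1.0, "blood_test": 3.0, "mri": 10.0}
gains = {"age": 0.10, "blood_test": 0.25, "mri": 0.40}


def toy_score(subset):
    return sum(gains[f] for f in subset)


subset, value = best_subset_under_constraints(costs, toy_score, max_size=2, budget=5.0)
```

With these made-up numbers the most informative feature is excluded because it alone exceeds the budget, so the search settles on the two cheaper features. Enumeration like this becomes infeasible as the number of features grows.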

best subset selection, selection, subset selection, (14 more...)

doi: 10.1007/978-3-031-42608-7_10

2310.05464

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

A Consistent and Scalable Algorithm for Best Subset Selection in Single Index Models

Tang, Borui, Zhu, Jin, Zhu, Junxian, Wang, Xueqin, Zhang, Heping

arXiv.org Machine Learning, Sep-12-2023

Analysis of high-dimensional data has led to increased interest in both single index models (SIMs) and best subset selection. SIMs provide an interpretable and flexible modeling framework for high-dimensional data, while best subset selection aims to find a sparse model from a large set of predictors. However, best subset selection in high-dimensional models is known to be computationally intractable. Existing methods tend to relax the selection problem, but do not yield the best subset solution. In this paper, we directly tackle the intractability by proposing the first provably scalable algorithm for best subset selection in high-dimensional SIMs. Our algorithmic solution enjoys subset selection consistency and has the oracle property with high probability. The algorithm uses a generalized information criterion to determine the support size of the regression coefficients, eliminating the need to tune the model selection. Moreover, our method does not assume an error distribution or a specific link function and hence is flexible to apply. Extensive simulation results demonstrate that our method is not only computationally efficient but also able to exactly recover the best subset in various settings (e.g., linear regression, Poisson regression, heteroscedastic models).

algorithm, selection, selection consistency, (15 more...)

2309.0623

Country:

Asia > China (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)
