AITopics | stability measure

stability measure

Sharper Generalization Bounds for Pairwise Learning

Neural Information Processing Systems | Feb-11-2026, 02:27:28 GMT

We also introduce a new on-average stability measure to develop optimistic bounds in a low noise setting.

artificial intelligence, machine learning, stability, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Germany (0.04)
Asia > China (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Quantifying consistency and accuracy of Latent Dirichlet Allocation

Magsarjav, Saranzaya, Humphries, Melissa, Tuke, Jonathan, Mitchell, Lewis

arXiv.org Artificial Intelligence | Nov-18-2025

Topic modelling in Natural Language Processing uncovers hidden topics in large, unlabelled text datasets. It is widely applied in fields such as information retrieval, content summarisation, and trend analysis across various disciplines. However, probabilistic topic models can produce different results when rerun due to their stochastic nature, leading to inconsistencies in latent topics. Factors like corpus shuffling, rare text removal, and document elimination contribute to these variations. This instability affects replicability, reliability, and interpretation, raising concerns about whether topic models capture meaningful topics or just noise. To address these problems, we defined a new stability measure that incorporates accuracy and consistency and uses the generative properties of LDA to generate a new corpus with ground truth. These generated corpora are run through LDA 50 times to determine the variability in the output. We show that LDA can correctly determine the underlying number of topics in the documents. We also find that LDA is more internally consistent, as the multiple reruns return similar topics; however, these topics are not the true topics.
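The rerun variability described in this abstract can be illustrated with a minimal sketch. This is not the authors' stability measure, which also incorporates accuracy against a generated ground truth. It only scores how similar the topics from repeated runs are to each other. The function names, the use of Jaccard similarity on top-word lists, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two collections of words."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def run_similarity(run_a, run_b):
    """Average, over topics in run_a, of the best Jaccard match found in run_b."""
    return sum(max(jaccard(t, u) for u in run_b) for t in run_a) / len(run_a)


def rerun_consistency(runs):
    """Mean symmetric similarity over all pairs of reruns (1.0 = identical topics)."""
    scores = [
        0.5 * (run_similarity(a, b) + run_similarity(b, a))
        for a, b in combinations(runs, 2)
    ]
    return sum(scores) / len(scores)


# Hypothetical top-word lists from three reruns with two topics each.
runs = [
    [["cat", "dog", "pet"], ["stock", "bond", "fund"]],
    [["stock", "bond", "fund"], ["cat", "dog", "pet"]],  # same topics, reordered
    [["cat", "dog", "vet"], ["stock", "bond", "fund"]],  # one word differs
]
consistency = rerun_consistency(runs)
```

Because each topic is matched to its best counterpart, the score does not depend on topic order within a run. A high score says only that reruns agree with one another; as the abstract notes, consistent topics need not be the true topics.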

artificial intelligence, natural language, similarity measure, (16 more...)

arXiv.org Artificial Intelligence

2511.1285

Country: Oceania > Australia (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Breach in the Shield: Unveiling the Vulnerabilities of Large Language Models

Dai, Runpeng, Yang, Run, Zhou, Fan, Zhu, Hongtu

arXiv.org Machine Learning | Mar-28-2025

Large Language Models (LLMs) and Vision-Language Models (VLMs) have become essential to general artificial intelligence, exhibiting remarkable capabilities in task understanding and problem-solving. However, the real-world reliability of these models critically depends on their stability, which remains an underexplored area. Despite their widespread use, rigorous studies examining the stability of LLMs under various perturbations are still lacking. In this paper, we address this gap by proposing a novel stability measure for LLMs, inspired by statistical methods rooted in information geometry. Our measure possesses desirable invariance properties, making it well-suited for analyzing model sensitivity to both parameter and input perturbations. To assess the effectiveness of our approach, we conduct extensive experiments on models ranging in size from 1.5B to 13B parameters. Our results demonstrate the utility of our measure in identifying salient parameters and detecting vulnerable regions in input images or critical dimensions in token embeddings. Furthermore, leveraging our stability framework, we enhance model robustness during model merging, leading to improved performance.

arxiv preprint arxiv, large language model, natural language, (16 more...)

arXiv.org Machine Learning

2504.03714

Country:

North America > United States > District of Columbia > Washington (0.05)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report > New Finding (0.86)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

On the Selection Stability of Stability Selection and Its Applications

Nouraie, Mahdi, Muller, Samuel

arXiv.org Machine Learning | Nov-13-2024

Stability selection is a widely adopted resampling-based framework for high-dimensional structure estimation and variable selection. However, the concept of 'stability' is often narrowly addressed, primarily through examining selection frequencies, or 'stability paths'. This paper seeks to broaden the use of an established stability estimator to evaluate the overall stability of the stability selection framework, moving beyond single-variable analysis. We suggest that the stability estimator offers two advantages: it can serve as a reference to reflect the robustness of the outcomes obtained and help identify an optimal regularization value to improve stability. By determining this value, we aim to calibrate key stability selection parameters, namely, the decision threshold and the expected number of falsely selected variables, within established theoretical bounds. Furthermore, we explore a novel selection criterion based on this regularization value. With the asymptotic distribution of the stability estimator previously established, convergence to true stability is ensured, allowing us to observe stability trends over successive sub-samples. This approach sheds light on the required number of sub-samples addressing a notable gap in prior studies. The 'stabplot' package is developed to facilitate the use of the plots featured in this manuscript, supporting their integration into further statistical analysis and research workflows.
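To make the contrast between per-variable selection frequencies and a single overall stability value concrete, here is a minimal sketch. The abstract does not name the estimator it uses; the code assumes a frequency-based estimator of the form proposed by Nogueira, Sechidis and Brown (2018), which may or may not be the one meant. The matrix `Z` (one row per sub-sample, one column per variable, 1 = selected) and the function names are illustrative and unrelated to the 'stabplot' package.

```python
def selection_frequencies(Z):
    """Fraction of sub-samples in which each variable was selected."""
    m, p = len(Z), len(Z[0])
    return [sum(row[f] for row in Z) / m for f in range(p)]


def stability_estimate(Z):
    """Overall stability of a binary selection matrix Z (m sub-samples x p variables).

    Assumed form: 1 - mean_f(s_f^2) / ((k/p) * (1 - k/p)), where s_f^2 is the
    unbiased sample variance of variable f's selection indicator and k is the
    average number of variables selected per sub-sample.
    """
    m, p = len(Z), len(Z[0])
    if m < 2:
        raise ValueError("need at least two sub-samples")
    freqs = selection_frequencies(Z)
    mean_var = sum(m / (m - 1) * q * (1 - q) for q in freqs) / p
    k_bar = sum(sum(row) for row in Z) / m
    denom = (k_bar / p) * (1 - k_bar / p)
    if denom == 0:
        raise ValueError("undefined when no or all variables are always selected")
    return 1 - mean_var / denom


# Hypothetical selections: identical across sub-samples vs. fully disjoint.
stable = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
unstable = [[1, 0], [0, 1]]

stable_score = stability_estimate(stable)
unstable_score = stability_estimate(unstable)
```

Under this assumed form, identical selections across sub-samples give a value of 1, and values fall as the selections disagree. Tracking the estimate as sub-samples accumulate is one way to observe the stability trends the abstract mentions.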

selection, stability, stability selection, (15 more...)

arXiv.org Machine Learning

2411.09097

Country: Europe > Switzerland > Vaud > Lausanne (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Multi-Wheeled Passive Sliding with Fully-Actuated Aerial Robots: Tip-Over Recovery and Avoidance

Hui, Tong, Cuniato, Eugenio, Pantic, Michael, Ghielmini, Jefferson, Lanegger, Christian, Papageorgiou, Dimitrios, Tognon, Marco, Siegwart, Roland, Fumagalli, Matteo

arXiv.org Artificial Intelligence | May-28-2024

Push-and-slide tasks carried out by fully-actuated aerial robots can be used for inspection and simple maintenance tasks at height, such as non-destructive testing and painting. Often, an end-effector based on multiple non-actuated contact wheels is used to contact the surface. This approach entails challenges in ensuring consistent wheel contact with a surface whose exact orientation and location might be uncertain due to sensor aliasing and drift. Using a standard full-pose controller dependent on the inaccurate surface position and orientation may cause wheels to lose contact during sliding, and subsequently lead to robot tip-over. To address the tip-over issue, we present two approaches: (1) tip-over avoidance guidelines for hardware design, and (2) control for tip-over recovery and avoidance. Physical experiments with a fully-actuated aerial vehicle were executed for a push-and-slide task on a flat surface. The resulting data is used in deriving tip-over avoidance guidelines and designing a simulator that closely captures real-world conditions. We then use the simulator to test the effectiveness and robustness of the proposed approaches in risky scenarios against uncertainties.

aerial vehicle, scenario, stability measure, (15 more...)

arXiv.org Artificial Intelligence

2405.17844

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Denmark (0.04)
Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)

Genre: Research Report (0.50)

Industry:

Energy (0.67)
Aerospace & Defense (0.46)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Stability-Based Model Selection

Neural Information Processing Systems | Apr-6-2023, 16:22:15 GMT

Model selection is linked to model assessment, which is the problem of comparing different models, or model parameters, for a specific learning task. For supervised learning, the standard practical technique is cross-validation, which is not applicable for semi-supervised and unsupervised settings. In this paper, a new model assessment scheme is introduced which is based on a notion of stability. The stability measure yields an upper bound to cross-validation in the supervised case, but extends to semi-supervised and unsupervised problems. In the experimental part, the performance of the stability measure is studied for model order selection in comparison to standard techniques in this area.

artificial intelligence, machine learning, stability-based model selection, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Minimax Optimal Estimation of Stability Under Distribution Shift

Namkoong, Hongseok, Ma, Yuanzhe, Glynn, Peter W.

arXiv.org Artificial Intelligence | Dec-12-2022

The performance of decision policies and prediction models often deteriorates when applied to environments different from the ones seen during training. To ensure reliable operation, we propose and analyze the stability of a system under distribution shift, which is defined as the smallest change in the underlying environment that causes the system's performance to deteriorate beyond a permissible threshold. In contrast to standard tail risk measures and distributionally robust losses that require the specification of a plausible magnitude of distribution shift, the stability measure is defined in terms of a more intuitive quantity: the level of acceptable performance degradation. We develop a minimax optimal estimator of stability and analyze its convergence rate, which exhibits a fundamental phase shift behavior. Our characterization of the minimax convergence rate shows that evaluating stability against large performance degradation incurs a statistical cost. Empirically, we demonstrate the practical utility of our stability framework by using it to compare system designs on problems where robustness to distribution shift is critical.

distribution shift, machine learning, reinforcement learning, (22 more...)

arXiv.org Artificial Intelligence

2212.06338

Country:

North America > United States > New York (0.04)
South America (0.04)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Banking & Finance (0.92)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.81)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
(3 more...)

Employing an Adjusted Stability Measure for Multi-Criteria Model Fitting on Data Sets with Similar Features

Bommert, Andrea, Rahnenführer, Jörg, Lang, Michel

arXiv.org Machine Learning | Jun-15-2021

Fitting models with high predictive accuracy that include all relevant but no irrelevant or redundant features is a challenging task on data sets with similar (e.g. highly correlated) features. We propose the approach of tuning the hyperparameters of a predictive model in a multi-criteria fashion with respect to predictive accuracy and feature selection stability. We evaluate this approach based on both simulated and real data sets and we compare it to the standard approach of single-criteria tuning of the hyperparameters as well as to the state-of-the-art technique "stability selection". We conclude that our approach achieves the same or better predictive performance compared to the two established approaches. Considering the stability during tuning does not decrease the predictive accuracy of the resulting models. Our approach succeeds at selecting the relevant features while avoiding irrelevant or redundant features. The single-criteria approach fails at avoiding irrelevant or redundant features and the stability selection approach fails at selecting enough relevant features for achieving acceptable predictive accuracy. For our approach, for data sets with many similar features, the feature selection stability must be evaluated with an adjusted stability measure, that is, a measure that considers similarities between features. For data sets with only few similar features, an unadjusted stability measure suffices and is faster to compute.
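One common way to tune with respect to two criteria at once is to keep only the hyperparameter configurations that are not dominated in both. The abstract does not say how the multi-criteria tuning is carried out, so the Pareto filter below is an assumption, and the configuration names and scores are invented for illustration.

```python
def pareto_front(configs):
    """Return the configs not dominated in (accuracy, stability); higher is better."""
    front = []
    for c in configs:
        dominated = any(
            o["accuracy"] >= c["accuracy"]
            and o["stability"] >= c["stability"]
            and (o["accuracy"] > c["accuracy"] or o["stability"] > c["stability"])
            for o in configs
        )
        if not dominated:
            front.append(c)
    return front


# Hypothetical tuning results for four hyperparameter configurations.
candidates = [
    {"name": "A", "accuracy": 0.90, "stability": 0.40},
    {"name": "B", "accuracy": 0.88, "stability": 0.75},
    {"name": "C", "accuracy": 0.85, "stability": 0.70},  # dominated by B
    {"name": "D", "accuracy": 0.80, "stability": 0.90},
]
front_names = [c["name"] for c in pareto_front(candidates)]
```

Single-criteria tuning on accuracy alone would pick configuration A here. The two-criteria view also surfaces B, which gives up little accuracy for a much more stable feature selection.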

accuracy, predictive accuracy, similar feature, (14 more...)

arXiv.org Machine Learning

2106.08105

Country:

Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Adjusted Measures for Feature Selection Stability for Data Sets with Similar Features

Bommert, Andrea, Rahnenführer, Jörg

arXiv.org Machine Learning | Sep-25-2020

For data sets with similar features, for example highly correlated features, most existing stability measures behave in an undesired way: They consider features that are almost identical but have different identifiers as different features. Existing adjusted stability measures, that is, stability measures that take into account the similarities between features, have major theoretical drawbacks. We introduce new adjusted stability measures that overcome these drawbacks. We compare them to each other and to existing stability measures based on both artificial and real sets of selected features. Based on the results, we suggest using one new stability measure that considers highly similar features as exchangeable.
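The undesired behaviour described in this abstract can be reproduced with a small sketch. It compares a plain mean pairwise Jaccard score with a variant that first collapses near-identical features into one group. The grouping step is a deliberately crude stand-in and is not one of the adjusted measures proposed in the paper. The feature names and the `group_of` mapping are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0


def mean_pairwise_jaccard(selections):
    """Unadjusted stability: average Jaccard similarity over all pairs of selections."""
    pairs = list(combinations(selections, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def collapse(selection, group_of):
    """Replace each feature by its group label; ungrouped features keep their name."""
    return {group_of.get(f, f) for f in selection}


# Hypothetical setting: x1 and x1_copy are nearly identical features.
group_of = {"x1": "g1", "x1_copy": "g1"}
selections = [{"x1", "x2"}, {"x1_copy", "x2"}, {"x1", "x2"}]

unadjusted = mean_pairwise_jaccard(selections)
adjusted = mean_pairwise_jaccard([collapse(s, group_of) for s in selections])
```

The unadjusted score penalises the second selection for picking `x1_copy` instead of `x1`, even though the two carry almost the same information. Treating them as exchangeable removes that penalty.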

artificial intelligence, machine learning, stability measure, (10 more...)

arXiv.org Machine Learning

2009.12075

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Cluster Validation In Unsupervised Machine Learning

#artificialintelligence | May-16-2017, 23:15:33 GMT

Just look at it: it not only summarizes all the specified validation measures across the different clustering algorithms and numbers of clusters inspected, but also lists the algorithm and cluster-count pairs that performed best with respect to a given validation metric.

algorithm, artificial intelligence, machine learning, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
