Model Stability


CUFG: Curriculum Unlearning Guided by the Forgetting Gradient

Miao, Jiaxing, Hu, Liang, Zhang, Qi, Yuan, Lai Zhong, Naseem, Usman

arXiv.org Artificial Intelligence

As privacy and security take center stage in AI, machine unlearning (MU), the ability to erase specific knowledge from models, has garnered increasing attention. However, existing methods overly prioritize efficiency and aggressive forgetting, which introduces notable limitations. In particular, radical interventions like gradient ascent, influence functions, and random label noise can destabilize model weights, leading to collapse and reduced reliability. To address this, we propose CUFG (Curriculum Unlearning via Forgetting Gradients), a novel framework that enhances the stability of approximate unlearning through innovations in both forgetting mechanisms and data scheduling strategies. Specifically, CUFG integrates a new gradient corrector guided by forgetting gradients for fine-tuning-based unlearning and a curriculum unlearning paradigm that progressively forgets from easy to hard. These innovations narrow the gap with the gold-standard Retrain method by enabling more stable and progressive unlearning, thereby improving both effectiveness and reliability. Furthermore, we believe that the concept of curriculum unlearning has substantial research potential and offers forward-looking insights for the development of the MU field. Extensive experiments across various forgetting scenarios validate the rationale and effectiveness of CUFG. Code is available at https://anonymous.4open.science/r/CUFG-6375.
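The two ingredients the abstract names, a forgetting-gradient corrector and an easy-to-hard schedule over the forget set, can be sketched as below. This is an illustrative reconstruction, not CUFG's actual code: the gradient-surgery-style projection and all function names (`correct_gradient`, `curriculum_order`, `difficulty`) are our own assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def correct_gradient(retain_grad, forget_grad):
    """Correct the retain-set fine-tuning gradient so it does not oppose
    the forgetting direction (a common gradient-surgery idea; the paper's
    actual corrector may differ)."""
    fg_norm_sq = dot(forget_grad, forget_grad)
    if fg_norm_sq == 0:
        return retain_grad
    coef = dot(retain_grad, forget_grad) / fg_norm_sq
    if coef <= 0:  # already compatible with forgetting: leave unchanged
        return retain_grad
    # remove the component that counteracts forgetting
    return [r - coef * f for r, f in zip(retain_grad, forget_grad)]

def curriculum_order(forget_set, difficulty):
    """Schedule forget samples from easy to hard, e.g. by per-sample loss."""
    return sorted(forget_set, key=difficulty)
```

An unlearning loop would then iterate `curriculum_order(...)` in stages, applying `correct_gradient` at each fine-tuning step.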


Understanding Generalization of Federated Learning: the Trade-off between Model Stability and Optimization

Zeng, Dun, Wu, Zheshun, Liu, Shiyu, Pan, Yu, Tang, Xiaoying, Xu, Zenglin

arXiv.org Machine Learning

Federated Learning (FL) is a distributed learning approach that trains neural networks across multiple devices while keeping their local data private. However, FL often faces challenges due to data heterogeneity, leading to inconsistent local optima among clients. These inconsistencies can cause unfavorable convergence behavior and generalization performance degradation. Existing studies mainly describe this issue through convergence analysis, focusing on how well a model fits training data, or through algorithmic stability, which examines the generalization gap. However, neither approach precisely captures the generalization performance of FL algorithms, especially for neural networks. In this paper, we introduce the first generalization dynamics analysis framework in federated optimization, highlighting the trade-offs between model stability and optimization. Through this framework, we show how the generalization of FL algorithms is affected by the interplay of algorithmic stability and optimization. This framework applies to standard federated optimization and its advanced versions, like server momentum. We find that fast convergence from large local steps or accelerated momentum enlarges the stability term yet still attains better generalization performance. Our insights into these trade-offs can guide the design of future algorithms for better generalization.
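Server momentum, one of the "advanced versions" the abstract covers, can be sketched as a server-side update that accumulates averaged client deltas into a velocity term. The function name, signature, and default hyperparameters below are illustrative assumptions, not taken from the paper.

```python
def server_momentum_step(global_w, client_deltas, velocity, lr=1.0, beta=0.9):
    """One federated round with server-side momentum.

    global_w:      current global weights (flat list of floats)
    client_deltas: per-client weight updates from local training
    velocity:      server momentum buffer, same shape as global_w
    """
    n = len(client_deltas)
    # average the client updates (plain FedAvg aggregation)
    avg_delta = [sum(d[i] for d in client_deltas) / n
                 for i in range(len(global_w))]
    # accumulate into the momentum buffer, then apply
    velocity = [beta * v + a for v, a in zip(velocity, avg_delta)]
    new_w = [w + lr * v for w, v in zip(global_w, velocity)]
    return new_w, velocity
```

With `beta=0` this reduces to standard FedAvg; larger `beta` accelerates convergence, which is exactly the regime where the abstract's stability/optimization trade-off matters.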


Resource Allocation for Stable LLM Training in Mobile Edge Computing

Liu, Chang, Zhao, Jun

arXiv.org Artificial Intelligence

As mobile devices increasingly become focal points for advanced applications, edge computing presents a viable solution to their inherent computational limitations, particularly in deploying large language models (LLMs). However, despite the advancements in edge computing, significant challenges remain in efficiently training and deploying LLMs due to the computational demands and data privacy concerns associated with these models. This paper explores a collaborative training framework that integrates mobile users with edge servers to optimize resource allocation, thereby enhancing both performance and efficiency. Our approach leverages parameter-efficient fine-tuning (PEFT) methods, allowing mobile users to adjust the initial layers of the LLM while edge servers handle the more demanding latter layers. Specifically, we formulate a multi-objective optimization problem to minimize the total energy consumption and delay during training. We also address the common issue of instability in model performance by incorporating stability enhancements into our objective function. Through a novel fractional programming technique, we achieve a stationary point for the formulated problem. Simulations demonstrate that our method reduces energy consumption and latency while increasing the reliability of LLMs across various mobile settings.
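The two structural ideas here, splitting trainable layers between device and server, and folding a stability term into the energy/delay objective, can be sketched as follows. The weights are an illustrative scalarization; the paper solves a fractional program rather than a fixed weighted sum, and all names below are our own.

```python
def split_trainable_layers(num_layers, device_layers):
    """Assign the first `device_layers` blocks to the mobile user and the
    remaining blocks to the edge server (illustrative partition only)."""
    device = list(range(device_layers))
    server = list(range(device_layers, num_layers))
    return device, server

def training_cost(energy, delay, instability, w_e=0.5, w_d=0.4, w_s=0.1):
    """Scalarized multi-objective cost: energy + delay + a stability
    penalty, with hypothetical weights."""
    return w_e * energy + w_d * delay + w_s * instability
```

A resource allocator would then search device/server splits and transmit powers that minimize `training_cost` subject to device constraints.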


FINEST: Stabilizing Recommendations by Rank-Preserving Fine-Tuning

Oh, Sejoon, Ustun, Berk, McAuley, Julian, Kumar, Srijan

arXiv.org Artificial Intelligence

Modern recommender systems may output considerably different recommendations due to small perturbations in the training data. Changes in the data from a single user will alter that user's recommendations as well as those of other users. In applications like healthcare, housing, and finance, this sensitivity can have adverse effects on user experience. We propose a method to stabilize a given recommender system against such perturbations. This is a challenging task due to (1) the lack of a "reference" rank list that can be used to anchor the outputs; and (2) the computational challenges in ensuring the stability of rank lists with respect to all possible perturbations of training data. Our method, FINEST, overcomes these challenges by obtaining reference rank lists from a given recommendation model and then fine-tuning the model under simulated perturbation scenarios with rank-preserving regularization on sampled items. Our experiments on real-world datasets demonstrate that FINEST can ensure that recommender models output stable recommendations under a wide range of different perturbations without compromising next-item prediction accuracy.
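The core of rank-preserving regularization can be sketched as a hinge penalty on sampled item pairs whose scores, after fine-tuning under a simulated perturbation, invert the reference order. This is a simplified stand-in for FINEST's regularizer, with our own function name and pair-sampling choice (adjacent reference pairs).

```python
def rank_preserving_penalty(ref_rank, scores, margin=0.0):
    """Penalize score inversions relative to a reference rank list.

    ref_rank: item ids in reference order, best first
    scores:   dict mapping item id -> score from the fine-tuned model
    """
    penalty = 0.0
    for hi, lo in zip(ref_rank, ref_rank[1:]):  # adjacent reference pairs
        # hinge loss: positive only when the lower-ranked item overtakes
        penalty += max(0.0, margin + scores[lo] - scores[hi])
    return penalty
```

During fine-tuning, this penalty would be added to the next-item prediction loss, anchoring the perturbed model's rankings to the reference lists.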


LSTM based models stability in the context of Sentiment Analysis for social media

Haddaoui, Bousselham El, Chiheb, Raddouane, Faizi, Rdouan, Afia, Abdellatif El

arXiv.org Artificial Intelligence

Deep learning techniques have proven their effectiveness for Sentiment Analysis (SA) related tasks. Recurrent neural networks (RNN), especially Long Short-Term Memory (LSTM) and Bidirectional LSTM, have become a reference for building accurate predictive models. However, the models' complexity and the number of hyperparameters to configure raise several questions about their stability. In this paper, we present various LSTM models and their key parameters, and we perform experiments to test the stability of these models in the context of Sentiment Analysis.
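Stability experiments of this kind typically repeat training under varied seeds or hyperparameters and summarize the spread of a test metric. A minimal sketch of such a summary, with names of our own choosing rather than the paper's protocol:

```python
import statistics

def stability_report(run_metrics):
    """Summarize run-to-run variability of one metric (e.g. test accuracy)
    across repeated trainings of the same LSTM configuration."""
    return {
        "mean": statistics.mean(run_metrics),
        "std": statistics.pstdev(run_metrics),   # population std over the runs
        "range": max(run_metrics) - min(run_metrics),
    }
```

A small `std` and `range` across repeats would indicate a stable configuration; comparing reports across hyperparameter settings localizes which choices drive the instability.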


Instability in clinical risk stratification models using deep learning

Lopez-Martinez, Daniel, Yakubovich, Alex, Seneviratne, Martin, Lelkes, Adam D., Tyagi, Akshit, Kemp, Jonas, Steinberg, Ethan, Downing, N. Lance, Li, Ron C., Morse, Keith E., Shah, Nigam H., Chen, Ming-Jun

arXiv.org Artificial Intelligence

While it has been well known in the ML community that deep learning models suffer from instability, the consequences for healthcare deployments are under-characterised. We study the stability of different model architectures trained on electronic health records, using a set of outpatient prediction tasks as a case study. We show that repeated training runs of the same deep learning model on the same training data can result in significantly different outcomes at a patient level even though global performance metrics remain stable. We propose two stability metrics for measuring the effect of randomness of model training, as well as mitigation strategies for improving model stability.
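The patient-level phenomenon described above, individual predictions flipping between runs while aggregate metrics hold steady, can be captured by a simple churn measure. This is one plausible instability metric for illustration, not necessarily either of the two metrics the paper proposes.

```python
def prediction_churn(preds_a, preds_b):
    """Fraction of patients whose binary prediction differs between two
    training runs of the same model on the same data."""
    if len(preds_a) != len(preds_b):
        raise ValueError("runs must score the same patient cohort")
    flips = sum(1 for a, b in zip(preds_a, preds_b) if a != b)
    return flips / len(preds_a)
```

Two runs can each be, say, 75% accurate yet disagree on a quarter of patients; churn exposes that disagreement where accuracy alone cannot.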