AITopics | loss gap

Collaborating Authors

loss gap

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

SoftAdaClip: A Smooth Clipping Strategy for Fair and Private Model Training

Soleymani, Dorsa, Dadsetan, Ali, Rudzicz, Frank

arXiv.org Artificial IntelligenceOct-8-2025

Differential privacy (DP) provides strong protection for sensitive data, but often reduces model performance and fairness, especially for underrepresented groups. One major reason is gradient clipping in DP-SGD, which can disproportionately suppress learning signals for minority subpopulations. Although adaptive clipping can enhance utility, it still relies on uniform hard clipping, which may restrict fairness. To address this, we introduce SoftAdaClip, a differentially private training method that replaces hard clipping with a smooth, tanh-based transformation to preserve relative gradient magnitudes while bounding sensitivity. We evaluate SoftAdaClip on various datasets, including MIMIC-III (clinical text), GOSSIS-eICU (structured healthcare), and Adult Income (tabular data). Our results show that SoftAdaClip reduces subgroup disparities by up to 87% compared to DP-SGD and up to 48% compared to Adaptive-DPSGD, and these reductions in subgroup disparities are statistically significant. These findings underscore the importance of integrating smooth transformations with adaptive mechanisms to achieve fair and private model training.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2510.01447

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Providers & Services (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

b0928f2d4ba7ea33b05024f21d937f48-Supplemental.pdf

Neural Information Processing SystemsAug-16-2025, 21:38:55 GMT

artificial intelligence, log 1, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Fundamental Safety-Capability Trade-offs in Fine-tuning Large Language Models

Chen, Pin-Yu, Shen, Han, Das, Payel, Chen, Tianyi

arXiv.org Machine LearningMar-24-2025

Fine-tuning Large Language Models (LLMs) on some task-specific datasets has been a primary use of LLMs. However, it has been empirically observed that this approach to enhancing capability inevitably compromises safety, a phenomenon also known as the safety-capability trade-off in LLM fine-tuning. This paper presents a theoretical framework for understanding the interplay between safety and capability in two primary safety-aware LLM fine-tuning strategies, providing new insights into the effects of data similarity, context overlap, and alignment loss landscape. Our theoretical results characterize the fundamental limits of the safety-capability trade-off in LLM fine-tuning, which are also validated by numerical experiments.

large language model, machine learning, natural language, (15 more...)

arXiv.org Machine Learning

2503.20807

Country: North America > United States > New York (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Defending Membership Inference Attacks via Privacy-aware Sparsity Tuning

Hu, Qiang, Zhang, Hengxiang, Wei, Hongxin

arXiv.org Artificial IntelligenceOct-9-2024

Over-parameterized models are typically vulnerable to membership inference attacks, which aim to determine whether a specific sample is included in the training of a given model. In light of this, we propose Privacy-aware Sparsity Tuning (PAST)--a simple fix to the l1 Regularization--by employing adaptive penalties to different parameters. Our key idea behind PAST is to promote sparsity in parameters that significantly contribute to privacy leakage. In particular, we construct the adaptive weight for each parameter based on its privacy sensitivity, i.e., the gradient of the loss gap with respect to the parameter. Using PAST, the network shrinks the loss gap between members and non-members, leading to strong resistance to privacy attacks. Extensive experiments demonstrate the superiority of PAST, achieving a state-of-the-art balance in the privacy-utility trade-off. Modern neural networks are trained in an over-parameterized regime where the parameters of the model exceed the size of the training set (Zhang et al., 2021). While the huge amount of parameters empowers the models to achieve impressive performance across various tasks, the strong capacity also makes them particularly vulnerable to membership inference attacks (MIAs) (Shokri et al., 2017). In MIAs, attackers aim to detect if a sample is utilized in the training of a target model. Membership inference can cause security and privacy concerns in cases where the target model is trained on sensitive information, like health care (Paul et al., 2021), financial service (Mahalle et al., 2018), and DNA sequence analysis (Arshad et al., 2021).

loss gap, module, regularization, (13 more...)

arXiv.org Artificial Intelligence

2410.06814

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Texas (0.05)
Europe > Poland > Masovia Province > Warsaw (0.04)
Europe > France > Grand Est > Bas-Rhin > Strasbourg (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Challenges in Detoxifying Language Models

Welbl, Johannes, Glaese, Amelia, Uesato, Jonathan, Dathathri, Sumanth, Mellor, John, Hendricks, Lisa Anne, Anderson, Kirsty, Kohli, Pushmeet, Coppin, Ben, Huang, Po-Sen

arXiv.org Artificial IntelligenceSep-15-2021

Large language models (LM) generate remarkably fluent text and can be efficiently adapted across NLP tasks. Measuring and guaranteeing the quality of generated text in terms of safety is imperative for deploying LMs in the real world; to this end, prior work often relies on automatic evaluation of LM toxicity. We critically discuss this approach, evaluate several toxicity mitigation strategies with respect to both automatic and human evaluation, and analyze consequences of toxicity mitigation in terms of model bias and LM quality. We demonstrate that while basic intervention strategies can effectively optimize previously established automatic metrics on the RealToxicityPrompts dataset, this comes at the cost of reduced LM coverage for both texts about, and dialects of, marginalized groups. Additionally, we find that human raters often disagree with high automatic toxicity scores after strong toxicity reduction interventions -- highlighting further the nuances involved in careful evaluation of LM toxicity.

continuation, evaluation, toxicity, (16 more...)

arXiv.org Artificial Intelligence

2109.07445

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry:

Media (0.46)
Leisure & Entertainment (0.46)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Provable trade-offs between private & robust machine learning

Hayes, Jamie

arXiv.org Machine LearningJun-8-2020

Historically, machine learning methods have not been designed with security in mind. In turn, this has given rise to adversarial examples, carefully perturbed input samples aimed to mislead detection at test time, which have been applied to attack spam and malware classification, and more recently to attack image classification. Consequently, an abundance of research has been devoted to designing machine learning methods that are robust to adversarial examples. Unfortunately, there are desiderata besides robustness that a secure and safe machine learning model must satisfy, such as fairness and privacy. Recent work by Song et al. (2019) has shown, empirically, that there exists a trade-off between robust and private machine learning models. Models designed to be robust to adversarial examples often overfit on training data to a larger extent than standard (non-robust) models. If a dataset contains private information, then any statistical test that separates training and test data by observing a model's outputs can represent a privacy breach, and if a model overfits on training data, these statistical tests become easier. In this work, we identify settings where standard models will provably overfit to a larger extent in comparison to robust models, and as empirically observed in previous works, settings where the opposite behavior occurs. Thus, it is not necessarily the case that privacy must be sacrificed to achieve robustness. The degree of overfitting naturally depends on the amount of data available for training. We go on to formally characterize how the training set size factors into the privacy risks exposed by training a robust model. Finally, we empirically show our findings hold on image classification benchmark datasets, such as CIFAR-10.

artificial intelligence, loss gap, machine learning, (16 more...)

arXiv.org Machine Learning

2006.04622

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback