AITopics | condition 3

From Linear to Nonlinear: Provable Weak-to-Strong Generalization through Feature Learning

Neural Information Processing Systems | Jun-23-2026, 03:22:58 GMT

Weak-to-strong generalization refers to the phenomenon where a stronger model trained under supervision from a weaker one can outperform its teacher. While prior studies aim to explain this effect, most theoretical insights are limited to abstract frameworks or linear/random feature models. In this paper, we provide a formal analysis of weak-to-strong generalization from a linear CNN (weak) to a two-layer ReLU CNN (strong). We consider structured data composed of label-dependent signals of varying difficulty and label-independent noise, and analyze gradient descent dynamics when the strong model is trained on data labeled by the pretrained weak model. Our analysis identifies two regimes, data-scarce and data-abundant, based on the signal-to-noise characteristics of the dataset, and reveals distinct mechanisms of weak-to-strong generalization. In the data-scarce regime, generalization occurs via benign overfitting or fails via harmful overfitting, depending on the amount of data, and we characterize the transition boundary. In the data-abundant regime, generalization emerges in the early phase through label correction, but we observe that overtraining can subsequently degrade performance.

artificial intelligence, inequality follow, machine learning, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Education (0.45)
Information Technology (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

e4d3fe32495088805bbbb4f1de63e947-Paper-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 02:42:11 GMT

artificial intelligence, inequality, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Los Angeles (0.27)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Active Learning from Imperfect Labelers

Songbai Yan, Kamalika Chaudhuri, Tara Javidi

Neural Information Processing Systems | Apr-22-2026, 09:44:11 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.32)

Feature-distributed sparse regression: a screen-and-clean approach

Jiyan Yang, Michael W. Mahoney, Michael Saunders, Yuekai Sun

Neural Information Processing Systems | Mar-23-2026, 06:19:49 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, communication, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

e4d3fe32495088805bbbb4f1de63e947-Paper-Conference.pdf

Neural Information Processing Systems | Feb-17-2026, 16:08:43 GMT

artificial intelligence, inequality, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.27)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Convex Elicitation of Continuous Properties

Jessica Finocchiaro, Rafael Frongillo

Neural Information Processing Systems | Feb-15-2026, 03:00:30 GMT

Neural Information Processing Systems http://nips.cc/

convex elicitable, identification function, loss function, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Colorado > Boulder County > Boulder (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Banking & Finance (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

f0e91b1314fa5eabf1d7ef6d1561ecec-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 19:33:16 GMT

algorithm, learnability, ood detection, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Performative Learning Theory

Julian Rodemann, Unai Fischer-Abaigar, James Bailie, Krikamol Muandet

arXiv.org Machine Learning | Feb-9-2026

Performative predictions influence the very outcomes they aim to forecast. We study performative predictions that affect a sample (e.g., only existing users of an app) and/or the whole population (e.g., all potential app users). This raises the question of how well models generalize under performativity. For example, how well can we draw insights about new app users based on existing users when both of them react to the app's predictions? We address this question by embedding performative predictions into statistical learning theory. We prove generalization bounds under performative effects on the sample, on the population, and on both. A key intuition behind our proofs is that in the worst case, the population negates predictions, while the sample deceptively fulfills them. We cast such self-negating and self-fulfilling predictions as min-max and min-min risk functionals in Wasserstein space, respectively. Our analysis reveals a fundamental trade-off between performatively changing the world and learning from it: the more a model affects data, the less it can learn from it. Moreover, our analysis results in a surprising insight on how to improve generalization guarantees by retraining on performatively distorted samples. We illustrate our bounds in a case study on prediction-informed assignments of unemployed German residents to job trainings, drawing upon administrative labor market records from 1975 to 2017 in Germany.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Machine Learning

2602.04402

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(8 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry:

Banking & Finance > Economy (0.48)
Education > Educational Setting (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

5c3a3b139a11689e0bc55abd95e20e39-Paper.pdf

Neural Information Processing Systems | Feb-8-2026, 21:03:06 GMT

artificial intelligence, bayesian inference, machine learning, (18 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Nagasaki Prefecture > Nagasaki (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Entropic Mirror Monte Carlo

Anas Cherradi, Yazid Janati, Alain Durmus, Sylvain Le Corff, Yohan Petetin, Julien Stoehr

arXiv.org Machine Learning | Feb-4-2026

Importance sampling is a Monte Carlo method which designs estimators of expectations under a target distribution using weighted samples from a proposal distribution. When the target distribution is complex, such as multimodal distributions in high-dimensional spaces, the efficiency of importance sampling critically depends on the choice of the proposal distribution. In this paper, we propose a novel adaptive scheme for the construction of efficient proposal distributions. Our algorithm promotes efficient exploration of the target distribution by combining global sampling mechanisms with a delayed weighting procedure. The proposed weighting mechanism plays a key role by enabling rapid resampling in regions where the proposal distribution is poorly adapted to the target. Our sampling algorithm is shown to be geometrically convergent under mild assumptions and is illustrated through various numerical experiments.
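As a rough illustration of the baseline idea the abstract starts from (weighted samples from a proposal used to estimate an expectation under a target), here is a minimal sketch of plain self-normalized importance sampling. It is not the adaptive scheme proposed in the paper. The bimodal target, the Gaussian proposals, and the names `log_target` and `snis_estimate` are all assumptions chosen for illustration.

```python
import numpy as np


def log_normal_pdf(x, mean, std):
    """Log density of a univariate normal N(mean, std^2)."""
    return -0.5 * ((x - mean) / std) ** 2 - np.log(std * np.sqrt(2.0 * np.pi))


def log_target(x):
    """Illustrative bimodal target: equal mixture of N(-2, 1) and N(2, 1)."""
    return np.logaddexp(
        log_normal_pdf(x, -2.0, 1.0), log_normal_pdf(x, 2.0, 1.0)
    ) + np.log(0.5)


def snis_estimate(f, log_target_fn, proposal_mean, proposal_std, n_samples, rng):
    """Self-normalized importance sampling with a Gaussian proposal.

    Returns the estimate of E_target[f(X)] and the effective sample size.
    """
    # Draw samples from the proposal distribution.
    x = rng.normal(proposal_mean, proposal_std, size=n_samples)
    # Log importance weights: log target density minus log proposal density.
    log_w = log_target_fn(x) - log_normal_pdf(x, proposal_mean, proposal_std)
    # Subtract the maximum before exponentiating for numerical stability.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    # Effective sample size: close to n_samples when the weights are even.
    ess = 1.0 / np.sum(w ** 2)
    return float(np.sum(w * f(x))), float(ess)


rng = np.random.default_rng(0)
# A wide proposal covers both modes; a narrow one misses them.
estimate, ess_wide = snis_estimate(lambda x: x ** 2, log_target, 0.0, 3.0, 200_000, rng)
_, ess_narrow = snis_estimate(lambda x: x ** 2, log_target, 0.0, 0.5, 200_000, rng)
print(f"estimate of E[X^2]: {estimate:.3f}")
print(f"ESS wide proposal: {ess_wide:.0f}, narrow proposal: {ess_narrow:.0f}")
```

For this toy target the exact value of E[X^2] is 5 (variance 1 plus squared mode offset 4), which gives a reference for the estimate. Comparing the effective sample size of the two proposals shows the sensitivity to the proposal choice that the abstract describes.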

artificial intelligence, machine learning, target distribution, (14 more...)

arXiv.org Machine Learning

2602.03165

Country: Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
