AITopics | Country

In modern parametric model training, full-batch gradient descent (and its variants) becomes progressively more biased toward the exact realization of the training data; this drives the systematic "generalization gap", in which the train error becomes an unreliable proxy for the test error. Existing approaches either argue through complex analysis that this gap is benign or sacrifice data to a validation set. In contrast, we introduce decoupled descent (DD), a novel theory-based training algorithm that satisfies a train-test identity, forcing the train error to asymptotically track the test error for stylized Gaussian mixture models. Within this specific regime, leveraging approximate message passing theory, DD iteratively cancels the biases due to data reuse, rigorously demonstrating the feasibility of zero-cost validation and 100% data utilization. Moreover, DD is governed by a low-dimensional state evolution recursion, rendering the dynamics of the algorithm transparent and tractable. We validate DD on XOR classification, where it outperforms GD; additionally, we run experiments on noisy MNIST and non-linear probing of CIFAR-10, demonstrating that even when our stylized assumptions are relaxed, DD narrows the generalization gap compared to GD.

artificial intelligence, assumption, machine learning, (18 more...)

arXiv.org Machine Learning

2604.27883

Country: North America > Canada (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)


Prediction-powered Inference by Mixture of Experts

Gu, Yanwu, Kong, Linglong, Xia, Dong

arXiv.org Machine Learning | May-1-2026

The rapidly expanding artificial intelligence (AI) industry has produced diverse yet powerful prediction tools, each with its own network architecture, training strategy, data-processing pipeline, and domain-specific strengths. These tools create new opportunities for semi-supervised inference, in which labeled data are limited and expensive to obtain, whereas unlabeled data are abundant and widely available. Given a collection of predictors, we treat them as a mixture of experts (MOE) and introduce an MOE-powered semi-supervised inference framework built upon prediction-powered inference (PPI). Motivated by the variance reduction principle underlying PPI, the proposed framework seeks the mixture of experts that achieves the smallest possible variance. Compared with standard PPI, the MOE-powered inference framework adapts to the unknown performance of individual predictors, benefits from their collective predictive power, and enjoys a best-expert guarantee. The framework is flexible and applies to mean estimation, linear regression, quantile estimation, and general M-estimation. We develop non-asymptotic theory for the MOE-powered inference framework and establish upper bounds on the coverage error of the resulting confidence intervals. Numerical experiments demonstrate the practical effectiveness of MOE-powered inference and corroborate our theoretical findings.
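For readers unfamiliar with PPI, the mean-estimation case can be sketched as follows. This is a minimal illustration, not the authors' procedure: `ppi_mean` is the standard PPI mean estimate (average prediction on unlabeled data plus the average labeled residual), while `lowest_variance_mixture` and its finite list of candidate weights are hypothetical stand-ins for the paper's search over mixtures. Choosing weights on the same labeled data used for inference may affect coverage, which this sketch does not address.

```python
import numpy as np


def ppi_mean(y_lab, pred_lab, pred_unlab):
    """Standard PPI mean estimate and its estimated standard error."""
    y_lab, pred_lab, pred_unlab = (
        np.asarray(a, dtype=float) for a in (y_lab, pred_lab, pred_unlab)
    )
    # The rectifier corrects the predictor's bias using the labeled sample.
    rectifier = y_lab - pred_lab
    estimate = pred_unlab.mean() + rectifier.mean()
    variance = (
        pred_unlab.var(ddof=1) / pred_unlab.size
        + rectifier.var(ddof=1) / rectifier.size
    )
    return estimate, np.sqrt(variance)


def lowest_variance_mixture(y_lab, preds_lab, preds_unlab, candidate_weights):
    """Hypothetical rule: try each weight vector, keep the smallest standard error."""
    preds_lab = np.asarray(preds_lab, dtype=float)      # (n_experts, n_labeled)
    preds_unlab = np.asarray(preds_unlab, dtype=float)  # (n_experts, n_unlabeled)
    results = []
    for w in candidate_weights:
        w = np.asarray(w, dtype=float)
        # A mixture of experts is just a weighted sum of their predictions.
        est, se = ppi_mean(y_lab, w @ preds_lab, w @ preds_unlab)
        results.append((se, est, tuple(w)))
    return min(results)  # (standard error, estimate, weights)
```

A normal-approximation interval would then be `estimate ± 1.96 * se`; the better the mixture predicts the labels, the smaller the rectifier variance and the narrower that interval.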

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

2604.27892

Country:

North America (0.45)
Asia (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)


Testing for Differences in Gaussian Graphical Models: Applications to Brain Connectivity

Eugene Belilovsky, Gaël Varoquaux, Matthew B. Blaschko

Neural Information Processing Systems | Apr-30-2026, 23:09:41 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, lasso, machine learning, (18 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report > Experimental Study (0.95)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.96)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)


DeepMath - Deep Sequence Models for Premise Selection

Geoffrey Irving, Christian Szegedy, Alexander A. Alemi, Niklas Een, Francois Chollet, Josef Urban

Neural Information Processing Systems | Apr-30-2026, 23:08:19 GMT

We study the effectiveness of neural sequence models for premise selection in automated theorem proving, one of the main bottlenecks in the formalization of mathematics. We propose a two-stage approach that yields good results for premise selection on the Mizar corpus while avoiding the hand-engineered features of existing state-of-the-art models. To our knowledge, this is the first time deep learning has been applied to theorem proving on a large scale.

conjecture, logic & formal reasoning, machine learning, (19 more...)

Neural Information Processing Systems

Country: Europe (0.68)

Genre:

Instructional Material (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)


Good Luck Getting a Mac Mini for the Next 'Several Months'

WIRED | Apr-30-2026, 23:00:51 GMT

Apple CEO Tim Cook told analysts that AI adoption has happened faster than expected. Apple CEO Tim Cook said on the company's earnings call on Thursday that it could take "several months" to meet skyrocketing demand for the Mac Mini, the company's compact but mighty, screen-free desktop computer. Cook's remarks come after coders determined in recent months that the Mac Mini was the perfect machine for agentic AI tasks. "On the Mac Mini and Mac Studio, both of these are amazing platforms for AI and agentic tools," Cook said on the earnings call, in response to analyst questions. "And customer adoption of that is happening faster than we expected." The news comes amid another record-setting quarter for the company.

artificial intelligence, main content security politics, natural language, (9 more...)

WIRED

Country: North America > United States > California (0.16)

Genre: Financial News (0.76)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.30)


'Ant-Man' actress slams Disney for 'disgusting' Marvel layoffs

FOX News | Apr-30-2026, 23:00:18 GMT


artificial intelligence, disney, social media, (9 more...)

FOX News

Country: North America > United States > California > Los Angeles County > Los Angeles (0.15)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.49)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.31)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Communications > Social Media (0.75)


e4da3b7fbbce2345d7772b0674a318d5-Paper.pdf

Neural Information Processing Systems | Apr-30-2026, 22:55:46 GMT


High Dimensional Structured Superposition Models

Qilong Gu, Arindam Banerjee

Neural Information Processing Systems | Apr-30-2026, 22:53:55 GMT

High-dimensional superposition models characterize observations using parameters that can be written as a sum of multiple component parameters, each with its own structure, e.g., the sum of low-rank and sparse matrices or the sum of sparse and rotated sparse vectors. In this paper, we consider general superposition models that allow the sum of any number of component parameters, where each component structure can be characterized by any norm. We present a simple estimator for such models, give a geometric condition under which the components can be accurately estimated, characterize the sample complexity of the estimator, and give high-probability non-asymptotic bounds on the componentwise estimation error. We use tools from empirical processes and generic chaining for the statistical analysis, and our results, which substantially generalize prior work on superposition models, are stated in terms of Gaussian widths of suitable sets.

artificial intelligence, machine learning, sc condition, (16 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)


Regularized Nonlinear Acceleration

Damien Scieur, Alexandre d'Aspremont, Francis Bach

Neural Information Processing Systems | Apr-30-2026, 22:39:10 GMT

We describe a convergence acceleration technique for generic optimization problems. Our scheme computes estimates of the optimum from a nonlinear average of the iterates produced by any optimization method. The weights in this average are computed via a simple and small linear system, whose solution can be updated online. This acceleration scheme runs in parallel to the base algorithm, providing improved estimates of the solution on the fly, while the original optimization method is running. Numerical experiments are detailed on classical classification problems.
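The "simple and small linear system" can be illustrated with a short sketch. It assumes the commonly described form of the scheme: stack the differences between successive iterates, solve a regularized system in their Gram matrix, and normalize the solution into weights that sum to one. The function name `rna_extrapolate`, the default `reg` value, and the normalization of the Gram matrix are illustrative choices here, not details taken from the abstract.

```python
import numpy as np


def rna_extrapolate(iterates, reg=1e-8):
    """Extrapolate from iterates x_0..x_{k+1} of any base optimizer (a sketch)."""
    X = np.asarray(iterates, dtype=float)   # shape (k + 2, d)
    R = np.diff(X, axis=0)                  # differences x_{i+1} - x_i, shape (k + 1, d)
    gram = R @ R.T                          # small (k + 1) x (k + 1) matrix
    gram = gram / np.linalg.norm(gram, 2)   # rescale so `reg` is scale-free
    k = gram.shape[0]
    z = np.linalg.solve(gram + reg * np.eye(k), np.ones(k))
    weights = z / z.sum()                   # weights sum to one
    return weights @ X[:-1]                 # weighted (nonlinear) average of iterates


# Base method: gradient descent on f(x) = 0.5 * x^T H x, whose minimizer is 0.
H = np.diag([1.0, 10.0])
x = np.array([1.0, 1.0])
history = [x.copy()]
for _ in range(3):
    x = x - 0.05 * (H @ x)
    history.append(x.copy())

accelerated = rna_extrapolate(history)
```

The base method is untouched: the extrapolation only reads its stored iterates, so it can be recomputed at any point while the optimizer keeps running.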

algorithm, artificial intelligence, optimization problem, (16 more...)

Neural Information Processing Systems

Country: Europe > France (0.15)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
