AITopics | proposition4

proposition4

Knowing When to Quit: A Principled Framework for Dynamic Abstention in LLM Reasoning

Davidov, Hen, Cohen, Nachshon, Kalinsky, Oren, Fairstein, Yaron, Kushilevitz, Guy, Yazdi, Ram, Rebeschini, Patrick

arXiv.org Machine Learning, Apr-21-2026

Large language models (LLMs) using chain-of-thought reasoning often waste substantial compute by producing long, incorrect responses. Abstention can mitigate this by withholding outputs unlikely to be correct. While most abstention methods decide to withhold outputs before or after generation, dynamic mid-generation abstention considers early termination of unpromising reasoning traces at each token position. Prior work has explored empirical variants of this idea, but principled guidance for the abstention rule remains lacking. We present a formal analysis of dynamic abstention for LLMs, modeling abstention as an explicit action within a regularized reinforcement learning framework. An abstention reward parameter controls the trade-off between compute and information. We show that abstaining when the value function falls below this reward strictly outperforms natural baselines under general conditions. We further derive a principled and efficient method to approximate the value function. Empirical results on mathematical reasoning and toxicity avoidance tasks support our theory and demonstrate improved selective accuracy over existing methods.
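
The threshold rule stated in the abstract (abstain once the value function falls below the abstention reward) can be illustrated with a minimal sketch. It assumes a per-token value estimate is already available; the function name `first_abstention_step` and the parameters `value_estimates` and `abstention_reward` are hypothetical names chosen for illustration, not the authors' implementation or their value approximation method.

```python
from typing import Optional, Sequence


def first_abstention_step(value_estimates: Sequence[float],
                          abstention_reward: float) -> Optional[int]:
    """Return the first token position whose estimated value falls below
    the abstention reward, or None if generation should run to completion.

    value_estimates[t] is a hypothetical estimate of the value function
    after t tokens have been generated (an assumption of this sketch).
    """
    for t, value in enumerate(value_estimates):
        if value < abstention_reward:
            return t  # stop generating here and abstain
    return None


# Illustrative numbers only: the estimate first drops below 0.3 at position 2.
print(first_abstention_step([0.8, 0.5, 0.2, 0.1], abstention_reward=0.3))
```

In this toy setting a larger `abstention_reward` triggers earlier termination and so saves more compute, which mirrors the trade-off the abstract attributes to that parameter.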

abstention, large language model, machine learning, (20 more...)

arXiv.org Machine Learning

2604.18419

Country:

Europe > Monaco (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Quantification of Credal Uncertainty: A Distance-Based Approach

Gonzalez-Garcia, Xabier, Chau, Siu Lun, Rodemann, Julian, Caprio, Michele, Muandet, Krikamol, Bustince, Humberto, Destercke, Sébastien, Hüllermeier, Eyke, Sale, Yusuf

arXiv.org Machine Learning, Mar-31-2026

Credal sets, i.e., closed convex sets of probability measures, provide a natural framework to represent aleatoric and epistemic uncertainty in machine learning. Yet how to quantify these two types of uncertainty for a given credal set, particularly in multiclass classification, remains underexplored. In this paper, we propose a distance-based approach to quantify total, aleatoric, and epistemic uncertainty for credal sets. Concretely, we introduce a family of such measures within the framework of Integral Probability Metrics (IPMs). The resulting quantities admit clear semantic interpretations, satisfy natural theoretical desiderata, and remain computationally tractable for common choices of IPMs. We instantiate the framework with the total variation distance and obtain simple, efficient uncertainty measures for multiclass classification. In the binary case, this choice recovers established uncertainty measures, for which a principled multiclass generalization has so far been missing. Empirical results confirm practical usefulness, with favorable performance at low computational cost.

artificial intelligence, credal, machine learning, (16 more...)

arXiv.org Machine Learning

2603.2727

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States > Wisconsin (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.83)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

A theory of learning data statistics in diffusion models, from easy to hard

Bardone, Lorenzo, Merger, Claudia, Goldt, Sebastian

arXiv.org Machine Learning, Mar-16-2026

While diffusion models have emerged as a powerful class of generative models, their learning dynamics remain poorly understood. We address this issue first by empirically showing that standard diffusion models trained on natural images exhibit a distributional simplicity bias, learning simple, pair-wise input statistics before specializing to higher-order correlations. We reproduce this behaviour in simple denoisers trained on a minimal data model, the mixed cumulant model, where we precisely control both pair-wise and higher-order correlations of the inputs. We identify a scalar invariant of the model that governs the sample complexity of learning pair-wise and higher-order correlations that we call the diffusion information exponent, in analogy to related invariants in different learning paradigms. Using this invariant, we prove that the denoiser learns simple, pair-wise statistics of the inputs at linear sample complexity, while more complex higher-order statistics, such as the fourth cumulant, require at least cubic sample complexity. We also prove that the sample complexity of learning the fourth cumulant is linear if pair-wise and higher-order statistics share a correlated latent structure. Our work describes a key mechanism for how diffusion models can learn distributions of increasing complexity.

artificial intelligence, cit, machine learning, (18 more...)

arXiv.org Machine Learning

2603.12901

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

The committee machine: Computational to statistical gaps in learning a two-layers neural network

Benjamin Aubin, Antoine Maillard, Jean Barbier, Florent Krzakala, Nicolas Macris, Lenka Zdeborová

Neural Information Processing Systems, Feb-13-2026, 13:15:26 GMT

Heuristic tools from statistical physics have been used in the past to locate the phase transitions and compute the optimal learning and generalization errors in the teacher-student scenario in multi-layer neural networks. In this contribution, we provide a rigorous justification of these approaches for a two-layer neural network model called the committee machine. We also introduce a version of the approximate message passing (AMP) algorithm for the committee machine that allows optimal learning in polynomial time for a large set of parameters.

artificial intelligence, machine learning, pout, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Italy > Friuli Venezia Giulia > Trieste Province > Trieste (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks

Grant Rotskoff, Eric Vanden-Eijnden

Neural Information Processing Systems, Feb-12-2026, 08:33:52 GMT

The performance of neural networks on high-dimensional data distributions suggests that it may be possible to parameterize a representation of a given high-dimensional function with controllably small errors, potentially outperforming standard interpolation methods. We demonstrate, both theoretically and numerically, that this is indeed the case. We map the parameters of a neural network to a system of particles relaxing with an interaction potential determined by the loss function.

artificial intelligence, arxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.05)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

e614f646836aaed9f89ce58e837e2310-Supplemental.pdf

Neural Information Processing Systems, Feb-11-2026, 15:58:04 GMT

The effect of diverse generation becomes apparent in the multi-round setting.

artificial intelligence, gflownet, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.47)

A Unified Analysis of Federated Learning with Arbitrary Client Participation

Neural Information Processing Systems, Feb-10-2026

The objective (1) can be extended to a weighted average, but we do not write out the weights and consider them as part of $\ell_n(x,\xi)$ and $F_n(x)$.

artificial intelligence, machine learning, pnn, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Utah (0.04)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

2063a00c435aafbcc58c16ce1e522139-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 19:31:05 GMT

Amongst those functions, the simplest are single-index models $f(x) = \phi(x \cdot \theta)$, where the labels are generated by an arbitrary non-linear scalar link function $\phi$ applied to an unknown one-dimensional projection $\theta$ of the input data.

artificial intelligence, arxivpreprintarxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country: Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.05)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.30)

45d74e190008c7bff2845ffc8e3facd3-Supplemental-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 16:04:47 GMT

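In a typical supervised learning task, one is given a training dataset of $n \in \mathbb{N}$ labeled samples $D = ((x_i, y_i) \in \mathbb{R}^d \times \mathbb{R})_{i \in [n]}$, and a parametric model with $m \in \mathbb{N}$ parameters, $f: \mathbb{R}^m \times \mathbb{R}^d \to \mathbb{R}$. The task is to find parameters fitting the training data, i.e. find $\theta \in \mathbb{R}^m$ such that $\forall i \in [n],\ f(\theta; x_i) \approx y_i$.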
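
As a minimal sketch of the fitting task described in this excerpt, the code below searches for parameters θ so that f(θ; x_i) is close to y_i on every training sample. The linear model, the gradient-descent routine `fit`, its step size, and the three-sample dataset are all assumptions made for illustration; none of them comes from the paper.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]


def f(theta: Sequence[float], x: Sequence[float]) -> float:
    """Hypothetical linear model f(theta; x) = <theta, x>, so m = d here."""
    return sum(t * xi for t, xi in zip(theta, x))


def fit(data: Sequence[Sample], steps: int = 2000, lr: float = 0.1) -> List[float]:
    """Plain gradient descent on the mean squared training error."""
    n = len(data)
    d = len(data[0][0])
    theta = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in data:
            residual = f(theta, x) - y
            for j in range(d):
                grad[j] += 2.0 * residual * x[j] / n
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


# Toy dataset with n = 3 samples in R^2, chosen so an exact fit exists.
data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
theta = fit(data)
max_error = max(abs(f(theta, x) - y) for x, y in data)
print(theta, max_error)
```

On this toy data the fitted parameters approach (2, -1), which reproduces all three labels; with a non-linear model or noisy labels an exact fit need not exist.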

artificial intelligence, machine learning, sinc, (19 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

On Uniform Convergence and Low-Norm Interpolation Learning

Neural Information Processing Systems, Feb-8-2026, 08:56:06 GMT

But we argue we can explain the consistency of the minimal-norm interpolator with a slightly weaker, yet standard, notion: uniform convergence of zero-error predictors in a norm ball.

artificial intelligence, machine learning, wmn, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
