AITopics | Uncertainty

Uncertainty

"AI systems–like people–must often act despite partial and uncertain information. First, the information received may be unreliable (e.g., a patient may mis-remember when a disease started, or may not have noticed a symptom that is important to a diagnosis). In addition, rules connecting real-world events can never include all the factors that might determine whether their conclusions really apply (e.g., the correctness of basing a diagnosis on a lab test depends whether there were conditions that might have caused a false positive, on the test being done correctly, on the results being associated with the right patient, etc.) Thus in order to draw useful conclusions, AI systems must be able to reason about the probability of events, given their current knowledge."
– from David Leake, Reasoning Under Uncertainty
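
The lab-test example in the quote can be made concrete with Bayes' rule. The sketch below is only an illustration: the function name and the prior, sensitivity, and false-positive rate are hypothetical values chosen for the example, not figures from the source.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' rule."""
    # Total probability of a positive result: true positives plus false positives.
    p_positive = sensitivity * prior + false_positive_rate * (1.0 - prior)
    return sensitivity * prior / p_positive


# Hypothetical numbers, for illustration only.
posterior = posterior_given_positive(
    prior=0.01, sensitivity=0.95, false_positive_rate=0.05
)
print(f"P(disease | positive) = {posterior:.3f}")
```

With a rare condition, false positives account for most positive results, so the posterior stays well below the test's sensitivity.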

15349e1c554406b7719d047a498e7117-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 18:13:10 GMT

artificial intelligence, machine learning, proceedings, (11 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Los Angeles (0.28)

Genre:

Research Report > Experimental Study (0.94)
Research Report > Strength High (0.68)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Training and Inference on Any-Order Autoregressive Models the Right Way

Neural Information Processing Systems, Apr-24-2026, 16:32:17 GMT

Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference with important applications such as masked language modeling and image inpainting. In recent years, the family of Any-Order Autoregressive Models (AO-ARMs) - closely related to popular models such as BERT and XLNet - has shown breakthrough performance in arbitrary conditional tasks across a sweeping range of domains. But, in spite of their success, in this paper we identify significant improvements to be made to previous formulations of AO-ARMs. First, we show that AO-ARMs suffer from redundancy in their probabilistic model, i.e., they define the same distribution in multiple different ways. We alleviate this redundancy by training on a smaller set of univariate conditionals that still maintains support for efficient arbitrary conditional inference. Second, we upweight the training loss for univariate conditionals that are evaluated more frequently during inference. Our method leads to improved performance with no compromises on tractability, giving state-of-the-art likelihoods in arbitrary conditional modeling on text (Text8), image (CIFAR10, ImageNet32), and continuous tabular data domains.
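
The redundancy mentioned in the abstract can be seen on a toy example: by the chain rule, every variable ordering factorizes the same joint into a different set of univariate conditionals. The sketch below uses a small hypothetical joint over three binary variables and exact conditionals; it does not implement an AO-ARM or the paper's training scheme, and all names are illustrative.

```python
from itertools import permutations, product

# Hypothetical joint distribution over three binary variables (sums to 1).
weights = [1, 2, 3, 4, 5, 6, 7, 8]
states = list(product([0, 1], repeat=3))
joint = {s: w / sum(weights) for s, w in zip(states, weights)}


def marginal(assignment):
    """P(X_i = v for every (i, v) in assignment), summing out the other variables."""
    return sum(
        p for s, p in joint.items()
        if all(s[i] == v for i, v in assignment.items())
    )


def chain_rule(x, order):
    """Multiply the univariate conditionals P(x_i | earlier variables) along `order`."""
    prob, seen = 1.0, {}
    for i in order:
        before = marginal(seen)   # probability of the variables fixed so far
        seen[i] = x[i]
        prob *= marginal(seen) / before
    return prob


x = (1, 0, 1)
per_order = {order: chain_rule(x, order) for order in permutations(range(3))}
for order, value in per_order.items():
    print(order, round(value, 6))
```

All six orderings return the same probability for `x`, even though each one uses a different set of conditionals to get there.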

ao-arm, machine learning, natural language, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Robots (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation

Neural Information Processing Systems, Apr-24-2026, 15:50:13 GMT

We study the model-based reward-free reinforcement learning with linear function approximation for episodic Markov decision processes (MDPs). In this setting, the agent works in two phases. In the exploration phase, the agent interacts with the environment and collects samples without the reward. In the planning phase, the agent is given a specific reward function and uses samples collected from the exploration phase to learn a good policy. We propose a new provably efficient algorithm, called UCRL-RFE under the Linear Mixture MDP assumption, where the transition probability kernel of the MDP can be parameterized by a linear function over certain feature mappings defined on the triplet of state, action, and next state.
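
The Linear Mixture MDP assumption can be pictured as follows: the probability of a transition is an inner product between a feature vector of (state, action, next state) and a parameter vector. The sketch below builds such a kernel from two hypothetical base kernels. The base kernels, `theta`, and the function names are assumptions made for illustration, and the code does not implement UCRL-RFE.

```python
# Two hypothetical base transition kernels over 2 states and 2 actions.
# base[k][(s, a)] is a probability distribution over the next state.
base = [
    {(0, 0): [0.9, 0.1], (0, 1): [0.2, 0.8], (1, 0): [0.5, 0.5], (1, 1): [0.1, 0.9]},
    {(0, 0): [0.3, 0.7], (0, 1): [0.6, 0.4], (1, 0): [1.0, 0.0], (1, 1): [0.4, 0.6]},
]
theta = [0.25, 0.75]  # hypothetical parameter; here it acts as mixture weights


def phi(s, a, s_next):
    """Feature mapping on the triplet (state, action, next state)."""
    return [kernel[(s, a)][s_next] for kernel in base]


def transition(s, a, s_next):
    """P(s_next | s, a) as a linear function of the features: <phi, theta>."""
    return sum(f * t for f, t in zip(phi(s, a, s_next), theta))


rows = {
    (s, a): [transition(s, a, n) for n in (0, 1)]
    for s in (0, 1) for a in (0, 1)
}
for key, row in rows.items():
    print(key, row)
```

Because `theta` sums to one and each base kernel is a valid distribution, every row of the resulting kernel is again a valid distribution.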

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Los Angeles (0.29)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Maximum Likelihood Training of Score-Based Diffusion Models

Neural Information Processing Systems, Apr-24-2026, 15:17:19 GMT

Score-based diffusion models synthesize samples by reversing a stochastic process that diffuses data to noise, and are trained by minimizing a weighted combination of score matching losses. The log-likelihood of score-based diffusion models can be tractably computed through a connection to continuous normalizing flows, but log-likelihood is not directly optimized by the weighted combination of score matching losses. We show that for a specific weighting scheme, the objective upper bounds the negative log-likelihood, thus enabling approximate maximum likelihood training of score-based diffusion models. We empirically observe that maximum likelihood training consistently improves the likelihood of score-based diffusion models across multiple datasets, stochastic processes, and model architectures. Our best models achieve negative log-likelihoods of 2.83 and 3.76 bits/dim on CIFAR-10 and ImageNet 32×32 without any data augmentation, on a par with state-of-the-art autoregressive models on these tasks.
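
For readers unfamiliar with the bits/dim unit, the sketch below converts between a per-image negative log-likelihood in nats and bits per dimension. It assumes the usual convention that a CIFAR-10 image has 32 × 32 × 3 = 3072 dimensions; the function names are illustrative and not taken from the paper.

```python
import math


def nats_to_bits_per_dim(nll_nats, num_dims):
    """Convert a per-example NLL in nats to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))


def bits_per_dim_to_nats(bits_per_dim, num_dims):
    """Inverse conversion: bits per dimension to a per-example NLL in nats."""
    return bits_per_dim * num_dims * math.log(2)


CIFAR10_DIMS = 32 * 32 * 3  # assumed dimensionality of one CIFAR-10 image

# Per-image NLL implied by the 2.83 bits/dim figure quoted in the abstract.
nll_nats = bits_per_dim_to_nats(2.83, CIFAR10_DIMS)
print(f"{nll_nats:.1f} nats per image")
print(f"{nats_to_bits_per_dim(nll_nats, CIFAR10_DIMS):.2f} bits/dim")
```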

artificial intelligence, likelihood, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.90)

0a9fdbb17feb6ccb7ec405cfb85222c4-Paper.pdf

Neural Information Processing Systems, Apr-24-2026, 15:17:17 GMT

artificial intelligence, likelihood, machine learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.95)
Information Technology > Artificial Intelligence > Vision (0.93)

Riemannian Score-Based Generative Modelling

Neural Information Processing Systems, Apr-24-2026, 15:17:06 GMT

Score-based generative models (SGMs) are a powerful class of generative models that exhibit remarkable empirical performance. Score-based generative modelling (SGM) consists of a "noising" stage, whereby a diffusion is used to gradually add Gaussian noise to data, and a generative model, which entails a "denoising" process defined by approximating the time-reversal of the diffusion. Existing SGMs assume that data is supported on a Euclidean space, i.e. a manifold with flat geometry. In many domains such as robotics, geoscience or protein modelling, data is often naturally described by distributions living on Riemannian manifolds and current SGM techniques are not appropriate. We introduce here Riemannian Score-based Generative Models (RSGMs), a class of generative models extending SGMs to Riemannian manifolds. We demonstrate our approach on a variety of manifolds, and in particular with earth and climate science spherical data.
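
One way to picture a noising process on a curved space is a geodesic random walk on the unit sphere: sample a small Gaussian step in the tangent plane, then follow the geodesic with the exponential map so the point never leaves the manifold. The sketch below is a minimal illustration under that assumption. The step size, step count, and function names are hypothetical, and it is not presented as the paper's exact discretization.

```python
import math
import random


def exp_map_sphere(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along tangent v."""
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_v < 1e-12:
        return list(x)
    return [
        math.cos(norm_v) * xi + math.sin(norm_v) * vi / norm_v
        for xi, vi in zip(x, v)
    ]


def tangent_gaussian(x, scale, rng):
    """Sample a Gaussian in the ambient space and project it onto the tangent plane at x."""
    g = [rng.gauss(0.0, scale) for _ in x]
    dot = sum(gi * xi for gi, xi in zip(g, x))
    return [gi - dot * xi for gi, xi in zip(g, x)]


def noising_walk(x0, num_steps, step_size, rng):
    """Take `num_steps` geodesic random-walk steps starting from x0."""
    x = list(x0)
    for _ in range(num_steps):
        x = exp_map_sphere(x, tangent_gaussian(x, math.sqrt(step_size), rng))
    return x


rng = random.Random(0)
x_noised = noising_walk([0.0, 0.0, 1.0], num_steps=200, step_size=0.01, rng=rng)
print(x_noised, math.sqrt(sum(c * c for c in x_noised)))
```

Adding Gaussian noise directly in the ambient space would move the point off the sphere, which is why the update goes through the tangent plane and the exponential map.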

artificial intelligence, machine learning, manifold, (13 more...)

Neural Information Processing Systems

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.28)

Genre: Research Report (0.93)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

On the Value of Interaction and Function Approximation in Imitation Learning

Neural Information Processing Systems, Apr-24-2026, 14:56:03 GMT

We study the statistical guarantees for the Imitation Learning (IL) problem in episodic MDPs.

learner, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County (0.28)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.41)

Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection: Supplementary Material

Neural Information Processing Systems, Apr-24-2026, 14:15:29 GMT

We use the same notation as in Section 4.2. Denote by e_c a one-hot row vector of the true label. We define the hypothesis set that the genie is allowed to choose from as P_Θ = p_θ(y|x) = 1 2πσ^2 exp 1 2σ^2 y f(x_n^⊤θ) e_c^⊤. We simulate the response of the pNML regret for two classes (C = 2) and divide it by log C so that the regret is bounded between 0 and 1. Figure 1 shows the regret behaviour for different p_1 (the ERM probability assignment of class 1) as a function of x^⊤g. For an ERM model that is certain of its prediction (p_1 = 0.99, represented by the purple curve), a slight variation of x^⊤g causes a large response of the regret compared to p_1 values of 0.55 and 0.85. Next, we compute the correlation matrix of the training embeddings and perform an SVD. For the SVHN training set, most of the energy is located in the first 50 eigenvalues, after which there is a significant decrease of approximately 10^3. The same phenomenon is also seen in Figure 2a, which shows the eigenvalues of the ResNet-40 model.

artificial intelligence, imagenet, machine learning, (11 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.40)

Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection

Neural Information Processing Systems, Apr-24-2026, 14:15:25 GMT

Detecting out-of-distribution (OOD) samples is vital for developing machine learning based models for critical safety systems. Common approaches for OOD detection assume access to some OOD samples during training which may not be available in a real-life scenario. Instead, we utilize the predictive normalized maximum likelihood (pNML) learner, in which no assumptions are made on the tested input. We derive an explicit expression of the pNML and its generalization error, denoted as the regret, for a single layer neural network (NN). We show that this learner generalizes well when (i) the test vector resides in a subspace spanned by the eigenvectors associated with the large eigenvalues of the empirical correlation matrix of the training data, or (ii) the test sample is far from the decision boundary. Furthermore, we describe how to efficiently apply the derived pNML regret to any pretrained deep NN, by employing the explicit pNML for the last layer, followed by the softmax function. Applying the derived regret to deep NN requires neither additional tunable parameters nor extra data. We extensively evaluate our approach on 74 OOD detection benchmarks using DenseNet-100, ResNet-34, and WideResNet40 models trained with CIFAR-100, CIFAR-10, SVHN, and ImageNet-30 showing a significant improvement of up to 15.6% over recent leading methods.
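
The general pNML construction can be sketched in a few lines: for each candidate label, a "genie" model is fit as if that label were true, the resulting probabilities are normalized, and the log of the normalizer is the regret. The code below assumes the standard pNML definition and uses made-up genie probabilities for two classes. It does not reproduce the paper's closed-form single-layer expression, and the function name is illustrative.

```python
import math


def pnml(genie_probs):
    """Return the pNML prediction and regret.

    genie_probs[y] is the probability that the model refit with label y
    assigns to that same label y.
    """
    normalizer = sum(genie_probs)
    prediction = [p / normalizer for p in genie_probs]
    regret = math.log(normalizer)
    return prediction, regret


# Hypothetical genie probabilities for C = 2 classes.
confident_pred, low_regret = pnml([0.98, 0.05])    # only one label can be fit well
ambiguous_pred, high_regret = pnml([0.95, 0.90])   # either label can be fit well

num_classes = 2
normalized_regret = high_regret / math.log(num_classes)  # scaled into [0, 1]
print(confident_pred, low_regret)
print(ambiguous_pred, high_regret, normalized_regret)
```

A large regret means the model could justify several labels equally well for this input, which is the signal used to flag a sample as out-of-distribution.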

artificial intelligence, machine learning, pnml regret, (16 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
