AITopics | Learning Graphical Models

Collaborating Authors

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Learning Latent Variable Models via Jarzynski-adjusted Langevin Algorithm

Neural Information Processing SystemsJun-20-2026, 09:21:27 GMT

We utilise a sampler originating from nonequilibrium statistical mechanics, termed here Jarzynski-adjusted Langevin algorithm (JALA), to build statistical estimation methods in latent variable models. We achieve this by leveraging Jarzynski's equality and developing algorithms based on a weighted version of the unadjusted Langevin algorithm (ULA) with recursively updated weights. Adapting this for latent variable models, we develop a sequential Monte Carlo (SMC) method that provides the maximum marginal likelihood estimate of the parameters, termed JALA-EM. Under suitable regularity assumptions on the marginal likelihood, we provide a nonasymptotic analysis of the JALA-EM scheme implemented with stochastic gradient descent and show that it provably converges to the maximum marginal likelihood estimate. We demonstrate the performance of JALA-EM on a variety of latent variable models and show that it performs comparably to existing methods in terms of accuracy and computational efficiency. Importantly, the ability to recursively estimate marginal likelihoods--an uncommon feature among scalable methods--makes our approach particularly suited for model selection, which we validate through dedicated experiments.

artificial intelligence, bayesian inference, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Europe (0.45)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.68)

Add feedback

ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

Neural Information Processing SystemsJun-20-2026, 07:41:35 GMT

We propose ReinFlow, a simple yet effective online reinforcement learning (RL) framework that fine-tunes a family of flow matching policies for continuous robotic control. Derived from rigorous RL theory, ReinFlow injects learnable noise into a flow policy's deterministic path, converting the flow into a discrete-time Markov Process for exact and straightforward likelihood computation. This conversion facilitates exploration and ensures training stability, enabling ReinFlow to fine-tune diverse flow model variants stably, including Rectified Flow [34] and Shortcut Models [18], particularly at very few or even one denoising step.

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

Asia (0.67)
North America > United States (0.67)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Add feedback

EDELINE: Enhancing Memory in Diffusion-based World Models via Linear-Time Sequence Modeling

Neural Information Processing SystemsJun-20-2026, 05:37:07 GMT

World models represent a promising approach for training reinforcement learning agents with significantly improved sample efficiency. While most world model methods primarily rely on sequences of discrete latent variables to model environment dynamics, this compression often neglects critical visual details essential for reinforcement learning. Recent diffusion-based world models condition generation on a fixed context length of frames to predict the next observation, using separate recurrent neural networks to model rewards and termination signals. Although this architecture effectively enhances visual fidelity, the fixed context length approach inherently limits memory capacity. In this paper, we introduce EDELINE, a unified world model architecture that integrates state space models with diffusion models. Our approach outperforms existing baselines across visually challenging Atari 100k tasks, memory-demanding Crafter benchmark, and 3D first-person ViZDoom environments, demonstrating superior performance in all these diverse challenges.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Promising Solution (0.65)

Industry:

Information Technology (1.00)
Health & Medicine (0.82)
Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Non-monotone Submodular Optimization: p-Matchoid Constraints and Fully Dynamic Setting

Neural Information Processing SystemsJun-20-2026, 00:23:15 GMT

Submodular maximization subject to a p-matchoid constraint has various applications in machine learning, particularly in tasks such as feature selection, video and text summarization, movie recommendation, graph-based learning, and constraintbased optimization. We study this problem in the dynamic setting, where a sequence of insertions and deletions of elements to a p-matchoid M(V,I) occurs over time and the goal is to efficiently maintain an approximate solution. We propose a dynamic algorithm for non-monotone submodular maximization under a p-matchoid constraint. For a p-matchoid M(V,I) of rank k, defined by a collection of m matroids, our algorithm guarantees a (2p +2 p p(p +1) +1 +ϵ)-approximate solution at any time t in the update sequence, with an expected amortized query complexity of O(ϵ 3 pk4 log2(k)) per update.

information retrieval, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > Maryland (0.28)
North America > United States > California > Los Angeles County (0.28)
North America > Canada > British Columbia (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.48)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.34)

Add feedback

Cross-fluctuation phase transitions reveal sampling dynamics in diffusion models

Neural Information Processing SystemsJun-20-2026, 00:07:14 GMT

We analyse how the sampling dynamics of distributions evolve in score-based diffusion models using cross-fluctuations, a centered-moment statistic from statistical physics. Specifically, we show that starting from an unbiased isotropic normal distribution, samples undergo sharp, discrete transitions, eventually forming distinct events of a desired distribution while progressively revealing finer structure. As this process is reversible, these transitions also occur in reverse, where intermediate states progressively merge, tracing a path back to the initial distribution. We demonstrate that these transitions can be detected as discontinuities in nth-order cross-fluctuations. For variance-preserving SDEs, we derive a closed-form for these cross-fluctuations that is efficiently computable for the reverse trajectory. We find that detecting these transitions directly boosts sampling efficiency, accelerates class-conditional and rare-class generation, and improves two zero-shot tasks-image classification and style transfer-without expensive grid search or retraining. We also show that this viewpoint unifies classical coupling and mixing from finite Markov chains with continuous dynamics while extending to stochastic SDEs and non Markovian samplers.

diffusion model, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.67)
North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Banking & Finance (0.67)
Law (0.67)
Media (0.45)
Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Reinforcement Learning with Imperfect Transition Predictions: ABellman-Jensen Approach

Neural Information Processing SystemsJun-19-2026, 21:53:05 GMT

Traditional reinforcement learning (RL) assumes the agents make decisions based on Markov decision processes (MDPs) with one-step transition models. In many real-world applications, such as energy management and stock investment, agents can access multi-step predictions of future states, which provide additional advantages for decision making. However, multi-step predictions are inherently high-dimensional: naively embedding these predictions into an MDP leads to an exponential blow-up in state space and the curse of dimensionality. Moreover, existing RL theory provides few tools to analyze prediction-augmented MDPs, as it typically works on one-step transition kernels and cannot accommodate multi-step predictions with errors or partial action-coverage. We address these challenges with three key innovations: First, we propose the Bayesian value function to characterize the optimal prediction-aware policy tractably. Second, we develop a novel BellmanJensen Gap analysis on the Bayesian value function, which enables characterizing the value of imperfect predictions. Third, we introduce BOLA (Bayesian Offline Learning with Online Adaptation), a two-stage model-based RL algorithm that separates offline Bayesian value learning from lightweight online adaptation to real-time predictions. We prove that BOLA remains sample-efficient even under imperfect predictions.

machine learning, prediction, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States (0.45)
Asia > China (0.28)
Europe > United Kingdom (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Energy > Power Industry (1.00)
Banking & Finance > Trading (1.00)
Energy > Renewable > Wind (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.66)

Add feedback

Epistemic Uncertainty for Generated Image Detection

Neural Information Processing SystemsJun-19-2026, 20:31:45 GMT

We introduce a novel framework for AI-generated image detection through epistemic uncertainty, aiming to address critical security concerns in the era of generative models. Our key insight stems from the observation that distributional discrepancies between training and testing data manifest distinctively in the epistemic uncertainty space of machine learning models. In this context, the distribution shift between natural and generated images leads to elevated epistemic uncertainty in models trained on natural images when evaluating generated ones. Hence, we exploit this phenomenon by using epistemic uncertainty as a proxy for detecting generated images. This converts the challenge of generated image detection into the problem of uncertainty estimation, underscoring the generalization performance of the model used for uncertainty estimation. Fortunately, advanced large-scale vision models pre-trained on extensive natural images have shown excellent generalization performance for various scenarios. Thus, we utilize these pre-trained models to estimate the epistemic uncertainty of images and flag those with high uncertainty as generated. Extensive experiments demonstrate the efficacy of our method. Code is available at https://github.com/tmlr-group/WePe.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Add feedback

AMarkov Decision Process for Variable Selection in Branch & Bound

Neural Information Processing SystemsJun-19-2026, 20:18:09 GMT

Mixed-Integer Linear Programming (MILP) is a powerful framework used to address a wide range of NP-hard combinatorial optimization problems, often solved by Branch and bound (B&B). A key factor influencing the performance of B&B solvers is the variable selection heuristic governing branching decisions. Recent contributions have sought to adapt reinforcement learning (RL) algorithms to the B&B setting to learn optimal branching policies, through Markov Decision Processes (MDP) inspired formulations, and ad hoc convergence theorems and algorithms. In this work, we introduce BBMDP, a principled vanilla MDP formulation for variable selection in B&B, allowing to leverage a broad range of RL algorithms for the purpose of learning optimal B&B heuristics. Computational experiments validate our model empirically, as our branching agent outperforms prior state-of-the-art RL agents on four standard MILP benchmarks.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country: Europe (0.46)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Structural Causal Bandits under Markov Equivalence

Neural Information Processing SystemsJun-19-2026, 16:47:51 GMT

In decision-making processes, an intelligent agent with causal knowledge can optimize action spaces to avoid unnecessary exploration. A structural causal bandit framework provides guidance on how to prune actions that are unable to maximize reward by leveraging prior knowledge of the underlying causal structure among actions. A key assumption of this framework is that the agent has access to a fully-specified causal diagram representing the target system. In this paper, we extend the structural causal bandits to scenarios where the agent leverages a Markov equivalence class. In such cases, the causal structure is provided to the agent in the form of a partial ancestral graph (PAG). We propose a generalized framework for identifying potentially optimal actions within this graph structure, thereby broadening the applicability of structural causal bandits.

artificial intelligence, causal diagram, machine learning, (15 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Education (0.45)

Technology:

Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)

Add feedback

Data Mixture Optimization: AMulti-fidelity Multi-scale Bayesian Framework

Neural Information Processing SystemsJun-19-2026, 16:35:09 GMT

Careful curation of data sources can significantly improve the performance of LLM pre-training, but predominant approaches rely heavily on intuition or costly trial-and-error, making them difficult to generalize across different data domains and downstream tasks. Although scaling laws can provide a principled and general approach for data curation, standard deterministic extrapolation from small-scale experiments to larger scales requires strong assumptions on the reliability of such extrapolation, whose brittleness has been highlighted in prior works. In this paper, we introduce a probabilistic extrapolation framework for data mixture optimization that avoids rigid assumptions and explicitly models the uncertainty in performance across decision variables. We formulate data curation as a sequential decisionmaking problem--multi-fidelity, multi-scale Bayesian optimization--where {data mixtures, model scale, training steps}are adaptively selected to balance training cost and potential information gain. Our framework naturally gives rise to algorithm prototypes that leverage noisy information from inexpensive experiments to systematically inform costly training decisions. To accelerate methodological progress, we build a simulator based on 472 language model pre-training runs with varying data compositions from the SlimPajama dataset. We observe that even simple kernels and acquisition functions can enable principled decisions across training models from 20M to 1B parameters and achieve 2.6x and 3.3x speedups compared to multi-fidelity Bayesian optimization and random search baselines. Taken together, our framework underscores potential efficiency gains achievable by developing principled and transferable data mixture optimization methods.

data mixture, large language model, machine learning, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)
Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
(2 more...)

Add feedback