AITopics | Learning Graphical Models

Collaborating Authors

Learning Graphical Models

A graphical model or probabilistic graphical model (PGM) or structured probabilistic model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. They are commonly used in probability theory, statistics—particularly Bayesian statistics—and machine learning. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Automatic Data Augmentation for Generalization in Reinforcement Learning

Neural Information Processing SystemsApr-25-2026, 06:15:56 GMT

Deep reinforcement learning (RL) agents often fail to generalize beyond their training environments. To alleviate this problem, recent work has proposed the use of data augmentation. However, different tasks tend to benefit from different types of augmentations and selecting the right one typically requires expert knowledge. In this paper, we introduce three approaches for automatically finding an effective augmentation for any RL task. These are combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for actor-critic algorithms. Our method achieves a new state-of-the-art1on the Procgen benchmark and outperforms popular RL algorithms on DeepMind Control tasks with distractors. In addition, our agent learns policies and representations which are more robust to changes in the environment that are irrelevant for solving the task, such as the background.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Equilibrium and non-Equilibrium regimes in the learning of Restricted Boltzmann Machines

Neural Information Processing SystemsApr-25-2026, 06:02:54 GMT

Training Restricted Boltzmann Machines (RBMs) has been challenging for a long time due to the difficulty of computing precisely the log-likelihood gradient. Over the past decades, many works have proposed more or less successful training recipes but without studying the crucial quantity of the problem: the mixing time, i.e. the number of Monte Carlo iterations needed to sample new configurations from a model. In this work, we show that this mixing time plays a crucial role in the dynamics and stability of the trained model, and that RBMs operate in two well-defined regimes, namely equilibrium and out-of-equilibrium, depending on the interplay between this mixing time of the model and the number of steps, k, used to approximate the gradient. We further show empirically that this mixing time increases with the learning, which often implies a transition from one regime to another as soon as kbecomes smaller than this time. In particular, we show that using the popular k (persistent) contrastive divergence approaches, with k small, the dynamics of the learned model are extremely slow and often dominated by strong out-of-equilibrium effects. On the contrary, RBMs trained in equilibrium display faster dynamics, and a smooth convergence to dataset-like configurations during the sampling. Finally we discuss how to exploit in practice both regimes depending on the task one aims to fulfill: (i) short k can be used to generate convincing samples in short learning times, (ii) large k (or increasingly large) is needed to learn the correct equilibrium distribution of the RBM. Finally, the existence of these two operational regimes seems to be a general property of energy based models trained via likelihood maximization.

artificial intelligence, machine learning, rbm, (18 more...)

Neural Information Processing Systems

Country: Europe > Spain (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

29e4b51d45dc8f534260adc45b587363-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 05:30:15 GMT

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

285a25c17f351708754cdb6d56f3962e-Paper.pdf

Neural Information Processing SystemsApr-25-2026, 04:58:30 GMT

artificial intelligence, data mining, machine learning, (19 more...)

Neural Information Processing Systems

Country: North America > United States > Texas (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(3 more...)

Add feedback

28f699175783a2c828ae74d53dd3da20-Paper-Conference.pdf

Neural Information Processing SystemsApr-25-2026, 04:57:36 GMT

Recent years have seen embodied visual navigation advance in two distinct directions: (i) in equipping the AI agent to follow natural language instructions, and (ii) in making the navigable world multimodal, e.g., audio-visual navigation. However, the real world is not only multimodal, but also often complex, and thus in spite of these advances, agents still need to understand the uncertainty in their actions and seek instructions to navigate.

artificial intelligence, machine learning, natural language, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Surprise Minimizing Multi-Agent Learning with Energy-based Models

Neural Information Processing SystemsApr-25-2026, 04:40:13 GMT

Multi-Agent Reinforcement Learning (MARL) has demonstrated significant success by virtue of collaboration across agents. Recent work, on the other hand, introduces surprise which quantifies the degree of change in an agent's environment. Surprise-based learning has received significant attention in the case of single-agent entropic settings but remains an open problem for fast-paced dynamics in multi-agent scenarios. A potential alternative to address surprise may be realized through the lens of free-energy minimization. We explore surprise minimization in multi-agent learning by utilizing the free energy across all agents in a multi-agent system. A temporal Energy-Based Model (EBM) represents an estimate of surprise which is minimized over the joint agent distribution. Our formulation of the EBM is theoretically akin to the minimum conjugate entropy objective and highlights suitable convergence towards minimum surprising states.

agent, artificial intelligence, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology: