AITopics | maximum entropy inverse reinforcement learning



Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

Neural Information Processing Systems | Dec-24-2025, 16:00:14 GMT

We present a maximum entropy inverse reinforcement learning (IRL) approach for improving the sample quality of diffusion generative models, especially when the number of generation time steps is small. Similar to how IRL trains a policy based on the reward function learned from expert demonstrations, we train (or fine-tune) a diffusion model using the log probability density estimated from training data. Since we employ an energy-based model (EBM) to represent the log density, our approach boils down to the joint training of a diffusion model and an EBM. Our IRL formulation, named Diffusion by Maximum Entropy IRL (DxMI), is a minimax problem that reaches equilibrium when both models converge to the data distribution. The entropy maximization plays a key role in DxMI, facilitating the exploration of the diffusion model and ensuring the convergence of the EBM. We also propose Diffusion by Dynamic Programming (DxDP), a novel reinforcement learning algorithm for diffusion models, as a subroutine in DxMI. DxDP makes the diffusion model update in DxMI efficient by transforming the original problem into an optimal control formulation where value functions replace back-propagation in time. Our empirical studies show that diffusion models fine-tuned using DxMI can generate high-quality samples in as few as 4 and 10 steps. Additionally, DxMI enables the training of an EBM without MCMC, stabilizing EBM training dynamics and enhancing anomaly detection performance.
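As a rough sketch of the kind of minimax problem the abstract describes, the standard maximum entropy IRL objective can be written with an energy function in place of a negative reward. The symbols below ($E$ for the EBM energy, $\pi$ for the sampler distribution, $p$ for the data distribution, $\mathcal{H}$ for entropy) are notation assumed here for illustration and are not necessarily the paper's; any temperature coefficient and the per-timestep structure of the diffusion model are omitted.

```latex
\min_{E}\;\max_{\pi}\;
  \mathbb{E}_{x \sim p}\!\left[ E(x) \right]
  - \mathbb{E}_{x \sim \pi}\!\left[ E(x) \right]
  + \mathcal{H}(\pi)
% Inner problem, for fixed E:
%   \max_{\pi} \; -\mathbb{E}_{\pi}[E] + \mathcal{H}(\pi) = \log Z(E),
%   attained at \pi^{*}(x) = \exp(-E(x)) / Z(E), with Z(E) = \int \exp(-E(x))\,dx.
% Outer problem, after substituting \pi^{*}:
%   \min_{E} \; \mathbb{E}_{p}[E] + \log Z(E)
%   = \min_{E} \; -\mathbb{E}_{x \sim p}\!\left[ \log \frac{\exp(-E(x))}{Z(E)} \right].
```

Under these assumptions the outer problem is the negative log-likelihood of the data under the EBM, which is consistent with the abstract's statement that equilibrium is reached when both models match the data distribution. The entropy term is what makes the inner optimum a full distribution rather than a point mass on the lowest-energy sample.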

artificial intelligence, machine learning, maximum entropy inverse reinforcement learning, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.88)

Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

Neural Information Processing Systems | May-26-2025, 19:52:54 GMT

diffusion model, machine learning, reinforcement learning, (6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.88)

Efficient Sampling-Based Maximum Entropy Inverse Reinforcement Learning with Application to Autonomous Driving

Wu, Zheng, Sun, Liting, Zhan, Wei, Yang, Chenyu, Tomizuka, Masayoshi

arXiv.org Artificial Intelligence | Jun-21-2020

In the past decades, we have witnessed significant progress in the domain of autonomous driving. Advanced techniques based on optimization and reinforcement learning (RL) have become increasingly powerful at solving the forward problem: given designed reward/cost functions, how should we optimize them to obtain driving policies that interact with the environment safely and efficiently? Such progress has raised another equally important question: what should we optimize? Instead of manually specifying the reward functions, we would like to extract what human drivers try to optimize from real traffic data and assign it to autonomous vehicles, enabling more naturalistic and transparent interaction between humans and intelligent agents. To address this issue, we present an efficient sampling-based maximum-entropy inverse reinforcement learning (IRL) algorithm. Unlike existing IRL algorithms, the proposed algorithm introduces an efficient continuous-domain trajectory sampler, which lets it learn the reward functions directly in the continuous domain while accounting for the uncertainties in trajectories demonstrated by human drivers. We evaluate the proposed algorithm on real driving data, including both non-interactive and interactive scenarios. The experimental results show that the proposed algorithm achieves more accurate prediction, faster convergence, and better generalization than the baseline IRL algorithms.
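To make the idea of sampling-based maximum-entropy IRL concrete, here is a minimal generic sketch, not the authors' algorithm. It assumes a linear reward `theta @ features`, trajectory feature vectors that have already been computed, and a fixed set of sampled trajectories standing in for the continuous-domain sampler the abstract mentions. The function names `maxent_irl_step` and `fit_reward` are hypothetical.

```python
import numpy as np


def maxent_irl_step(theta, demo_feats, sample_feats, lr=0.1):
    """One gradient-ascent step on the sampled MaxEnt IRL log-likelihood.

    theta        : (d,) weights of the assumed linear reward
    demo_feats   : (n_demo, d) feature vectors of demonstrated trajectories
    sample_feats : (n_samples, d) feature vectors of sampled trajectories
    """
    # Reward of each sampled trajectory under the current weights.
    rewards = sample_feats @ theta
    # Softmax over the sample set approximates p(trajectory | theta);
    # subtracting the max keeps the exponentials numerically stable.
    weights = np.exp(rewards - rewards.max())
    weights /= weights.sum()
    # Expected features under the current model, estimated from the samples.
    expected_feats = weights @ sample_feats
    # Log-likelihood gradient: demonstrated minus expected features.
    grad = demo_feats.mean(axis=0) - expected_feats
    return theta + lr * grad, grad


def fit_reward(demo_feats, sample_feats, steps=500, lr=0.1):
    """Run repeated gradient steps starting from zero weights."""
    theta = np.zeros(demo_feats.shape[1])
    grad = np.zeros_like(theta)
    for _ in range(steps):
        theta, grad = maxent_irl_step(theta, demo_feats, sample_feats, lr)
    return theta, grad
```

The update moves the reward weights until the feature expectation under the sampled trajectory distribution matches the demonstrations, which is the usual MaxEnt IRL optimality condition. In this sketch the partition function is approximated only over the given samples, so the quality of the learned reward depends on how well the sampler covers plausible trajectories.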

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv:2006.13704

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.61)

Learning from humans: what is inverse reinforcement learning?

#artificialintelligence | Apr-20-2020, 00:32:30 GMT

One of the goals of AI research is to teach machines how to do the same things people do, but better. In the early 2000s, this meant focusing on problems like flying helicopters and walking up flights of stairs. However, there's still a massive list of problems where humans outperform machines. Although we can no longer claim to beat machines at tasks like Go and image classification, we have a distinct advantage in solving problems that aren't as well-defined, like judging a well-executed backflip, cleaning a room while preventing accidents, and perhaps the most human problem of all: reasoning about people's values. Since all these tasks contain some degree of subjectivity, machines need information about the world as well as a way to learn about the people within it in order to solve these problems.

inverse reinforcement learning, reinforcement learning, reward function, (11 more...)

Country: Asia > Middle East > Jordan (0.05)

Industry: Transportation > Air (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Multi-task Maximum Entropy Inverse Reinforcement Learning

Gleave, Adam, Habryka, Oliver

arXiv.org Artificial Intelligence | May-22-2018

Multi-task Inverse Reinforcement Learning (IRL) is the problem of inferring multiple reward functions from expert demonstrations. Prior work, built on Bayesian IRL, is unable to scale to complex environments due to computational constraints. This paper contributes the first formulation of multi-task IRL in the more computationally efficient Maximum Causal Entropy (MCE) IRL framework. Experiments show our approach can perform one-shot imitation learning in a gridworld environment that single-task IRL algorithms require hundreds of demonstrations to solve. Furthermore, we outline how our formulation can be applied to state-of-the-art MCE IRL algorithms such as Guided Cost Learning. This extension, based on meta-learning, could enable multi-task IRL to be performed for the first time in high-dimensional, continuous state MDPs with unknown dynamics as commonly arise in robotics.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv:1805.08882

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.42)
