AITopics | imitation

Fast Rates for Inverse Reinforcement Learning

We establish novel structural and statistical results for entropy-regularized min-max inverse reinforcement learning (Min-Max-IRL) with linear reward classes in finite-horizon MDPs with Borel state and action spaces. On the structural side, we show that maximum likelihood estimation (MLE) and Min-Max-IRL are equivalent at the population level, and at the empirical level under deterministic dynamics. On the statistical side, exploiting pseudo-self-concordance of the Min-Max-IRL loss, we prove that both the trajectory-level KL divergence and the squared parameter error in the Hessian norm decay at the fast rate $\mathcal{O}(n^{-1})$, where $n$ is the number of expert trajectories. Our guarantees apply under misspecification and require no exploration assumptions. We further extend reward-identifiability results to general Borel spaces and derive novel results on the derivatives of the soft-optimal value function with respect to reward parameters.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Machine Learning

2605.14599

Genre: Research Report (0.64)

Learning non-Markovian Decision-Making from State-only Sequences

Neural Information Processing Systems | May-1-2026, 02:02:59 GMT

Conventional imitation learning assumes access to the actions of demonstrators, but these motor signals are often non-observable in naturalistic settings.

machine learning, reinforcement learning, transition, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Compositional Plan Vectors

Coline Devin, Daniel Geng, Pieter Abbeel, Trevor Darrell, Sergey Levine

Neural Information Processing Systems | Apr-30-2026, 19:24:19 GMT

Autonomous agents situated in real-world environments must be able to master large repertoires of skills. While a single short skill can be learned quickly, it would be impractical to learn every task independently. Instead, the agent should share knowledge across behaviors such that each task can be learned efficiently, and such that the resulting model can generalize to new tasks, especially ones that are compositions or subsets of tasks seen previously. A policy conditioned on a goal or demonstration has the potential to share knowledge between tasks if it sees enough diversity of inputs. However, these methods may not generalize to a more complex task at test time. We introduce compositional plan vectors (CPVs) to enable a policy to perform compositions of tasks without additional supervision. CPVs represent trajectories as the sum of the subtasks within them. We show that CPVs can be learned within a one-shot imitation learning framework without any additional supervision or information about task hierarchy, and enable a demonstration-conditioned policy to generalize to tasks that sequence twice as many skills as the tasks seen during training. Analogously to embeddings such as word2vec in NLP, CPVs can also support simple arithmetic operations - for example, we can add the CPVs for two different tasks to command an agent to compose both tasks, without any additional training.
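
The additive property described in the abstract can be illustrated with a toy sketch. It assumes a fixed lookup table of subtask embeddings (`EMBED`) in place of the learned encoder, and the subtask names are hypothetical; it shows only the arithmetic, not the training procedure.

```python
import numpy as np

# Hypothetical subtask embeddings. In the paper these come from a learned
# encoder; a random lookup table stands in for it here.
rng = np.random.default_rng(0)
SUBTASKS = ["chop_tree", "make_bread", "eat_bread"]
EMBED = {name: rng.normal(size=8) for name in SUBTASKS}


def plan_vector(subtasks):
    """Toy CPV: the sum of the embeddings of the subtasks in a trajectory."""
    return np.sum([EMBED[s] for s in subtasks], axis=0)


task_a = ["chop_tree"]
task_b = ["make_bread", "eat_bread"]

# Adding the CPVs of two tasks gives the CPV of the composed task.
composed = plan_vector(task_a) + plan_vector(task_b)
direct = plan_vector(task_a + task_b)
additive = bool(np.allclose(composed, direct))
```

Because the representation is a plain sum, it does not depend on the order of the subtasks in this toy version.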

machine learning, reinforcement learning, trajectory, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

00989c20ff1386dc386d8124ebcba1a5-AuthorFeedback.pdf

Neural Information Processing Systems | Apr-30-2026, 19:24:05 GMT

artificial intelligence, machine learning, waypoint, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.35)

ee90fb9511b263f2ff971be9b374f9ee-Paper-Conference.pdf

Neural Information Processing Systems | Apr-30-2026, 05:47:25 GMT

arxiv preprint arxiv, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.46)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

2567c95fd41459a98a73ba893775d22a-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-26-2026, 00:07:25 GMT

machine learning, natural language, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

2567c95fd41459a98a73ba893775d22a-Paper-Conference.pdf

Neural Information Processing Systems | Apr-26-2026, 00:07:22 GMT

machine learning, natural language, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

31839b036f63806cba3f47b93af8ccb5-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 09:19:44 GMT

artificial intelligence, machine learning, reinforcement learning, (14 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.97)

Imitation with Neural Density Models

Neural Information Processing Systems | Apr-25-2026, 06:03:31 GMT

We propose a new framework for Imitation Learning (IL) via density estimation of the expert's occupancy measure followed by Maximum Occupancy Entropy Reinforcement Learning (RL) using the density as a reward. Our approach maximizes a non-adversarial model-free RL objective that provably lower bounds reverse Kullback-Leibler divergence between occupancy measures of the expert and imitator. We present a practical IL algorithm, Neural Density Imitation (NDI), which obtains state-of-the-art demonstration efficiency on benchmark control tasks.
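
The first stage of the pipeline in the abstract, estimating the expert's density and using it as a reward, might look roughly like the sketch below. It rests on several assumptions: a Gaussian kernel density estimate stands in for the neural density model, the reward is taken to be the log-density, and the function name `fit_log_density` and the bandwidth are illustrative. The subsequent RL stage is omitted.

```python
import numpy as np


def fit_log_density(expert_sa, bandwidth=0.5):
    """Return a function giving the log of a Gaussian KDE fitted to expert
    (state, action) samples. A stand-in for a learned density model."""
    expert_sa = np.asarray(expert_sa, dtype=float)
    n, d = expert_sa.shape
    # Log of the kernel normalizer and the 1/n mixture weight.
    log_norm = -0.5 * d * np.log(2.0 * np.pi * bandwidth**2) - np.log(n)

    def log_density(x):
        x = np.atleast_2d(np.asarray(x, dtype=float))
        # Squared distances from each query point to each expert sample.
        sq = ((x[:, None, :] - expert_sa[None, :, :]) ** 2).sum(axis=-1)
        z = -0.5 * sq / bandwidth**2
        # Numerically stable log-sum-exp over the expert samples.
        m = z.max(axis=1, keepdims=True)
        return m[:, 0] + np.log(np.exp(z - m).sum(axis=1)) + log_norm

    return log_density


rng = np.random.default_rng(0)
expert = 0.3 * rng.normal(size=(200, 2))  # synthetic expert samples (assumed)
reward = fit_log_density(expert)

reward_near = reward([[0.0, 0.0]])[0]  # close to the expert data
reward_far = reward([[5.0, 5.0]])[0]   # far from the expert data
```

State-action pairs that resemble the expert's receive a higher reward than pairs far from the expert data, which is the signal an imitator would then maximize.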

arxiv preprint arxiv, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country: North America (0.28)

Industry: Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

204904e461002b28511d5880e1c36a0f-Supplemental.pdf

Neural Information Processing Systems | Apr-25-2026, 01:33:23 GMT

Similarly to [6], we consider that all environments have the same underlying Structural Causal Model (SCM) and that the different environments correspond to different interventions on the SCM. We provide here the formal definition for SCMs and interventions. We say that $X_i$ causes $X_j$ if $X_i \in \mathrm{Pa}(X_j)$. Definition A.2 (Intervention) [6]: Consider an SCM $\mathcal{C} = (S, N)$. An intervention $e$ on $\mathcal{C}$ consists of replacing one or several of its structural equations to obtain an intervened SCM $\mathcal{C}^e = (S^e, N^e)$ with structural equations $S^e_j : X^e_j \leftarrow f_j(\mathrm{Pa}(X^e_j), N^e_j)$ for $j = 1, \dots, m$ (11). The variable $X^e$ is intervened on if $S_i \neq S^e_i$ or $N_i \neq N^e_i$.
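
A minimal sketch of the intervention definition above, under stated assumptions: the two-variable SCM, the helper names `sample` and `intervene`, and the noise distributions are all hypothetical, and the intervention replaces a structural equation while leaving the noise unchanged.

```python
import numpy as np


def sample(equations, noise, n, seed=0):
    """Draw n samples from an SCM whose equations are listed in causal order.
    Each entry maps a variable to (parents, f), with X <- f(*parents, N)."""
    rng = np.random.default_rng(seed)
    values = {}
    for name, (parents, f) in equations.items():
        eps = noise[name](rng, n)
        values[name] = f(*[values[p] for p in parents], eps)
    return values


def intervene(equations, replacements):
    """Return a new SCM with some structural equations replaced."""
    new_equations = dict(equations)  # the original SCM is left untouched
    new_equations.update(replacements)
    return new_equations


# Toy SCM: X1 <- N1, X2 <- 2 * X1 + N2.
equations = {
    "X1": ((), lambda n1: n1),
    "X2": (("X1",), lambda x1, n2: 2.0 * x1 + n2),
}
noise = {
    "X1": lambda rng, n: rng.normal(size=n),
    "X2": lambda rng, n: 0.1 * rng.normal(size=n),
}

# Intervention e: replace the structural equation of X1 with the constant 3.
intervened = intervene(equations, {"X1": ((), lambda n1: np.full_like(n1, 3.0))})

observational = sample(equations, noise, 1000)
environment_e = sample(intervened, noise, 1000)
```

In this sketch only `X1` is intervened on, since its structural equation differs between the two SCMs. The distribution of `X2` shifts as a downstream effect, but its own equation and noise are unchanged.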

artificial intelligence, different environment, machine learning, (17 more...)

Neural Information Processing Systems

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.91)

Fast Rates for Inverse Reinforcement Learning
