AITopics | pieter abbeel

We will use the well-known Performance Difference Lemma [16] in our analysis. We can obtain a performance difference lemma for the meta-policies as follows. Here, (a) follows from Assumption 3.1, from which we have P ...

In this section, we describe all the simulation and real-world environments in detail. B.1 Simulation Environments. Point 2D Navigation: Point 2D Navigation [9] is a 2-dimensional goal-reaching environment with $\mathcal{S} \subset \mathbb{R}^2$, $\mathcal{A} \subset \mathbb{R}^2$, and the following dynamics: $x_{t+1} = x_t + dx_t$, $y_{t+1} = y_t + dy_t$, such that $dx_t^2 + dy_t^2 \le 0.1^2$, where $x_t$ and $y_t$ are the x and y location of the agent, and $dx_t$ and $dy_t$ are the actions taken, which correspond to the displacement in the x and y direction respectively, all at time step $t$. The goals are located on a semicircle of radius 2, and the episode terminates when the agent reaches the goal or spends more than 100 time steps in the environment.
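The Point 2D Navigation dynamics described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' environment code: the function names, the goal tolerance `GOAL_TOLERANCE`, the choice of the upper semicircle, the rescaling of over-long actions, and the exact step count at which the time limit triggers are all assumptions made here for the example.

```python
import math

MAX_STEP = 0.1        # action-norm bound from the text: dx^2 + dy^2 <= 0.1^2
GOAL_RADIUS = 2.0     # goals lie on a semicircle of radius 2
MAX_STEPS = 100       # episode ends after more than 100 time steps
GOAL_TOLERANCE = 0.1  # assumption: distance at which the goal counts as reached


def sample_goal(angle):
    """Return a goal on the upper semicircle (assumed) for an angle in [0, pi]."""
    return (GOAL_RADIUS * math.cos(angle), GOAL_RADIUS * math.sin(angle))


def clip_action(dx, dy):
    """Rescale the action onto the disc of radius MAX_STEP if it is too long."""
    norm = math.hypot(dx, dy)
    if norm > MAX_STEP:
        scale = MAX_STEP / norm
        return dx * scale, dy * scale
    return dx, dy


def step(state, action, goal, t):
    """Apply x_{t+1} = x_t + dx_t, y_{t+1} = y_t + dy_t and check termination.

    `t` is the number of steps already taken before this call.
    """
    x, y = state
    dx, dy = clip_action(*action)
    next_state = (x + dx, y + dy)
    reached = math.dist(next_state, goal) <= GOAL_TOLERANCE
    done = reached or (t + 1) > MAX_STEPS
    return next_state, reached, done
```

Calling `step((0.0, 0.0), (3.0, 4.0), sample_goal(math.pi / 2), 0)` rescales the oversized action to length 0.1 before moving the agent, so the displacement constraint holds regardless of what the policy outputs.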

arxiv preprint arxiv, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)

122f45f4d451617ac87adf7024ee14cd-Paper-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 16:31:20 GMT

demonstration data, machine learning, reinforcement learning, (13 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.99)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

10a6bdcabbd5a3d36b760daa295f63c1-Paper-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 15:32:28 GMT

machine learning, natural language, reinforcement learning, (15 more...)

Neural Information Processing Systems

Industry:

Leisure & Entertainment > Games (0.68)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Data Science (0.93)
(3 more...)

Regularizing Trajectory Optimization with Denoising Autoencoders

Rinu Boney, Norman Di Palo, Mathias Berglund, Alexander Ilin, Juho Kannala, Antti Rasmus, Harri Valpola

Neural Information Processing Systems, Feb-19-2026, 10:53:55 GMT

Neural Information Processing Systems http://nips.cc/

international conference, optimization, trajectory optimization, (15 more...)

Neural Information Processing Systems

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Energy (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

0e9b734aa25ca8096cb7b56dc0dd8929-Paper.pdf

Neural Information Processing Systems, Feb-18-2026, 22:31:39 GMT

agent, baseline, experiment, (14 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Germany > Berlin (0.04)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)

Doubly Robust Augmented Transfer for Meta-Reinforcement Learning

Anonymous Authors

Neural Information Processing Systems, Feb-17-2026, 22:42:21 GMT

... RL problems through the idea of "learning to learn". Current meta-RL methods can be classified into two categories. These methods mainly differ in their ways of inference [3, 4, 20]. The other line follows the technique of relabeling that enables sample reuse across tasks, i.e., learning a task ... Packer et al. apply hindsight relabeling for meta-RL, and propose hindsight task relabeling (HTR) to relabel the trajectories ... Taking a step further than hindsight relabeling, Wan et al. additionally introduce foresight ... Huang et al. derive a general form of policy gradient from the DR value estimator [29], whereas a DR off-policy actor-critic ... Kallus et al. propose the doubly robust method to find a robust policy that can ... Depending on the knowledge to be transferred, these methods in RL can be roughly divided into classes including sampled transitions [32, 33], learned policies or value networks [34, 35, 36, 37], features [38, 39, 40], and skills [41, 42].

Doubly Robust Property for Direct Use of Doubly Robust Estimator: We show the doubly robust property of the DR estimator for the value function in Eq. (5) in the main text, as follows.
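The doubly robust property can be illustrated with the textbook one-step (bandit-style) DR off-policy estimator, which adds an importance-weighted residual to a model-based baseline. The sketch below is not the paper's Eq. (5), which is not reproduced in this listing; the two-action problem, the deterministic rewards, and every name in the code are hypothetical choices made for the example. It computes the exact expectation of the estimate under the behavior policy, so no sampling is involved.

```python
def dr_estimate(reward, rho, q_hat_sa, v_hat):
    """One-sample DR estimate: model baseline plus importance-weighted residual."""
    return v_hat + rho * (reward - q_hat_sa)


def expected_dr(true_r, q_hat, target, behavior, assumed_behavior):
    """Exact expectation of the DR estimate when actions come from `behavior`.

    `assumed_behavior` is the (possibly wrong) behavior policy used to form
    the importance weights; `q_hat` is the (possibly wrong) reward model.
    """
    v_hat = sum(target[a] * q_hat[a] for a in target)
    total = 0.0
    for a, p in behavior.items():
        rho = target[a] / assumed_behavior[a]
        total += p * dr_estimate(true_r[a], rho, q_hat[a], v_hat)
    return total


# Hypothetical two-action problem with deterministic rewards.
true_r = {"left": 1.0, "right": 3.0}
target = {"left": 0.25, "right": 0.75}
behavior = {"left": 0.5, "right": 0.5}
true_value = sum(target[a] * true_r[a] for a in target)

wrong_q = {"left": 0.0, "right": 0.0}
wrong_behavior = {"left": 0.8, "right": 0.2}

# Correct importance weights, wrong reward model.
est_weights_ok = expected_dr(true_r, wrong_q, target, behavior, behavior)
# Wrong importance weights, correct reward model.
est_model_ok = expected_dr(true_r, true_r, target, behavior, wrong_behavior)
# Both wrong: no guarantee remains.
est_both_wrong = expected_dr(true_r, wrong_q, target, behavior, wrong_behavior)
```

In this toy setting, `est_weights_ok` and `est_model_ok` both equal `true_value`, while `est_both_wrong` does not: the estimate stays unbiased as long as either the importance weights or the reward model is correct.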

dr ij, machine learning, reinforcement learning, (10 more...)

Neural Information Processing Systems

Industry: Education (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.30)

2a8009525763356ad5e3bb48b7475b4d-Paper.pdf

217a2a387f52c30755c37b0a73430291-Paper-Datasets_and_Benchmarks.pdf

1e0f65eb20acbfb27ee05ddc000b50ec-Paper.pdf

Supplementary material: Enhanced Meta Reinforcement Learning using Demonstrations in Sparse Reward Environments
