





To Reviewer 1, Q1: On the connection between the policy and the Hindsight Inverse Dynamics (HID). Instead of mapping (s …

Neural Information Processing Systems

We thank all reviewers for their insightful comments. Please see the responses below.

Q2: Why is it important to relabel data to learn HID? Multi-step HIDs help such extrapolations in non-trivial cases, and Fig. 1(b) below shows similar results in …

For most goal-oriented tasks, the learning objective is to find a policy that reaches the goal as soon as possible.


Policy Continuation with Hindsight Inverse Dynamics

Hao Sun, Zhizhong Li, Xiaotong Liu, Dahua Lin, Bolei Zhou

arXiv.org Machine Learning

Solving goal-oriented tasks is an important but challenging problem in reinforcement learning (RL). For such tasks, the rewards are often sparse, which makes it difficult to learn a policy effectively. To tackle this difficulty, we propose a new approach called Policy Continuation with Hindsight Inverse Dynamics (PCHID). PCHID learns Hindsight Inverse Dynamics from relabeled experience in the spirit of Hindsight Experience Replay, turning the learning process into self-imitation so that the policy can be trained with supervised learning. The work further extends the approach to multi-step settings via Policy Continuation. The proposed method is general: it can work in isolation or be combined with other on-policy and off-policy algorithms. On two multi-goal tasks, GridWorld and FetchReach, PCHID significantly improves both sample efficiency and final performance.
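The core relabeling idea behind HID can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's implementation: it assumes goals live in the same space as states, and the function and variable names (`relabel_hid`, `hindsight_goal`) are hypothetical. Each observed transition is turned into a supervised (input, target) pair by treating the state actually reached as the goal the agent was "trying" to achieve, so a goal-conditioned policy can be fit by ordinary supervised learning.

```python
import numpy as np

def relabel_hid(trajectory):
    """Hypothetical one-step HID relabeling sketch.

    trajectory: list of (state, action, next_state) tuples.
    Returns (input, target) pairs where the input is the state
    concatenated with the hindsight goal (the achieved next state)
    and the target is the action that got there.
    """
    dataset = []
    for state, action, next_state in trajectory:
        hindsight_goal = next_state  # achieved state relabeled as the goal
        x = np.concatenate([state, hindsight_goal])
        dataset.append((x, action))  # supervised pair for pi(a | s, g)
    return dataset

# Toy usage: 2-D states, discrete actions.
traj = [
    (np.array([0.0, 0.0]), 1, np.array([1.0, 0.0])),
    (np.array([1.0, 0.0]), 0, np.array([1.0, 1.0])),
]
pairs = relabel_hid(traj)
```

Under this sketch, every transition becomes a "successful" demonstration for its relabeled goal, which is why the training signal is dense even when the environment reward is sparse; a multi-step variant would instead pair `state` with a state several steps ahead.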