Deqian Kong

May-25-2025, 19:27:12 GMT–Neural Information Processing Systems

In tasks aiming for long-term returns, planning becomes essential. We study generative modeling for planning with datasets repurposed from offline reinforcement learning. Specifically, we identify temporal consistency in the absence of step-wise rewards as one key technical challenge. We introduce the Latent Plan Transformer (LPT), a novel model that leverages a latent variable to connect a Transformer-based trajectory generator and the final return. LPT can be learned with maximum likelihood estimation on trajectory-return pairs.
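
One way to read the model described above, offered as a sketch rather than the paper's exact formulation: write \tau for a trajectory, y for its final return, and z for the latent variable. The symbols, the prior p(z), and the assumption that \tau and y are conditionally independent given z are illustrative choices not stated in this abstract. Under those assumptions, the joint likelihood and the maximum likelihood objective over N trajectory-return pairs would look like:

```latex
% Assumed factorization: z is the only link between trajectory and return.
p_\theta(\tau, y) = \int p(z)\, p_\theta(\tau \mid z)\, p_\theta(y \mid z)\, dz

% Maximum likelihood estimation on trajectory-return pairs (\tau_i, y_i).
\hat{\theta} = \arg\max_\theta \sum_{i=1}^{N} \log p_\theta(\tau_i, y_i)
```

In this reading, no step-wise reward appears anywhere in the objective: the return enters only through p_\theta(y \mid z), so the latent variable is what ties a whole trajectory to its outcome.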

machine learning, reinforcement learning, trajectory

Country:
- Asia > China (0.14)

Genre:
- Research Report > Experimental Study (0.93)

Industry:
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Learning Graphical Models > Directed Networks
      - Bayesian Learning (0.68)
    - Neural Networks (1.00)
    - Reinforcement Learning (1.00)
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.68)