Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling

May-30-2025, 18:01:22 GMT – Neural Information Processing Systems

Recent work has shown the remarkable superiority of transformer models in reinforcement learning (RL), where the decision-making problem is formulated as sequential generation. Given task contexts such as multiple trajectories, transformer-based agents can exhibit self-improvement in online environments, a capability called in-context RL. However, because the computational complexity of attention in transformers is quadratic in sequence length, current in-context RL methods incur huge computational costs as the task horizon increases. In contrast, the Mamba model is known for processing long-term dependencies efficiently, which gives in-context RL an opportunity to solve tasks that require long-term memory. To this end, we first implement Decision Mamba (DM) by replacing the backbone of Decision Transformer (DT).
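As a rough illustration of what "formulated as sequential generation" can mean in practice, the sketch below flattens one trajectory into the interleaved (return-to-go, state, action) token order commonly associated with Decision Transformer. It assumes undiscounted returns-to-go, and the function names `returns_to_go` and `to_token_sequence` are hypothetical, chosen for this example rather than taken from the paper's code.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Undiscounted sum of rewards from each timestep to the end of the episode."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the future total.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def to_token_sequence(
    states: Sequence[Any],
    actions: Sequence[Any],
    rewards: Sequence[float],
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, state, action) per timestep into one flat sequence."""
    tokens: List[Tuple[str, Any]] = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("return_to_go", g), ("state", s), ("action", a)])
    return tokens


if __name__ == "__main__":
    # Toy 3-step episode with placeholder states and actions.
    for token in to_token_sequence(["s0", "s1", "s2"], ["a0", "a1", "a2"], [1.0, 0.0, 2.0]):
        print(token)
```

A sequence model trained on such tokens predicts each action from the tokens before it. The sequence grows by three tokens per timestep, and by three tokens per timestep of every additional context trajectory, which is why the quadratic cost of attention becomes a concern at long horizons.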

machine learning, natural language, reinforcement learning

Country:
- Asia (0.28)
- North America > United States
  - Pennsylvania (0.14)

Genre:
- Research Report > Experimental Study (0.93)

Industry:
- Education (0.46)
- Information Technology (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.46)
    - Reinforcement Learning (1.00)
  - Natural Language (1.00)