Generative Intrinsic Optimization: Intrinsic Control with Model Learning

Nov-14-2023–arXiv.org Artificial Intelligence

A future sequence represents the outcome of executing an action in the environment, i.e., the trajectory from that point onward. An agent driven by the information-theoretic concept of mutual information seeks actions with maximally informative consequences. The explicit form of the outcome may be a state, a return, or a trajectory, each serving a different purpose such as credit assignment or imitation learning. However, how intrinsic motivation should be combined with reward maximization is often neglected. In this work, we propose a policy iteration scheme that seamlessly incorporates the mutual information while ensuring convergence to the optimal policy. Concurrently, we introduce a variational approach that jointly learns the dynamics model and the quantity needed to estimate the mutual information, providing a general framework for incorporating different forms of outcomes of interest. While we mainly focus on theoretical analysis, our approach opens up the possibility of leveraging intrinsic control with model learning to enhance sample efficiency and to incorporate environment uncertainty into decision-making.
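
To make the central quantity concrete, here is a minimal sketch of mutual information between an action and an outcome for a small discrete joint distribution. It is only an illustration of the standard definition, not the variational estimator or the policy iteration scheme proposed in the paper; the function name `mutual_information` and the two toy tables are assumptions introduced for this example.

```python
import math


def mutual_information(joint):
    """Return I(A; Z) in nats for a joint probability table joint[a][z].

    Hypothetical helper for illustration: rows index actions, columns index
    outcomes, and the entries are assumed to sum to 1.
    """
    p_a = [sum(row) for row in joint]        # marginal over actions
    p_z = [sum(col) for col in zip(*joint)]  # marginal over outcomes
    mi = 0.0
    for a, row in enumerate(joint):
        for z, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (p_a[a] * p_z[z]))
    return mi


# Toy case 1 (assumed): each action fully determines the outcome.
deterministic = [[0.5, 0.0],
                 [0.0, 0.5]]

# Toy case 2 (assumed): the outcome is independent of the action.
independent = [[0.25, 0.25],
               [0.25, 0.25]]

print(mutual_information(deterministic))  # equals log(2)
print(mutual_information(independent))    # equals 0
```

In the first table the action is maximally informative about the outcome, so the mutual information reaches log 2 nats; in the second the action carries no information about the outcome and the value is zero.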



Country:
- Oceania > Australia
  - New South Wales > Sydney (0.04)
- North America
  - United States
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - California
      - Los Angeles County > Long Beach (0.14)
      - Santa Clara County > Mountain View (0.04)
      - Alameda County > Berkeley (0.04)
  - Puerto Rico > San Juan
    - San Juan (0.04)
  - Canada
    - Quebec > Montreal (0.04)
    - British Columbia > Metro Vancouver Regional District
      - Vancouver (0.04)
    - Alberta > Census Division No. 15
      - Improvement District No. 9 > Banff (0.04)
- Europe
  - France (0.04)
  - Sweden > Stockholm
    - Stockholm (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
- Asia > Vietnam
  - Hanoi > Hanoi (0.04)

Genre:
- Research Report (0.50)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Robots (0.88)
  - Machine Learning
    - Reinforcement Learning (0.48)
    - Neural Networks > Deep Learning (0.46)
