Beyond dynamic programming

Jun-26-2023–arXiv.org Artificial Intelligence

In contrast with classical dynamic programming-based methods, our method can search over non-stationary policy functions, and can directly compute optimal infinite horizon action sequences from a given state. The central idea in our method is the construction of a mapping between infinite horizon action sequences and real numbers in a bounded interval. This construction enables us to formulate an optimization problem for directly computing optimal infinite horizon action sequences, without requiring a policy function. We demonstrate the effectiveness of our approach by applying it to nonlinear optimal control problems. Overall, our contributions provide a novel theoretical framework for formulating and solving reinforcement learning problems.

action sequence, score-life function, sequence, (17 more...)

arXiv.org Artificial Intelligence

Jun-26-2023

arXiv.org PDF

Add feedback

Country:
- North America > Canada > Ontario > Toronto (0.14)

Genre:
- Workflow (0.80)
- Research Report (0.50)

Industry:
- Education (0.34)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Optimization (0.92)
  - Machine Learning
    - Reinforcement Learning (0.90)
    - Statistical Learning > Regression (0.46)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found