Difference of Convex Functions Programming for Reinforcement Learning

Bilal Piot, Matthieu Geist, Olivier Pietquin

Feb-9-2025, 03:00:55 GMT–Neural Information Processing Systems

Large Markov Decision Processes are usually solved using Approximate Dynamic Programming methods such as Approximate Value Iteration or Approximate Policy Iteration. The main contribution of this paper is to show that, alternatively, the optimal state-action value function can be estimated using Difference of Convex functions (DC) Programming.
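To make the idea of DC programming concrete, here is a minimal sketch of the generic DC Algorithm (DCA) on a toy one-dimensional problem. It is not the paper's estimator of the optimal state-action value function. The objective f(x) = x^4 - x^2, its split into convex parts g and h, and all function names are assumptions chosen for illustration. Each step replaces the subtracted convex part h by its linearization at the current point and minimizes the resulting convex surrogate, which here has a closed form.

```python
import math

# Toy DC decomposition (illustrative assumption): f(x) = g(x) - h(x),
# with g and h both convex.
def g(x):
    return x ** 4

def h(x):
    return x ** 2

def h_grad(x):
    return 2.0 * x

def f(x):
    return g(x) - h(x)

def dca_step(x_k):
    # Convex surrogate: g(x) - h'(x_k) * x.
    # Setting its derivative to zero gives 4 * x**3 = h'(x_k).
    s = h_grad(x_k)
    return math.copysign(abs(s / 4.0) ** (1.0 / 3.0), s)

def dca(x0, n_iters=50):
    # Iterate the DCA step and record the objective value at each iterate.
    x = x0
    values = [f(x)]
    for _ in range(n_iters):
        x = dca_step(x)
        values.append(f(x))
    return x, values

x_star, values = dca(1.0)
print(x_star, values[-1])
```

Starting from x0 = 1.0, the iterates approach the local minimizer 1/sqrt(2) of this toy objective, and the recorded objective values do not increase from one step to the next, which is the usual descent property of DCA. As with any DC method, the point reached depends on the initialization: x0 = 0 is a stationary point of this toy problem and the iteration stays there.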

artificial intelligence, machine learning, reinforcement learning
