Accelerating Model-Based Reinforcement Learning using Non-Linear Trajectory Optimization

Calì, Marco, Giacomuzzo, Giulio, Carli, Ruggero, Libera, Alberto Dalla

Jun-4-2025–arXiv.org Artificial Intelligence

This paper addresses the slow policy optimization convergence of Monte Carlo Probabilistic Inference for Learning Control (MC-PILCO), a state-of-the-art model-based reinforcement learning (MBRL) algorithm, by integrating it with iterative Linear Quadratic Regulator (iLQR), a fast trajectory optimization method suitable for nonlinear systems. The proposed method, Exploration-Boosted MC-PILCO (EB-MC-PILCO), leverages iLQR to generate informative, exploratory trajectories and initialize the policy, significantly reducing the number of required optimization steps. Experiments on the cart-pole task demonstrate that EB-MC-PILCO accelerates convergence compared to standard MC-PILCO, achieving up to $\bm{45.9\%}$ reduction in execution time when both methods solve the task in four trials. EB-MC-PILCO also maintains a $\bm{100\%}$ success rate across trials while solving the task faster, even in cases where MC-PILCO converges in fewer iterations.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

Jun-4-2025

arXiv.org PDF

Add feedback

Country:
- Europe (0.28)

Genre:
- Research Report (0.84)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Reinforcement Learning (0.61)
  - Representation & Reasoning > Optimization (0.48)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found