AITopics | kl-divergence

Guided Policy Search via Approximate Mirror Descent

Neural Information Processing Systems | May-1-2026, 06:07:20 GMT

Guided policy search algorithms can be used to optimize complex nonlinear policies, such as deep neural networks, without directly computing policy gradients in the high-dimensional parameter space. Instead, these methods use supervised learning to train the policy to mimic a "teacher" algorithm, such as a trajectory optimizer or a trajectory-centric reinforcement learning method. Guided policy search methods provide asymptotic local convergence guarantees by construction, but it is not clear how much the policy improves within a small, finite number of iterations. We show that guided policy search algorithms can be interpreted as an approximate variant of mirror descent, where the projection onto the constraint manifold is not exact. We derive a new guided policy search algorithm that is simpler and provides appealing improvement and convergence guarantees in simplified convex and linear settings, and show that in the more general nonlinear setting, the error in the projection step can be bounded. We provide empirical results on several simulated robotic navigation and manipulation tasks that show that our method is stable and achieves similar or better performance when compared to prior guided policy search methods, with a simpler formulation and fewer hyperparameters.
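To make the mirror descent view concrete, here is a minimal sketch of a KL-regularized (entropic) mirror descent update on a discrete probability vector. It is not the paper's guided policy search algorithm. It assumes a fixed linear cost over a handful of actions, and the names `kl`, `mirror_descent_step`, and `minimize_linear_cost` are hypothetical, chosen only for illustration.

```python
import math

def kl(p, q):
    """KL divergence KL(p || q) for discrete distributions with q > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mirror_descent_step(p, grad, step_size):
    """One entropic mirror descent step on the probability simplex.

    Closed-form minimizer of <grad, q> + (1 / step_size) * KL(q || p)
    over distributions q: a multiplicative update followed by normalization.
    """
    weights = [pi * math.exp(-step_size * gi) for pi, gi in zip(p, grad)]
    total = sum(weights)
    return [w / total for w in weights]

def minimize_linear_cost(costs, steps=50, step_size=0.5):
    """Minimize the expected cost sum_i p_i * costs[i] (illustrative assumption)."""
    p = [1.0 / len(costs)] * len(costs)  # start from the uniform distribution
    for _ in range(steps):
        # For a linear objective the gradient with respect to p is the cost vector.
        p = mirror_descent_step(p, costs, step_size)
    return p

p_final = minimize_linear_cost([1.0, 0.2, 0.7])
```

In this toy setting the KL-regularized step has a closed form, so the update is exact and the distribution concentrates on the lowest-cost entry. The abstract's point is that with a nonlinear policy class the corresponding projection can only be carried out approximately.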

local policy, machine learning, reinforcement learning, (17 more...)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

952285b9b7e7a1be5aa7849f32ffff05-Supplemental.pdf

Neural Information Processing Systems | Apr-26-2026, 16:42:33 GMT

artificial intelligence, linear, machine learning, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

024d2d699e6c1a82c9ba986386f4d824-Supplemental.pdf

Neural Information Processing Systems | Apr-24-2026, 10:31:28 GMT

artificial intelligence, data mining, machine learning, (15 more...)

Country: Europe (0.28)

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)

024d2d699e6c1a82c9ba986386f4d824-Paper.pdf

Neural Information Processing Systems | Apr-24-2026, 10:31:24 GMT

artificial intelligence, data mining, machine learning, (15 more...)

Country: Europe (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
(2 more...)

NeurIPS2022_camera

Neural Information Processing Systems | Apr-24-2026, 07:53:43 GMT

artificial intelligence, gofar, machine learning, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

On Learning Markov Chains

Yi Hao, Alon Orlitsky, Venkatadheeraj Pichapati

Neural Information Processing Systems | Feb-14-2026, 17:45:55 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, machine learning, markov chain, (16 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > California > San Diego County > La Jolla (0.05)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

b91f4f4d36fa98a94ac5584af95594a0-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-13-2026, 19:17:30 GMT

We mitigate the usual worst-case nature of minimax analysis by showing that our bounds are tight for any given hypothesis class and in any noise regime (Theorems 1 and 2).

artificial intelligence, discrepancy, theorem statement, (3 more...)

Technology: Information Technology > Artificial Intelligence (0.53)

Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness

Neural Information Processing Systems | Feb-12-2026, 17:13:12 GMT

Ensemble approaches for uncertainty estimation have recently been applied to the tasks of misclassification detection, out-of-distribution input detection and adversarial attack detection. Prior Networks have been proposed as an approach to efficiently emulate an ensemble of models for classification by parameterising a Dirichlet prior distribution over output distributions.
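Since the entry concerns KL divergence between Dirichlet distributions, a small sketch of the closed-form Dirichlet KL divergence may help show why the direction of the divergence matters. This is not the paper's training loss. The helper names `digamma` and `dirichlet_kl` and the concentration vectors `sharp` and `flat` are assumptions made for illustration, and the digamma function is approximated with a standard recurrence plus asymptotic series because the Python standard library does not provide one.

```python
import math

def digamma(x):
    """Approximate digamma(x) for x > 0 via recurrence and an asymptotic series."""
    result = 0.0
    while x < 6.0:
        # Shift upward using digamma(x) = digamma(x + 1) - 1 / x.
        result -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    # ln(x) - 1/(2x) - 1/(12 x^2) + 1/(120 x^4) - 1/(252 x^6)
    return result + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))

def dirichlet_kl(alpha, beta):
    """KL( Dir(alpha) || Dir(beta) ) for concentration vectors of equal length."""
    a0, b0 = sum(alpha), sum(beta)
    # Difference of the log normalizing constants.
    log_norm = (math.lgamma(a0) - sum(math.lgamma(a) for a in alpha)
                - math.lgamma(b0) + sum(math.lgamma(b) for b in beta))
    # Expected log-ratio term under Dir(alpha).
    psi_a0 = digamma(a0)
    expect = sum((a - b) * (digamma(a) - psi_a0) for a, b in zip(alpha, beta))
    return log_norm + expect

sharp = [10.0, 1.0, 1.0]  # concentrated on the first class (hypothetical)
flat = [1.0, 1.0, 1.0]    # uniform over the simplex (hypothetical)
kl_sharp_to_flat = dirichlet_kl(sharp, flat)
kl_flat_to_sharp = dirichlet_kl(flat, sharp)
```

The two values differ because KL divergence is asymmetric, so choosing which distribution appears in the first argument changes the objective being minimized.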

adversarial attack, artificial intelligence, machine learning, (20 more...)

Country: