Near-Optimal Randomized Exploration for Tabular Markov Decision Processes

Feb-8-2026, 01:07:08 GMT–Neural Information Processing Systems

These algorithms inject (carefully tuned) random noise into the value function to encourage exploration. UCB-type algorithms enjoy well-established theoretical guarantees but are difficult to implement, since an upper confidence bound is usually infeasible to compute for many practical models such as neural networks. Instead, practitioners prefer randomized exploration, such as the noisy networks in [19], and algorithms with randomized exploration have been widely used in practice [37, 13, 11, 35].
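As a rough illustration of the noise-injection idea described above, the following Python sketch runs backward induction on a small tabular MDP and adds Gaussian noise to each Q-value before acting greedily. The function name `perturbed_value_iteration`, its arguments, and the noise scale `sigma / sqrt(n + 1)` are assumptions chosen for illustration. This is not the algorithm or the noise schedule analyzed in the paper.

```python
import math
import random


def perturbed_value_iteration(P, R, counts, horizon, sigma, rng):
    """Backward induction with Gaussian noise added to each Q-value.

    P[s][a] is a list of next-state probabilities, R[s][a] is a mean reward,
    and counts[s][a] is a visit count. The noise scale sigma / sqrt(n + 1)
    is an illustrative assumption, not the tuned schedule from the paper.
    """
    n_states = len(P)
    n_actions = len(P[0])
    V = [0.0] * n_states  # value after the final step is zero
    policy = []
    for _ in range(horizon):
        Q = [[0.0] * n_actions for _ in range(n_states)]
        for s in range(n_states):
            for a in range(n_actions):
                expected_next = sum(p * v for p, v in zip(P[s][a], V))
                # Less-visited pairs get larger noise, which drives exploration.
                noise = rng.gauss(0.0, sigma / math.sqrt(counts[s][a] + 1))
                Q[s][a] = R[s][a] + expected_next + noise
        # Act greedily with respect to the perturbed Q-values.
        greedy = [max(range(n_actions), key=lambda a, s=s: Q[s][a])
                  for s in range(n_states)]
        V = [Q[s][greedy[s]] for s in range(n_states)]
        policy.append(greedy)
    policy.reverse()  # policy[h][s] is the action taken in state s at step h
    return policy, V


# Hypothetical two-state MDP: action 0 stays put, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[0.0, 0.0],
     [1.0, 0.0]]  # only staying in state 1 is rewarded
counts = [[0, 0], [0, 0]]
```

With `sigma = 0` the routine reduces to ordinary finite-horizon value iteration. With `sigma > 0` the greedy policy varies from call to call, so no explicit upper confidence bound is needed to induce exploration.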

artificial intelligence, arXiv preprint, machine learning

Country:
- North America > United States
  - California (0.04)
- Europe
  - United Kingdom > England (0.04)
  - Romania > Sud-Est Development Region
    - Constanța County > Constanța (0.04)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
