First-Explore, then Exploit: Meta-Learning to Solve Hard Exploration-Exploitation Trade-Offs

Ben Norman

Aug-17-2025, 02:58:47 GMT–Neural Information Processing Systems

The objective is to maximize the total reward accumulated over all episodes (e.g., the number of games won), expressed as

artificial intelligence, machine learning, reinforcement learning

Country:
- North America > Canada (0.46)

Genre:
- Research Report > Experimental Study (1.00)

Industry:
- Energy > Oil & Gas > Upstream (0.82)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (0.94)
  - Learning Graphical Models > Undirected Networks
    - Markov Models (0.46)
