Sample-efficient AI

Dec-9-2021, 04:40:16 GMT–#artificialintelligence

Since AlphaGo, AI researchers have recognized the promise of integrating reinforcement learning with search methods, which consider many of the potential next actions available to an RL agent and simulate what their results might be before choosing one. This mimics human deliberation much more closely by explicitly introducing elements of "planning" into the RL paradigm. Yang attributes the huge performance improvements of AlphaGo, AlphaZero and MuZero to this search process. Another important distinction in RL is between model-based systems, which construct explicit models of their environments, and model-free systems, which don't. Prior to AlphaGo, just about all leading RL work was done on model-free systems (PPO and deep Q-learning, for example). Model-based systems just weren't practical, because learning an environment model is hard and adds a significant layer of complexity on top of the simpler action-selection task that model-free systems could focus on exclusively.
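To make the idea of "simulating results before choosing an action" concrete, here is a minimal Python sketch of planning with an environment model. Everything in it is an assumption made for illustration: the number-line task, the names `toy_model`, `simulated_return` and `plan`, and the discount value are all hypothetical. The model is hand-written rather than learned, and the search is a plain exhaustive lookahead, not the Monte Carlo tree search used by AlphaGo, AlphaZero or MuZero.

```python
from typing import Callable, Sequence, Tuple

State = int
Action = int
# A model predicts (next_state, reward) for a state-action pair.
Model = Callable[[State, Action], Tuple[State, float]]

GOAL = 3  # hypothetical target position on a number line


def toy_model(state: State, action: Action) -> Tuple[State, float]:
    """Hand-written stand-in for a learned environment model."""
    next_state = state + action
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def simulated_return(model: Model, state: State, actions: Sequence[Action],
                     depth: int, discount: float = 0.9) -> float:
    """Best discounted return reachable within `depth` simulated steps."""
    if depth == 0:
        return 0.0
    best = float("-inf")
    for action in actions:
        next_state, reward = model(state, action)
        future = simulated_return(model, next_state, actions, depth - 1, discount)
        best = max(best, reward + discount * future)
    return best


def plan(model: Model, state: State, actions: Sequence[Action],
         depth: int, discount: float = 0.9) -> Action:
    """Pick the action whose simulated outcome looks best (ties go to the first)."""
    def score(action: Action) -> float:
        next_state, reward = model(state, action)
        future = simulated_return(model, next_state, actions, depth - 1, discount)
        return reward + discount * future
    return max(actions, key=score)


if __name__ == "__main__":
    # Starting at 0, a 3-step lookahead finds that stepping right reaches the goal.
    print(plan(toy_model, state=0, actions=(-1, 1), depth=3))
```

A model-free agent would skip `toy_model` entirely and map states directly to actions or action values. The sketch also hints at why model-based methods were long considered impractical: the planner is only as good as the model it simulates with, and here that model was simply given rather than learned.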



