World Models

Apr-18-2018, 09:06:40 GMT–#artificialintelligence

This weakness could be the reason that many previous works that learn dynamics models of RL environments but don't actually use those models to fully replace the actual environments . Like in the M model proposed in, the dynamics model is a deterministic differentiable model, making the model easily exploitable by the agent if it is not perfect. Using Bayesian models, as in PILCO, helps to address this issue with the uncertainty estimates to some extent, however, they do not fully solve the problem. Recent work combines the model-based approach with traditional model-free RL training by first initializing the policy network with the learned policy, but must subsequently rely on a model-free method to fine-tune this policy in the actual environment.In Learning to Think, it is acceptable that the RNN M isn't always a reliable predictor. A (potentially evolution-based) RNN C can in principle learn to ignore a flawed M, or exploit certain useful parts of M for arbitrary computational purposes including hierarchical planning etc.

actual environment, artificial intelligence, world model, (2 more...)

#artificialintelligence

Apr-18-2018, 09:06:40 GMT

News Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning (1.00)
  - Cognitive Science > Problem Solving (0.40)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found