Causal confounds in sequential decision making

Dec-6-2022, 13:30:05 GMT–AIHub

A standard assumption in sequential decision making is that we observe everything required to make good decisions. We discuss two specific cases where this assumption fails, temporally correlated noise and unobserved contexts, which have stymied the use of Imitation Learning (IL) and Reinforcement Learning (RL) algorithms in autonomous helicopters and self-driving, respectively. We derive provably correct algorithms for both problems that scale to continuous control.

RL and IL methods have achieved impressive results in recent years, such as beating the world champion at Go or controlling stratospheric balloons. Usually, these results are on problems where we either a) observe the full state or b) are able to faithfully execute our intended actions on the system.



