Goto

Collaborating Authors

 Reinforcement Learning


1a669e81c8093745261889539694be7f-Paper.pdf

Neural Information Processing Systems

Inthesecondexample, vacuuming the floors of a house has certain risks, but the consequences of optimizing the wrong rewardfunction arearguably muchlesssignificant.