Goto

Collaborating Authors

 otheragent


df308fd90635b28d82558cf580c73ed9-AuthorFeedback.pdf

Neural Information Processing Systems

This is a static policy which always uses the same attribute until a11 negativereward isencountered (L167), atwhich point anewattribute issampled.