Policy Gradient Coagent Networks
Neural Information Processing Systems
We introduce a novel class of actor-critic algorithms for actors composed of sets of interacting modules. We present, analyze theoretically, and evaluate empirically an update rule for each module that requires only local information: the module's input, its output, and the TD error broadcast by a critic. Such updates are necessary when computing compatible features becomes prohibitively difficult, and they are also desirable because they increase the biological plausibility of reinforcement learning methods.
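The local update described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm as published: the Bernoulli-logistic modules, the two-module chain, the one-step task, and every name (`BernoulliCoagent`, `train`, `actor_step`, `critic_step`) are assumptions made here for illustration. Each module adjusts its own parameters using only its input, its output, and a scalar TD error shared by all modules.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class BernoulliCoagent:
    """Hypothetical stochastic module that emits 0 or 1 from a logistic policy."""

    def __init__(self, n_inputs, step_size):
        self.theta = [0.0] * n_inputs
        self.step_size = step_size

    def prob_one(self, x):
        return sigmoid(sum(t * xi for t, xi in zip(self.theta, x)))

    def act(self, x, rng):
        return 1 if rng.random() < self.prob_one(x) else 0

    def update(self, x, out, td_error):
        # Local rule: move along grad log pi(out | x), scaled by the broadcast
        # TD error. For a logistic Bernoulli unit that gradient is (out - p) * x.
        p = self.prob_one(x)
        for i, xi in enumerate(x):
            self.theta[i] += self.step_size * td_error * (out - p) * xi


def train(episodes=2000, seed=0, actor_step=0.1, critic_step=0.1):
    """Assumed toy task: one state, one step, reward 1 when the final action is 1."""
    rng = random.Random(seed)
    first = BernoulliCoagent(n_inputs=1, step_size=actor_step)   # sees a bias input
    second = BernoulliCoagent(n_inputs=2, step_size=actor_step)  # sees bias + first's output
    value = 0.0  # critic's estimate for the single state
    for _ in range(episodes):
        x1 = [1.0]
        h = first.act(x1, rng)
        x2 = [1.0, float(h)]
        a = second.act(x2, rng)
        reward = 1.0 if a == 1 else 0.0
        td_error = reward - value  # the episode ends, so there is no bootstrap term
        # The same scalar is broadcast to every module; no module inspects another's parameters.
        first.update(x1, h, td_error)
        second.update(x2, a, td_error)
        value += critic_step * td_error
    return first, second, value


if __name__ == "__main__":
    first, second, value = train()
    print("P(action = 1 | h = 0):", second.prob_one([1.0, 0.0]))
    print("critic value estimate:", value)
```

In this sketch the second module's output is the action, while the first module's output acts only as an internal signal. Both modules apply the same rule, which is the sense in which the update is local.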
Dec-31-2011
- Country:
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- Technology: