structural credit assignment
Structural Credit Assignment in Neural Networks using Reinforcement Learning
Structural credit assignment in neural networks is a long-standing problem, with a variety of alternatives to backpropagation proposed to allow for local training of nodes. One of the early strategies was to treat each node as an agent and use a reinforcement learning method called REINFORCE to update each node locally with only a global reward signal. In this work, we revisit this approach and investigate whether we can leverage other reinforcement learning approaches to improve learning. We first formalize training a neural network as a finite-horizon reinforcement learning problem and discuss how this facilitates using ideas from reinforcement learning, such as off-policy learning. We show that the standard on-policy REINFORCE approach, even with a variety of variance reduction approaches, learns suboptimal solutions. We introduce an off-policy approach to facilitate reasoning about the greedy action for other agents and to help overcome stochasticity in other agents. We conclude by showing that these networks of agents can be more robust to correlated samples when learning online.
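To make the "each node as an agent" idea concrete, the following is a minimal sketch of one local REINFORCE update for a layer of stochastic binary units that all receive the same global reward. It is an illustration under stated assumptions, not the method from the paper: the Bernoulli-logistic unit model, the function name `reinforce_step`, the `reward_fn` callback, and the fixed `baseline` are all hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def reinforce_step(weights, x, reward_fn, rng, lr=0.1, baseline=0.0):
    """One REINFORCE update for a layer of Bernoulli-logistic units.

    weights   : list of per-unit weight lists (one row per unit)
    x         : input vector shared by all units
    reward_fn : hypothetical callable mapping the sampled binary
                outputs to a single scalar (the global reward)
    rng       : random.Random instance used to sample actions
    """
    probs, actions = [], []
    for w in weights:
        # Each unit (agent) fires with probability sigmoid(w . x).
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        a = 1.0 if rng.random() < p else 0.0
        probs.append(p)
        actions.append(a)

    # Every unit sees only this one scalar; no gradients are passed back.
    reward = reward_fn(actions)

    new_weights = []
    for w, p, a in zip(weights, probs, actions):
        # For a logistic unit, d log pi(a) / d w = (a - p) * x.
        scale = lr * (reward - baseline) * (a - p)
        new_weights.append([wi + scale * xi for wi, xi in zip(w, x)])
    return new_weights, actions, reward
```

Each unit's update uses only its own input, its own sampled action, and the shared reward, which is what makes the rule local. Subtracting a baseline from the reward is one common variance reduction device; here it is a constant assumed for simplicity.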
Elo Ratings for Structural Credit Assignment in Multiagent Systems
Yliniemi, Logan Michael (Oregon State University) | Tumer, Kagan (Oregon State University)
In this paper we investigate the applications of Elo ratings (originally designed for 2-player chess) to a heterogeneous nonlinear multiagent system to determine an agent's overall impact on its team's performance. Measuring this impact has been attempted in many different ways, including reward shaping; the generation of hierarchies, holarchies, and teams; mechanism design; and the creation of subgoals. We show that in a multiagent system, an Elo rating will accurately reflect an agent's ability to contribute positively to a team's success with no need for any feedback other than a repeated binary win/loss signal. The Elo rating not only measures "personal" success, but simultaneously success in assisting other agents to perform favorably.
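For readers unfamiliar with Elo ratings, the sketch below shows the standard two-player Elo expected-score and update formulas, extended to teams in one simple way. The team extension is an assumption made for illustration only (a team's rating is taken as the mean of its members, and every member receives the same adjustment); the abstract does not specify how the authors aggregate ratings, and the names `expected_score` and `update_team_elo` are hypothetical.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (between 0 and 1)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_team_elo(ratings, team_a, team_b, a_won, k=32.0):
    """Update per-agent ratings from one binary win/loss outcome.

    ratings : dict mapping agent id -> current rating
    team_a  : list of agent ids on team A
    team_b  : list of agent ids on team B
    a_won   : True if team A won, False if team B won
    k       : Elo K-factor (step size)

    Assumption: team strength is the mean member rating, and the
    resulting Elo adjustment is applied equally to every member.
    """
    mean_a = sum(ratings[i] for i in team_a) / len(team_a)
    mean_b = sum(ratings[i] for i in team_b) / len(team_b)
    delta = k * ((1.0 if a_won else 0.0) - expected_score(mean_a, mean_b))

    updated = dict(ratings)  # leave the caller's dict untouched
    for i in team_a:
        updated[i] += delta
    for i in team_b:
        updated[i] -= delta
    return updated
```

Because agents are reassigned to different teams across repeated matches, an agent that consistently appears on winning teams accumulates a higher rating even though each match yields only a binary signal.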