Playing Space Invaders Blind: RL & Cross-Modality Transfer


In the 1975 film Tommy, the "deaf, dumb, and blind" protagonist overcomes substantial sensory limitations to win a pinball championship. While it's difficult to imagine playing a video game without being able to see the screen, that was the challenge taken up by AI researchers from INESC-ID and Instituto Superior Técnico in Lisbon and Carnegie Mellon University in Pittsburgh. Using cross-modality transfer techniques and reinforcement learning (RL), the researchers produced an agent that can play video games with only the game audio to guide it. In some respects, an RL policy that is learned over image and sound inputs and still succeeds when only sound is available mimics something that comes as second nature to humans: making the most of whatever sensory data is at hand. We use touch and hearing, for example, to navigate through a dark room. The new cross-modality transfer RL approach explores how latent representations built by advanced variational autoencoder (VAE) methods might enable RL agents to learn policies and transfer them across different input modalities.

