Implementing TD3 to train a Neural Network to fly a Quadcopter through an FPV Gate

Thomas, Patrick; Schroeder, Kevin; Black, Jonathan

arXiv.org Artificial Intelligence 

Over the past few years, Reinforcement Learning has been shown to be capable of training Deep Neural Networks to perform complex tasks. This paper investigates the use of a Deep Reinforcement Learning algorithm, Twin Delayed Deep Deterministic Policy Gradient (TD3), to learn a policy that flies a quadcopter through a First Person View (FPV) drone racing gate. BattleDrones is an autonomous drone racing competition held by Virginia Tech. As part of the competition, teams must design a controller that navigates a quadcopter through a course consisting of multiple gates. The quadcopter is outfitted with a camera that is used to identify an AprilTag [1], a fiducial marker, on the gates.