New IEEE Research Equips Gradient Descent with Angular Information to Boost DNN Training
Deep Neural Networks (DNNs) have achieved outstanding results across a wide range of tasks in computer vision and natural language processing. These achievements, however, come at a high cost, as solving increasingly complex tasks requires increasingly deep neural network architectures. Deeper architectures not only increase the computational burden, they can also suffer from the vanishing gradient problem. Recent efforts to tackle this problem in DNN training have leveraged advanced optimizers such as adaptive moment estimation (Adam), but these optimizers exploit only the magnitude of the gradient and ignore its angular information. To overcome this limitation, a team from the IEEE (Institute of Electrical and Electronics Engineers) has proposed AngularGrad, a novel optimization algorithm that takes both gradient direction and angular information into consideration.
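To make "angular information" concrete, here is a minimal sketch of one quantity a magnitude-only optimizer never looks at: the angle between the gradients of two consecutive steps. The function name `gradient_angle` and the toy vectors are hypothetical and chosen only for illustration. This is not the AngularGrad update rule, which the summary above does not spell out.

```python
import math

def gradient_angle(g_prev, g_curr, eps=1e-12):
    """Angle in radians between two gradient vectors given as lists of floats.

    Hypothetical helper for illustration; not taken from the AngularGrad paper.
    """
    dot = sum(a * b for a, b in zip(g_prev, g_curr))
    norm_prev = math.sqrt(sum(a * a for a in g_prev))
    norm_curr = math.sqrt(sum(b * b for b in g_curr))
    # eps avoids division by zero when either gradient is all zeros.
    cos_theta = dot / max(norm_prev * norm_curr, eps)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

# Two gradients with identical magnitude but perpendicular directions:
# their norms cannot tell them apart, while the angle between them can.
g_prev = [1.0, 0.0]
g_curr = [0.0, 1.0]
angle = gradient_angle(g_prev, g_curr)
```

A small angle suggests successive steps agree on direction, while a large one suggests the trajectory is turning or oscillating. How an optimizer should act on that signal is a design choice this sketch leaves open.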
May-30-2021, 21:00:12 GMT