Understanding the limits of CNNs, one of AI's greatest achievements

#artificialintelligence

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI. After a prolonged winter, artificial intelligence is experiencing a scorching summer mainly thanks to advances in deep learning and artificial neural networks. To be more precise, the renewed interest in deep learning is largely due to the success of convolutional neural networks (CNNs), a neural network structure that is especially good at dealing with visual data. But what if I told you that CNNs are fundamentally flawed? That was what Geoffrey Hinton, one of the pioneers of deep learning, talked about in his keynote speech at the AAAI conference, one of the main yearly AI conferences. Hinton, who attended the conference with Yann LeCun and Yoshua Bengio, with whom he forms the Turing Award-winning "godfathers of deep learning" trio, spoke about the limits of CNNs as well as capsule networks, his plan for the next breakthrough in AI.




Capsule Neural Networks – Part 2: What is a Capsule?

#artificialintelligence

In classic CNNs, each neuron in the first layer represents a pixel and feeds this information forward to the next layers. Each subsequent convolutional layer groups a bunch of neurons together, so that a single neuron there can represent a whole patch of the image. The network can thus learn to represent a group of pixels that looks something like a snout, especially if the dataset contains many examples of snouts, and it will learn to increase the weight (importance) of that snout feature when deciding whether an image shows a dog. However, this method cares only about the existence of the feature near a given location; it is insensitive to the spatial relations between objects and to their orientation.
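The existence-but-not-position behavior described above can be seen in a minimal sketch (plain Python, no frameworks; the 2x2 "snout" kernel and the toy images are made up for illustration): a single convolutional filter followed by global max pooling produces the same score no matter where the pattern sits in the image.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation with 'valid' padding (no borders)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):
        row = []
        for j in range(iw - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

def global_max_pool(fmap):
    """Keep only the strongest response: existence, not location."""
    return max(max(row) for row in fmap)

kernel = [[1, 1],
          [1, 1]]  # toy "snout" detector

# The same 2x2 blob of ones, placed at two different locations.
img_a = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
img_b = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

score_a = global_max_pool(conv2d_valid(img_a, kernel))
score_b = global_max_pool(conv2d_valid(img_b, kernel))
print(score_a, score_b)  # identical scores: position information is discarded
```

Both images yield the same pooled score, which is exactly the invariance (and the blindness to spatial arrangement) that the article attributes to pooling in CNNs.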


Pay attention! - Robustifying a Deep Visuomotor Policy through Task-Focused Attention

arXiv.org Artificial Intelligence

Several recent projects demonstrated the promise of end-to-end learned deep visuomotor policies for robot manipulator control. Despite impressive progress, these systems are known to be vulnerable to physical disturbances, such as accidental or adversarial bumps that make them drop the manipulated object. They also tend to be distracted by visual disturbances such as objects moving in the robot's field of view, even if the disturbance does not physically prevent the execution of the task. In this paper we propose a technique for augmenting a deep visuomotor policy trained through demonstrations with task-focused attention. The manipulation task is specified in natural language, such as "move the red bowl to the left". This allows the attention component to concentrate on the current object that the robot needs to manipulate. We show that even in benign environments, the task-focused attention allows the policy to consistently outperform a variant with no attention mechanism. More importantly, the new policy is significantly more robust: it regularly recovers from severe physical disturbances (such as bumps causing it to drop the object) from which the unmodified policy almost never recovers. In addition, we show that the proposed policy performs correctly in the presence of a wide class of visual disturbances, exhibiting a behavior reminiscent of human selective attention experiments.
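The abstract's core idea, weighting visual features by their relevance to a language-specified task, can be sketched generically. This is not the paper's actual architecture; the two-dimensional region features and the task vector below are invented stand-ins for learned embeddings, and the attention here is a plain softmax over dot-product scores.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def attend(region_features, task_vector):
    """Downweight regions that are irrelevant to the task vector."""
    scores = [dot(f, task_vector) for f in region_features]
    weights = softmax(scores)
    return [[w * x for x in f] for w, f in zip(weights, region_features)]

# Two image regions: one aligned with the command (e.g. the red bowl),
# one a moving distractor in the field of view.
regions = [[2.0, 0.0],   # task-relevant object
           [0.0, 2.0]]   # distractor
task = [1.0, 0.0]        # hypothetical embedding of "move the red bowl to the left"

attended = attend(regions, task)
# The task-aligned region dominates the attended representation,
# so a downstream policy is less distracted by the moving object.
```

The point of the sketch is only the mechanism: attention driven by the task description suppresses distractor features before they reach the policy, which is how the paper motivates robustness to visual disturbances.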