End-to-end learning has recently emerged as a promising technique to tackle the problem of autonomous driving. Existing works show that learning a navigation policy from raw sensor data may reduce the system's reliance on external sensing systems, (e.g. GPS), and/or outperform traditional methods based on state estimation and planning. However, existing end-to-end methods generally trade off performance for safety, hindering their diffusion to real-life applications. For example, when confronted with an input which is radically different from the training data, end-to-end autonomous driving systems are likely to fail, compromising the safety of the vehicle. To detect such failure cases, this work proposes a general framework for uncertainty estimation which enables a policy trained end-to-end to predict not only action commands, but also a confidence about its own predictions. In contrast to previous works, our framework can be applied to any existing neural network and task, without the need to change the network's architecture or loss, or to train the network. In order to do so, we generate confidence levels by forward propagation of input and model uncertainties using Bayesian inference. We test our framework on the task of steering angle regression for an autonomous car, and compare our approach to existing methods with both qualitative and quantitative results on a real dataset. Finally, we show an interesting by-product of our framework: robustness against adversarial attacks.
This article follows my previous one on Bayesian probability & probabilistic programming that I published few months ago on LinkedIn. And for the purpose of this article, I am going to assume that most this article readers have some idea what a Neural Network or Artificial Neural Network is. Neural Network is a non-linear function approximator. We can think of it as a parameterized function where the parameters are the weights & biases of Neural Network through which we will be typically passing our data (inputs), that will be converted to a probability between 0 and 1, to some kind of non-linearity such as a sigmoid function and help make our predictions or estimations. These non-linear functions can be composed together hence Deep Learning Neural Network with multiple layers of this function compositions.
Our brains are wired in a way that they can differentiate between objects, both living and non-living by simply looking at them. In fact, the recognition of objects and a situation through visualization is the fastest way to gather, as well as to relate information. This becomes a pretty big deal for computers where a vast amount of data has to be stuffed into it, before the computer can perform an operation on its own. Ironically, with each passing day, it is becoming essential for machines to identify objects through facial recognition, so that humans can take the next big step towards a more scientifically advanced social mechanism. So, what progress have we really made in that respect?
DeepFakes are created by a deep learning technique known as Generative Adversarial Networks (GANs), where two machine learning models are used to make the counterfeits more believable. By studying the images and videos of a person, in the form of training data, the first model creates a video, while the second model attempts to detect its flaws. These two models work hand-in-hand until they create a video that is believable. DeepFake opens up a whole new world when it comes to unsupervised learning, which is a sub-field of machine learning where machines can learn to teach themselves, and it has been argued to hold great promise when it comes to self-driving vehicles' to detect and recognize obstacles on the road and virtual assistants such as Siri, Cortana and Alexa learning to be more conversational. The real question is, what potential does it have of being misused, like any other technology.
Anyone that might be concerned about computers taking over look away now, because they are a step closer to sounding just like humans. Researchers in the UK at Google's DeepMind unit have been working on making computer-generated speech sound as "natural" as humans. The technology, called WaveNet, which is focused on the area of speech synthesis, or text-to-speech, was found to sound more natural than any of Google's products. However, this was only achieved after the WaveNet artificial neural network was trained to produce English and Chinese speech which required copious amounts of computing power, so the technology probably won't be hitting the mainstream any time soon. Using a convolutional neural network, which is used for artificial intelligence in deep learning, it is trained on data and then the systems make inferences about new data, in addition to being used to generate new data.