 Deep Learning


Tesla hires deep learning expert Andrej Karpathy to lead Autopilot vision

@machinelearnbot

Tesla has hired deep learning and computer vision expert Andrej Karpathy in a key Autopilot role. Karpathy most recently held a role as a researcher at OpenAI, the artificial intelligence nonprofit backed by Elon Musk. He has an extensive background in AI-related fields, having completed a PhD at Stanford University in computer vision. Karpathy also created one of the original, and most respected, deep learning courses taught at Stanford, and his dissertation work focused on creating a system by which a neural network could identify multiple discrete and specific items within an image, label them using natural language and report to a user. The dissertation also included developing a system that works in reverse, allowing for a model that can use descriptions from a user articulated in natural language (i.e.


Tesla reshuffles its Autopilot self-driving team

Engadget

Earlier this year Tesla announced engineer Chris Lattner would leave Apple and lead its Autopilot engineering team, but just five months later he is departing. Lattner, the designer of Apple's Swift programming language, tweeted "Turns out that Tesla isn't a good fit for me after all," while Tesla announced it has hired Andrej Karpathy, "one of the world's leading experts in computer vision and deep learning." He will become the company's Director of AI and Autopilot Vision, reporting directly to CEO Elon Musk, whom he may know well from his previous job as a research scientist at the Musk-backed OpenAI. Andrej Karpathy, one of the world's leading experts in computer vision and deep learning, is joining Tesla as Director of AI and Autopilot Vision, reporting directly to Elon Musk. Andrej has worked to give computers vision through his work on ImageNet, as well as imagination through the development of generative models, and the ability to navigate the internet with reinforcement learning.


[N] Andrej Karpathy leaves OpenAI for Tesla ('Director of AI and Autopilot Vision') • r/MachineLearning

@machinelearnbot

That said, I would have thought he was a bit young for a "Director" role. Most other big tech companies have directors of his professors' generation. Not doubting his skill, ability to communicate, or his passion, it just seems a pretty surprising move from a large company. Has Andrej ever managed a team before (beyond running a course or supervising some students)? And does he have any serious SDC experience?


How chatbots are changing customer service dynamics in banking

#artificialintelligence

Ever wondered about the volume of calls to banks? It is probably inevitable that banks would turn to chatbots sooner or later. The most expensive forms of interaction with customers are the personal ones: face-to-face, and by telephone. This explains why banks have encouraged customers to adopt internet banking. But it also explains why banks are looking so hard at chatbots: reduce the price of telephone interaction by using bots, and you have made some serious efficiencies. But of course to succeed, the chatbots have to be effective.


Deep Learning at the Speed of Light on Nanophotonic Chips

#artificialintelligence

Deep learning has transformed the field of artificial intelligence, but the limitations of conventional computer hardware are already hindering progress. Researchers at MIT think their new "nanophotonic" processor could be the answer by carrying out deep learning at the speed of light. In the 1980s, scientists and engineers hailed optical computing as the next great revolution in information technology, but it turned out that bulky components like fiber optic cables and lenses didn't make for particularly robust or compact computers. In particular, they found it extremely challenging to make scalable optical logic gates, and therefore impractical to make general optical computers, according to MIT physics post-doc Yichen Shen. One thing light is good at, though, is multiplying matrices--arrays of numbers arranged in columns and rows.
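To make the matrix remark concrete, here is a minimal Python sketch of why matrix multiplication matters for deep learning: one fully connected layer is essentially a weight matrix multiplied by an input, followed by a nonlinearity. The function names `matmul` and `dense_layer` and the choice of ReLU are illustrative assumptions, not details taken from the MIT work.

```python
def matmul(a, b):
    """Multiply matrix a (rows x inner) by matrix b (inner x cols)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]


def dense_layer(weights, inputs):
    """One neural-network layer: a matrix product followed by ReLU.

    The matrix product is the step the article says light handles well;
    ReLU is only an illustrative choice of nonlinearity (an assumption).
    """
    pre_activation = matmul(weights, inputs)
    return [[max(0.0, value) for value in row] for row in pre_activation]


# Hypothetical 2x2 weights applied to a 2x1 input column.
weights = [[1, -1], [2, 0]]
inputs = [[3], [5]]
outputs = dense_layer(weights, inputs)
```

Almost all of the arithmetic in this sketch happens inside `matmul`, which is why hardware that accelerates matrix products is attractive for neural networks.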


Learning Localized Geometric Features Using 3D-CNN: An Application to Manufacturability Analysis of Drilled Holes

arXiv.org Machine Learning

In this paper, we present a 3D-CNN based method to learn distinct local geometric features of interest within an object. In this context, the voxelized representation may not be sufficient to capture the distinguishing information about such local features. To enable efficient learning, we augment the voxel data with surface normals of the object boundary. We then train a 3D-CNN with this augmented data and identify the local features critical for decision-making using 3D gradient-weighted class activation maps. An application of this feature identification framework is to recognize difficult-to-manufacture drilled hole features in a complex CAD geometry. The framework can be extended to identify difficult-to-manufacture features at multiple spatial scales leading to a real-time decision support system for design for manufacturability.
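The augmentation step can be sketched roughly as follows. This is not the authors' code: the abstract does not say how the normals are computed, so this sketch assumes a cubic binary occupancy grid and estimates each normal from central differences of occupancy. The names `occupancy` and `augment_with_normals` are hypothetical.

```python
import math


def occupancy(grid, x, y, z):
    """Return 1 if the voxel is filled, treating out-of-bounds as empty."""
    n = len(grid)
    if 0 <= x < n and 0 <= y < n and 0 <= z < n:
        return grid[x][y][z]
    return 0


def augment_with_normals(grid):
    """Turn an n x n x n occupancy grid into 4-channel voxels.

    Each voxel becomes (occupied, nx, ny, nz). The normal is an
    assumed estimate: the occupancy difference between the two
    neighbours along each axis, oriented from filled toward empty
    space. Interior and empty voxels get a zero normal.
    """
    n = len(grid)
    out = [[[(0, 0.0, 0.0, 0.0) for _ in range(n)] for _ in range(n)]
           for _ in range(n)]
    for x in range(n):
        for y in range(n):
            for z in range(n):
                if not grid[x][y][z]:
                    continue
                gx = occupancy(grid, x - 1, y, z) - occupancy(grid, x + 1, y, z)
                gy = occupancy(grid, x, y - 1, z) - occupancy(grid, x, y + 1, z)
                gz = occupancy(grid, x, y, z - 1) - occupancy(grid, x, y, z + 1)
                length = math.sqrt(gx * gx + gy * gy + gz * gz)
                if length > 0:
                    normal = (gx / length, gy / length, gz / length)
                else:
                    normal = (0.0, 0.0, 0.0)
                out[x][y][z] = (1, normal[0], normal[1], normal[2])
    return out


# A solid 3x3x3 cube: face voxels get axis-aligned outward normals.
cube = [[[1] * 3 for _ in range(3)] for _ in range(3)]
augmented = augment_with_normals(cube)
```

A 3D-CNN fed this representation would take four input channels per voxel instead of one, so that boundary orientation is available alongside occupancy.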


A Useful Motif for Flexible Task Learning in an Embodied Two-Dimensional Visual Environment

arXiv.org Machine Learning

Animals (especially humans) have an amazing ability to learn new tasks quickly, and switch between them flexibly. How brains support this ability is largely unknown, both neuroscientifically and algorithmically. One reasonable supposition is that modules drawing on an underlying general-purpose sensory representation are dynamically allocated on a per-task basis. Recent results from neuroscience and artificial intelligence suggest the role of the general purpose visual representation may be played by a deep convolutional neural network, and give some clues how task modules based on such a representation might be discovered and constructed. In this work, we investigate module architectures in an embodied two-dimensional touchscreen environment, in which an agent's learning must occur via interactions with an environment that emits images and rewards, and accepts touches as input. This environment is designed to capture the physical structure of the task environments that are commonly deployed in visual neuroscience and psychophysics. We show that in this context, very simple changes in the nonlinear activations used by such a module can significantly influence how fast it is at learning visual tasks and how suitable it is for switching to new tasks.


Comparing deep neural networks against humans: object recognition when the signal gets weaker

arXiv.org Machine Learning

Human visual object recognition is typically rapid and seemingly effortless, as well as largely independent of viewpoint and object orientation. Until very recently, animate visual systems were the only ones capable of this remarkable computational feat. This has changed with the rise of a class of computer vision algorithms called deep neural networks (DNNs) that achieve human-level classification performance on object recognition tasks. Furthermore, a growing number of studies report similarities in the way DNNs and the human visual system process objects, suggesting that current DNNs may be good models of human visual object recognition. Yet there clearly exist important architectural and processing differences between state-of-the-art DNNs and the primate visual system. The potential behavioural consequences of these differences are not well understood. We aim to address this issue by comparing human and DNN generalisation abilities towards image degradations. We find the human visual system to be more robust to image manipulations like contrast reduction, additive noise or novel eidolon-distortions. In addition, we find progressively diverging classification error-patterns between man and DNNs when the signal gets weaker, indicating that there may still be marked differences in the way humans and current DNNs perform visual object recognition. We envision that our findings as well as our carefully measured and freely available behavioural datasets provide a new useful benchmark for the computer vision community to improve the robustness of DNNs and a motivation for neuroscientists to search for mechanisms in the brain that could facilitate this robustness.


CBinfer: Change-Based Inference for Convolutional Neural Networks on Video Data

arXiv.org Artificial Intelligence

Extracting per-frame features using convolutional neural networks for real-time processing of video data is currently mainly performed on powerful GPU-accelerated workstations and compute clusters. However, there are many applications such as smart surveillance cameras that require or would benefit from on-site processing. To this end, we propose and evaluate a novel algorithm for change-based evaluation of CNNs for video data recorded with a static camera setting, exploiting the spatio-temporal sparsity of pixel changes. We achieve an average speed-up of 8.6x over a cuDNN baseline on a realistic benchmark with a negligible accuracy loss of less than 0.1% and no retraining of the network. The resulting energy efficiency is 10x higher than that of per-frame evaluation and reaches an equivalent of 328 GOp/s/W on the Tegra X1 platform.
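The idea of change-based evaluation can be illustrated with a toy single-channel example. This is a simplified sketch, not the CBinfer implementation, which targets GPUs: it detects changed pixels, works out which convolution outputs those pixels can influence, and recomputes only those. All function names and the `threshold` parameter are assumptions made for illustration.

```python
def conv2d_at(frame, kernel, r, c):
    """Zero-padded convolution (cross-correlation) output at one position."""
    h, w = len(frame), len(frame[0])
    k = len(kernel) // 2
    total = 0
    for dr in range(-k, k + 1):
        for dc in range(-k, k + 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w:
                total += kernel[dr + k][dc + k] * frame[rr][cc]
    return total


def conv2d_full(frame, kernel):
    """Baseline: recompute every output position of the frame."""
    return [[conv2d_at(frame, kernel, r, c) for c in range(len(frame[0]))]
            for r in range(len(frame))]


def conv2d_change_based(prev_frame, prev_out, new_frame, kernel, threshold=0):
    """Update prev_out by recomputing only outputs near changed pixels.

    With threshold=0 the result matches conv2d_full exactly; a larger
    threshold trades accuracy for fewer updates (assumed behaviour of
    this sketch, not a claim about the paper's algorithm).
    """
    h, w = len(new_frame), len(new_frame[0])
    k = len(kernel) // 2
    # Step 1: change detection on the input pixels.
    changed = [(r, c) for r in range(h) for c in range(w)
               if abs(new_frame[r][c] - prev_frame[r][c]) > threshold]
    # Step 2: every output within the kernel radius of a change is affected.
    affected = set()
    for r, c in changed:
        for dr in range(-k, k + 1):
            for dc in range(-k, k + 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w:
                    affected.add((rr, cc))
    # Step 3: reuse old outputs and recompute only the affected ones.
    out = [row[:] for row in prev_out]
    for r, c in affected:
        out[r][c] = conv2d_at(new_frame, kernel, r, c)
    return out, len(affected)


# Static 6x6 scene in which a single pixel changes between frames.
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
prev_frame = [[0] * 6 for _ in range(6)]
new_frame = [row[:] for row in prev_frame]
new_frame[2][3] = 5
prev_out = conv2d_full(prev_frame, kernel)
new_out, recomputed = conv2d_change_based(prev_frame, prev_out, new_frame, kernel)
```

In this toy case only 9 of the 36 output positions are recomputed, which is the kind of sparsity a static camera makes available.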


AI Acquires Spatial Reasoning Abilities, in a Victory for Our Machine Overlords - ExtremeTech

#artificialintelligence

The DeepMind paper focuses on spatial reasoning, in particular the ability to grasp the relation of objects to each other. This may sound simple compared with becoming an expert in chess or the like. But it's only because humans possess something like an "intuitive physics engine," an algorithm for extrapolating three-dimensionality from flat images and comparing objects within it to other objects. This kind of spatial reasoning has proved difficult for computers, at least until now. Using a combination of relation networks and convolutional neural networks, the DeepMind system can answer questions concerning the relation of objects within an image.