 Deep Learning


A Response to Yann LeCun's Response. – Yoav Goldberg – Medium

#artificialintelligence

I appreciate the interest and debate around my post, and Yann's response on Facebook. Let me respond to the response. I already spend tons of time on one social network and try not to be dragged into another. Also, here I have better formatting options, and better control over the content over time. Yann referred to my previous clarification post as backpedaling.


Exploring LSTMs

#artificialintelligence

The first time I learned about LSTMs, my eyes glazed over. It turns out LSTMs are a fairly simple extension to neural networks, and they're behind a lot of the amazing achievements deep learning has made in the past few years. So I'll try to present them as intuitively as possible – in such a way that you could have discovered them yourself. Imagine we have a sequence of images from a movie, and we want to label each image with an activity (is this a fight?, are the characters talking?, are the characters eating?). One way is to ignore the sequential nature of the images, and build a per-image classifier that considers each image in isolation.


Artificial Intelligence Help

#artificialintelligence

No matter what your philosophical view of our future, the focus on AI/Machine Learning in analytics increasingly corresponds to the next logical step: gaining advanced insights from Big Data, accurately predicting outcomes, improving productivity, and gaining competitive advantage. Now, AI/Machine Learning is driving us forward, and the combination of Big Data and AI will present incredible opportunities and drive innovation across almost all industries. Join us on October 5 for our dedicated Pre-Conference Focus Day, "Machine Learning, Deep Learning and AI for Strategic Innovation," and hear how leading companies are using AI in innovative ways.


The games people play smarten up AI

#artificialintelligence

Just as we learn from our mistakes, they are--well, learning from our mistakes too. A database of Atari gameplay was unleashed to show AI what's what with gameplay. Tech sites are talking about this Atari Grand Challenge dataset--worthy of attention as the dataset, said Dan Robitzski in Inverse, gives AI systems access to new ways of learning and honing skills over time. The researchers stated in their paper that "We collect and describe a large dataset of human Atari 2600 replays – the largest and most diverse such data set publicly released to date." Jordan Pearson, Motherboard: "Computer scientists from RWTH Aachen University in Germany and Microsoft Research have released the largest-ever database of human playthroughs for some of the most popular games for the Atari 2600."


AI Getting Better At Predicting When You'll Die

#artificialintelligence

Thinking about how and when you'll die might be morbid, but it has crept into everyone's mind at some point. Online tools like The Death Clock provide a very unscientific, and entertaining, prediction of your demise, but researchers have figured out a way to estimate a person's lifespan with 69 percent accuracy. In a very small study of 48 participants, all of whom were at least 60 years old, scientists from the University of Adelaide in Australia analyzed photos of people's organs using artificial intelligence. They were able to predict who would die within five years with 69 percent accuracy, which is roughly the same as an oncologist's. Using deep learning, which involves inputting data into a computer system to help it make decisions, the researchers used radiological images because they provide undetectable clues, according to study co-author and epidemiologist Dr. Lyle Palmer, Ph.D, in a story on ResearchGate.


Artificial Intelligence Processing Moving from Cloud to Edge

#artificialintelligence

The recent rise of artificial intelligence (AI) can be partly attributed to improvements in graphics processing units (GPUs), mostly deployed in cloud server architectures. GPUs are massively parallel processors that map well to the large number of vector and matrix multiplication calculations that need to be performed in deep learning. GPUs were originally designed to perform matrix multiplication operations for three-dimensional (3D) computer graphics, but it turns out that deep learning applications have similar requirements, and GPUs have been successful in accelerating the training and inference of AI algorithms. Hyperscale internet companies, including Google, Facebook, Amazon, and Microsoft, have built massive cloud server farms that can perform industrial-scale training and inference operations for AI, fueled by the troves of consumer data they collect, further improving their AI algorithms. NVIDIA has been the main beneficiary of this trend, as its GPUs power the majority of these cloud-based AI data centers.


How Will Artificial Intelligence Transform The Workplace?

#artificialintelligence

I'll be the first to admit that the outlook was a little bleak. When you consider the fact that these machines may be allowed to make decisions that affect mankind, without having or after evolving past the innate ethics that (most) people operate by, the future can look scary. It's time to scope out the positives now, because when it comes to AI in the workplace, there are HUGE benefits in store. Tech giants are racing to get their slice of the AI pie, and are adding more fuel to the fire: Google bought DeepMind in 2014. Facebook is planning to use neural networks to 'narrate' photos to blind users, and is using deep learning to find out what its users actually want.


Google Is Already Late to China's AI Revolution

#artificialintelligence

Sitting on a stage in Wuzhen, China, a historic city up the river from Shanghai, Google chairman Eric Schmidt described what he called "the age of intelligence." He trumpeted the rise of deep neural networks and other techniques that allow machines to learn tasks largely on their own, either by finding patterns in vast amounts of data or through their own trial and error. At Google, using a sweeping software tool called TensorFlow, engineers have built deep learning systems that can identify faces and objects in photos, recognize commands spoken into smartphones, and translate one language into another. Schmidt called this the biggest technological change of his lifetime. Then he mentioned China's three largest internet companies: Baidu, Tencent, and Alibaba.


Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters

arXiv.org Machine Learning

Deep learning models can take weeks to train on a single GPU-equipped machine, necessitating scaling out DL training to a GPU cluster. However, current distributed DL implementations can scale poorly due to substantial parameter synchronization over the network, because the high throughput of GPUs allows more data batches to be processed per unit time than CPUs, leading to more frequent network synchronization. We present Poseidon, an efficient communication architecture for distributed DL on GPUs. Poseidon exploits the layered model structures in DL programs to overlap communication and computation, reducing bursty network communication. Moreover, Poseidon uses a hybrid communication scheme that optimizes the number of bytes required to synchronize each layer, according to layer properties and the number of machines. We show that Poseidon is applicable to different DL frameworks by plugging Poseidon into Caffe and TensorFlow. We show that Poseidon enables Caffe and TensorFlow to achieve a 15.5x speed-up on 16 single-GPU machines, even with limited bandwidth (10GbE) and the challenging VGG19-22K network for image classification. Moreover, Poseidon-enabled TensorFlow achieves a 31.5x speed-up with 32 single-GPU machines on Inception-V3, a 50% improvement over the open-source TensorFlow (20x speed-up).
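
To put the quoted speed-ups in context, the short sketch below computes scaling efficiency (speed-up divided by machine count) and the relative gain over the baseline. Only the machine counts and speed-up figures come from the abstract; the function names and the `REPORTED` table are illustrative assumptions, not part of Poseidon.

```python
# Speed-up figures quoted in the abstract: (label, machines, speed-up).
# The labels and this table layout are illustrative, not from the paper.
REPORTED = [
    ("Poseidon (Caffe/TensorFlow), VGG19-22K", 16, 15.5),
    ("Poseidon TensorFlow, Inception-V3", 32, 31.5),
    ("Open-source TensorFlow, Inception-V3", 32, 20.0),
]


def scaling_efficiency(speedup: float, machines: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup / machines


def relative_improvement(new_speedup: float, baseline_speedup: float) -> float:
    """Relative gain of one speed-up over another (0.5 means 50% faster)."""
    return new_speedup / baseline_speedup - 1.0


if __name__ == "__main__":
    for label, machines, speedup in REPORTED:
        eff = scaling_efficiency(speedup, machines)
        print(f"{label}: {eff:.1%} of linear scaling on {machines} machines")
    gain = relative_improvement(31.5, 20.0)
    print(f"Poseidon vs. open-source TensorFlow on Inception-V3: {gain:.1%} faster")
```

Under this simple measure, the Poseidon figures sit close to linear scaling, while the 20x baseline on 32 machines corresponds to 62.5% efficiency; the ratio 31.5/20 is about 1.58, in line with the roughly 50% improvement the abstract states.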


An Online Learning Approach to Generative Adversarial Networks

arXiv.org Machine Learning

We consider the problem of training generative models with a Generative Adversarial Network (GAN). Although GANs can accurately model complex distributions, they are known to be difficult to train due to instabilities caused by a difficult minimax optimization problem. In this paper, we view the problem of training GANs as finding a mixed strategy in a zero-sum game. Building on ideas from online learning, we propose a novel training method named Chekhov GAN. On the theory side, we show that our method provably converges to an equilibrium for semi-shallow GAN architectures, i.e. architectures where the discriminator is a one-layer network and the generator is arbitrary. On the practical side, we develop an efficient heuristic guided by our theoretical results, which we apply to commonly used deep GAN architectures. On several real-world tasks our approach exhibits improved stability and performance compared to standard GAN training.
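
For readers unfamiliar with the minimax problem the abstract refers to, the standard GAN objective is commonly written as below. The notation ($G$, $D$, $p_{\text{data}}$, $p_z$, $V$) is the usual textbook one and does not appear in the abstract; the second line is only a generic sketch of what a mixed strategy in a zero-sum game could look like, assuming distributions $\mathcal{P}_G$ and $\mathcal{P}_D$ over generators and discriminators, and is not necessarily the exact formulation used in the paper.

```latex
% Standard GAN minimax objective (pure strategies)
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]

% Generic mixed-strategy relaxation (assumed notation): each player
% randomizes over its own models and the payoff is taken in expectation
\min_{\mathcal{P}_G}\,\max_{\mathcal{P}_D}\;
  \mathbb{E}_{G \sim \mathcal{P}_G,\; D \sim \mathcal{P}_D}\big[V(D, G)\big]
```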