Deep Learning
Google's recent acquisition of Halli Labs, an artificial intelligence (AI) and machine learning (ML) startup founded by IIT-Delhi alumnus Pankaj Gupta, has fuelled Bengaluru's ambition of becoming the hub of AI and ML product startups. Halli, which means "village" in Kannada, was born five months ago in Bengaluru to develop solutions to traditional problems using AI, ML, deep learning and natural language processing technologies. The company says it is focused on building deep learning and ML systems to address 'old problems'. Besides IBM's Watson, which the company describes as a "cognitive" system that uses artificial intelligence (AI) technologies mostly in healthcare and education, and IPsoft's Amelia, Microsoft Corporation's AI and Research Group, Amazon AI Services, Facebook AI Research (FAIR) and OpenAI, a non-profit lab partly funded by Elon Musk of Tesla, are doing enormous work in this area.
Artificial Intelligence to drive next wave of startups
Google's recent acquisition of Halli Labs, an artificial intelligence (AI) and machine learning (ML) startup founded by IIT-Delhi alumnus Pankaj Gupta, has fuelled Bengaluru's ambition of becoming the hub of AI and ML product startups. Halli, which means "village" in Kannada, was born five months ago in Bengaluru to develop solutions to traditional problems using AI, ML, deep learning and natural language processing technologies. Commenting on the development, Google's vice-president for product management Caesar Sengupta tweeted, "Welcome Pankaj and the team at Halli Labs to Google." The company says it is focused on building deep learning and ML systems to address 'old problems'. Gupta said the company will be joining Google's Next Billion Users team. "Halli Labs will help get more technology and information into more people's hands around the world," he said. Gupta is interested in the areas of personalisation, applied machine learning, AI, user growth and engagement, search, recommendation and discovery products, distributed systems, graph infrastructure and algorithms. He has published over 30 papers and filed more than 20 patent applications. Google and its parent company Alphabet are vigorously pursuing acqui-hiring in AI startups, along with other technology giants Baidu, Samsung, Microsoft, Apple, Facebook and Snap. According to a startup founder working in a similar space, AI and ML are still in their initial stages, just as smartphones and mobile apps were a decade ago. "All startup founders are very much aware of its importance."
Teaching A.I. Systems to Behave Themselves
At OpenAI, the artificial intelligence lab founded by Tesla's chief executive, Elon Musk, machines are teaching themselves to behave like humans. But sometimes, this goes wrong. Sitting inside OpenAI's San Francisco offices on a recent afternoon, the researcher Dario Amodei showed off an autonomous system that taught itself to play Coast Runners, an old boat-racing video game. The winner is the boat with the most points that also crosses the finish line. The result was surprising: The boat was far too interested in the little green widgets that popped up on the screen.
Learning to Plan Chemical Syntheses
Segler, Marwin H. S., Preuss, Mike, Waller, Mark P.
From medicines to materials, small organic molecules are indispensable for human well-being. To plan their syntheses, chemists employ a problem-solving technique called retrosynthesis. In retrosynthesis, target molecules are recursively transformed into increasingly simpler precursor compounds until a set of readily available starting materials is obtained. Computer-aided retrosynthesis would be a highly valuable tool; however, past approaches were slow and provided results of unsatisfactory quality. Here, we employ Monte Carlo Tree Search (MCTS) to efficiently discover retrosynthetic routes. MCTS was combined with an expansion policy network that guides the search, and an "in-scope" filter network to pre-select the most promising retrosynthetic steps. These deep neural networks were trained on 12 million reactions, which represent essentially all reactions ever published in organic chemistry. Our system solves almost twice as many molecules and is 30 times faster in comparison to the traditional search method based on extracted rules and hand-coded heuristics. Finally, after a 60-year history of computer-aided synthesis planning, chemists can no longer distinguish between routes generated by a computer system and real routes taken from the scientific literature. We anticipate that our method will accelerate drug and materials discovery by assisting chemists to plan better syntheses faster, and by enabling fully automated robot synthesis.
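The recursive idea behind retrosynthesis can be illustrated with a toy sketch. This is not the MCTS-plus-neural-network system the abstract describes: it is a plain depth-first search, and the molecule names, the `rules` table and the `purchasable` set are all hypothetical placeholders chosen for illustration.

```python
def plan_route(target, rules, purchasable, max_depth=5):
    """Return a list of (product, precursors) steps, or None if no route is found.

    Plain depth-first search, used here only to show the recursion;
    the abstract's system uses MCTS guided by neural networks instead.
    """
    if target in purchasable:
        return []  # readily available: nothing to synthesise
    if max_depth == 0:
        return None  # give up on routes that get too deep (also stops cycles)
    for precursors in rules.get(target, []):
        route = []
        for precursor in precursors:
            sub_route = plan_route(precursor, rules, purchasable, max_depth - 1)
            if sub_route is None:
                break  # this disconnection fails, try the next one
            route.extend(sub_route)
        else:
            # Every precursor was solved; the step making `target` comes last.
            return route + [(target, precursors)]
    return None


# Hypothetical toy data: each product maps to alternative precursor sets.
rules = {
    "target": [("intermediate", "reagent_x")],
    "intermediate": [("dead_end",), ("block_a", "block_b")],
}
purchasable = {"reagent_x", "block_a", "block_b"}

route = plan_route("target", rules, purchasable)
for product, precursors in route:
    print(" + ".join(precursors), "->", product)
```

The steps come back in forward order, so the route reads as a synthesis plan from starting materials to the target. Exhaustive search of this kind scales poorly as the rule table grows, which is the kind of cost a guided search is meant to reduce.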
A Dual-Stage Attention-Based Recurrent Neural Network for Time Series Prediction
Qin, Yao, Song, Dongjin, Chen, Haifeng, Cheng, Wei, Jiang, Guofei, Cottrell, Garrison
The Nonlinear autoregressive exogenous (NARX) model, which predicts the current value of a time series based upon its previous values as well as the current and past values of multiple driving (exogenous) series, has been studied for decades. Despite the fact that various NARX models have been developed, few of them can capture the long-term temporal dependencies appropriately and select the relevant driving series to make predictions. In this paper, we propose a dual-stage attention-based recurrent neural network (DA-RNN) to address these two issues. In the first stage, we introduce an input attention mechanism to adaptively extract relevant driving series (a.k.a., input features) at each time step by referring to the previous encoder hidden state. In the second stage, we use a temporal attention mechanism to select relevant encoder hidden states across all time steps. With this dual-stage attention scheme, our model can not only make predictions effectively, but can also be easily interpreted. Thorough empirical studies based upon the SML 2010 dataset and the NASDAQ 100 Stock dataset demonstrate that the DA-RNN can outperform state-of-the-art methods for time series prediction.
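The two attention stages can be sketched in a few lines of plain Python. This is a simplified illustration, not the DA-RNN itself: in the paper the relevance scores are produced by learned networks conditioned on the hidden state, whereas here they are passed in as hypothetical inputs, and the function names are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def input_attention(driving_values, relevance_scores):
    """Stage 1: reweight the driving series observed at a single time step."""
    weights = softmax(relevance_scores)
    weighted = [w * x for w, x in zip(weights, driving_values)]
    return weighted, weights


def temporal_attention(encoder_states, relevance_scores):
    """Stage 2: blend encoder hidden states from all time steps into one context vector."""
    weights = softmax(relevance_scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, encoder_states))
        for d in range(dim)
    ]
    return context, weights


# Hypothetical scores: the second driving series is judged most relevant.
weighted, alpha = input_attention([1.0, 2.0, 3.0], [0.1, 2.0, 0.3])

# Equal scores give equal weight to both time steps.
context, beta = temporal_attention([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

Because both sets of weights are non-negative and sum to one, they can be read directly as the relative importance of each driving series or time step, which is the basis of the interpretability claim in the abstract.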
Graph Classification via Deep Learning with Virtual Nodes
Pham, Trang, Tran, Truyen, Dam, Hoa, Venkatesh, Svetha
Learning representation for graph classification turns a variable-size graph into a fixed-size vector (or matrix). Such a representation works nicely with algebraic manipulations. Here we introduce a simple method to augment an attributed graph with a virtual node that is bidirectionally connected to all existing nodes. The virtual node represents the latent aspects of the graph, which are not immediately available from the attributes and local connectivity structures. The expanded graph is then put through any node representation method. The representation of the virtual node is then the representation of the entire graph. In this paper, we use the recently introduced Column Network for the expanded graph, resulting in a new end-to-end graph classification model dubbed Virtual Column Network (VCN). The model is validated on two tasks: (i) predicting bio-activity of chemical compounds, and (ii) finding software vulnerability from source code. Results demonstrate that VCN is competitive against well-established rivals.
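The graph augmentation step is simple enough to sketch directly. The snippet below assumes a directed graph stored as a dictionary of neighbour sets in which every node appears as a key; the function name and the `"virtual"` identifier are hypothetical, and the node representation method applied afterwards (the Column Network in the paper) is not shown.

```python
def add_virtual_node(adjacency, virtual_id="virtual"):
    """Return a copy of the graph with one extra node linked both ways to every node."""
    if virtual_id in adjacency:
        raise ValueError("virtual node id is already used in the graph")
    # Copy the neighbour sets so the input graph is left unchanged.
    expanded = {node: set(neighbours) for node, neighbours in adjacency.items()}
    original_nodes = list(expanded)
    for node in original_nodes:
        expanded[node].add(virtual_id)  # edge: node -> virtual node
    expanded[virtual_id] = set(original_nodes)  # edges: virtual node -> every node
    return expanded


graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
expanded = add_virtual_node(graph)
```

After augmentation the virtual node is one hop away from every original node, so whatever representation a node-level method computes for it can draw on the whole graph.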
Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner
Chen, Tseng-Hung, Liao, Yuan-Hong, Chuang, Ching-Yao, Hsu, Wan-Ting, Fu, Jianlong, Sun, Min
Impressive image captioning results are achieved in domains with plenty of training image and sentence pairs (e.g., MSCOCO). However, transferring to a target domain with significant domain shifts but no paired training data (referred to as cross-domain image captioning) remains largely unexplored. We propose a novel adversarial training procedure to leverage unpaired data in the target domain. Two critic networks are introduced to guide the captioner, namely domain critic and multi-modal critic. The domain critic assesses whether the generated sentences are indistinguishable from sentences in the target domain. The multi-modal critic assesses whether an image and its generated sentence are a valid pair. During training, the critics and captioner act as adversaries -- captioner aims to generate indistinguishable sentences, whereas critics aim at distinguishing them. The assessment improves the captioner through policy gradient updates. During inference, we further propose a novel critic-based planning method to select high-quality sentences without additional supervision (e.g., tags). To evaluate, we use MSCOCO as the source domain and four other datasets (CUB-200-2011, Oxford-102, TGIF, and Flickr30k) as the target domains. Our method consistently performs well on all datasets. In particular, on CUB-200-2011, we achieve 21.8% CIDEr-D improvement after adaptation. Utilizing critics during inference further gives another 4.5% boost.
deepmind/pysc2
This is a collaboration between DeepMind and Blizzard to develop StarCraft II into a rich environment for RL research. PySC2 provides an interface for RL agents to interact with StarCraft II, getting observations and sending actions. We have published an accompanying blog post and paper, which outline our motivation for using StarCraft II for DeepRL research, and some initial research results using the environment. Disclaimer: This is not an official Google product. You can reach us at pysc2@deepmind.com.
OpenAI bot remains undefeated against world's greatest Dota 2 players
Last night, OpenAI's Dota 2 bot beat the world's most celebrated professional players in one-on-one battles, showing just how advanced these machine learning systems are getting. The bot beat Danil "Dendi" Ishutin rather easily at The International, one of the biggest eSports events in the world, and remains undefeated against the world's top Dota 2 players. Elon Musk's OpenAI trained the bot by simply copying the AI and letting the two play each other for weeks on end. "We've coached it to learn just from playing against itself," said OpenAI researcher Jakub Pachocki. "So we didn't hard-code in any strategy, we didn't have it learn from human experts, just from the very beginning, it just keeps playing against a copy of itself. It starts from complete randomness and then it makes very small improvements, and eventually it's just pro level."
IBM's Breakthrough Distributed Computation for Deep Learning Workloads
Then the requirements change; much more data needs to be added to the process to get the project done in a reasonable span of time. Logic says that all you need to do is add more horsepower to do the job. As Dana Carvey used to say in his comedy act when satirizing President George H.W. Bush: "Not gonna do it." That's right: Until today, adding more servers would not have solved the problem. Deep-learning analytics systems up to now have only been able to run on a single server; use cases simply haven't been scalable by adding more servers, and there are major backend reasons for that.