Deep Learning
Four-Armed Marimba Robot Uses Deep Learning to Compose Its Own Music
The Georgia Tech Center for Music Technology, led by Gil Weinberg, has a reputation for doing incredible musical things with robots, with a mix of creativity and technical expertise in robotics and AI. We've seen projects like a cybernetic second arm for a drummer, a cybernetic third arm (!) for a drummer, and a bunch of interesting research on ways that robots can dynamically collaborate with humans in the context of improvisational music. That last line of research usually features Shimon, a four-armed expressive robotic marimba player, which can analyze music in real time and improvise along with human performers. It's an impressive thing to watch, but Shimon's talents were mostly restricted to riffing on what human musicians were doing. Now, Shimon has leveraged deep learning to create structured, coherent, and entirely original compositions of its own.
AI Holds Promise of Improving Doctors' Diagnoses
Radiologists can scrutinize hundreds of images before identifying an area of concern in a patient's body. But a type of artificial intelligence known as deep learning could soon help medical experts pinpoint problems faster and more accurately, says Dr. Michalski, executive director of the Boston-based Center for Clinical Data Science at Massachusetts General Hospital and Brigham and Women's Hospital. Deep learning includes algorithms, or computer programs, that search for, identify and analyze problems without direction from people, though many humans still guide the algorithms today. For months, Dr. Michalski has used hundreds of thousands of medical images to train a deep-learning system to detect pulmonary nodules and strokes, measure tumors and look for traumatic injury and fractures, among other tasks. The early results "are promising," he says. "The technology really does work."
This AI can recognize protesters at rallies, even in disguise
A group of researchers from the UK and India has developed a deep learning system that enhances facial recognition, identifying people even when certain physical features are obscured. Translation: you could be identified from surveillance footage and photos even if you're covering your face with a hat, scarf, sunglasses or a beard. The system looks at 14 points on a face and measures the distances between them in order to recognize people. By training the AI with two datasets containing 2,000 images each (one with simple backgrounds and another with more varied elements, including multiple people per picture), the team has been able to achieve a 56 percent success rate in identifying subjects in disguise. So yes, the technology hasn't been perfected yet.
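To make the "14 points and the distances between them" idea concrete, here is a minimal sketch of turning a set of facial keypoints into a distance-based feature vector. The function name, the use of 2-D coordinates, and the scale normalization are assumptions made for illustration; the summary does not say how the researchers actually encode or compare the distances.

```python
import math
from itertools import combinations


def landmark_distance_features(landmarks):
    """Pairwise Euclidean distances between 2-D keypoints.

    Hypothetical helper: dividing by the largest distance is an
    assumption, added so the features do not depend on image scale.
    """
    if len(landmarks) < 2:
        raise ValueError("need at least two landmarks")
    dists = [math.dist(p, q) for p, q in combinations(landmarks, 2)]
    scale = max(dists)
    if scale == 0:
        raise ValueError("all landmarks coincide")
    return [d / scale for d in dists]


# Made-up coordinates standing in for 14 detected facial keypoints.
points = [(float(i), float(i * i % 7)) for i in range(14)]
features = landmark_distance_features(points)
print(len(features))  # 14 points give 14 * 13 / 2 = 91 pairwise distances
```

A vector like this could then be fed to a classifier or compared against stored vectors; which of those the real system does is not stated in the summary.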
A Framework for Generalizing Graph-based Representation Learning Methods
Ahmed, Nesreen K., Rossi, Ryan A., Zhou, Rong, Lee, John Boaz, Kong, Xiangnan, Willke, Theodore L., Eldardiry, Hoda
Random walks are at the heart of many existing deep learning algorithms for graph data. However, such algorithms have many limitations that arise from the use of random walks, e.g., the features resulting from these methods are unable to transfer to new nodes and graphs as they are tied to node identity. In this work, we introduce the notion of attributed random walks which serves as a basis for generalizing existing methods such as DeepWalk, node2vec, and many others that leverage random walks. Our proposed framework enables these methods to be more widely applicable for both transductive and inductive learning as well as for use on graphs with attributes (if available). This is achieved by learning functions that generalize to new nodes and graphs. We show that our proposed framework is effective with an average AUC improvement of 16.1% while requiring on average 853 times less space than existing methods on a variety of graphs from several domains.
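A rough sketch of what an attributed random walk might look like: instead of recording node identities, the walk records a type computed from each node's attributes, so the resulting sequences are not tied to particular nodes. The function names, the adjacency-dict representation, and the degree-based "hub"/"leaf" typing are all assumptions for illustration; the abstract does not specify how the mapping from attributes to types is defined or learned.

```python
import random


def attributed_random_walk(adj, node_type, start, length, rng):
    """Uniform random walk that emits node types rather than node ids.

    adj: dict mapping each node to a list of its neighbors.
    node_type: hypothetical function mapping a node to a type label
               derived from its attributes.
    """
    walk = [node_type(start)]
    current = start
    for _ in range(length - 1):
        neighbors = adj[current]
        if not neighbors:
            break  # dead end: stop the walk early
        current = rng.choice(neighbors)
        walk.append(node_type(current))
    return walk


# Toy star graph; degree is the only attribute used here (an assumption).
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}


def degree_type(node):
    return "hub" if len(adj[node]) >= 2 else "leaf"


walk = attributed_random_walk(adj, degree_type, "a", 5, random.Random(0))
print(walk)  # ['hub', 'leaf', 'hub', 'leaf', 'hub']
```

Because the sequence contains only types, relabeling the nodes (or moving to a new graph with the same attribute structure) yields the same kind of sequence, which is the property that lets downstream embeddings transfer to unseen nodes.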
Normalized Direction-preserving Adam
Zhang, Zijun, Ma, Lin, Li, Zongpeng, Wu, Chuan
Optimization algorithms for training deep models not only affect the convergence rate and stability of the training process, but are also highly related to the generalization performance of the models. While adaptive algorithms, such as Adam and RMSprop, have shown better optimization performance than stochastic gradient descent (SGD) in many scenarios, they often lead to worse generalization performance than SGD when used for training deep neural networks (DNNs). In this work, we identify two problems of Adam that may degrade the generalization performance. As a solution, we propose the normalized direction-preserving Adam (ND-Adam) algorithm, which combines the best of both worlds, i.e., the good optimization performance of Adam and the good generalization performance of SGD. In addition, we further improve the generalization performance in classification tasks by using batch-normalized softmax. This study suggests the need for more precise control over the training process of DNNs.
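The abstract does not spell out the update rule, so the following is only one plausible reading of "normalized direction-preserving": keep each weight vector at unit norm, drop the gradient component along the weight vector, and scale the whole vector by a single adaptive step size (rather than one per coordinate) so the update keeps the direction of the momentum. The function name, hyperparameter defaults, and every step below are assumptions, not the authors' published algorithm.

```python
import math


def nd_adam_step(w, grad, m, v, t, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """One hypothetical ND-Adam-style update for a single weight vector.

    w is assumed to have unit norm; m is the first-moment vector,
    v is a scalar second moment shared by the whole vector, t >= 1.
    """
    # Remove the gradient component parallel to w (w has unit norm).
    dot = sum(gi * wi for gi, wi in zip(grad, w))
    g = [gi - dot * wi for gi, wi in zip(grad, w)]
    # Adam-style moments, but the second moment tracks the squared norm.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = beta2 * v + (1 - beta2) * sum(gi * gi for gi in g)
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = v / (1 - beta2 ** t)
    # One step size for the whole vector preserves the update direction.
    step = lr / (math.sqrt(v_hat) + eps)
    w_new = [wi - step * mi for wi, mi in zip(w, m_hat)]
    # Project back onto the unit sphere.
    norm = math.sqrt(sum(wi * wi for wi in w_new))
    return [wi / norm for wi in w_new], m, v


w, m, v = nd_adam_step([1.0, 0.0], [3.0, 4.0], [0.0, 0.0], 0.0, t=1)
print(w)
```

In this sketch a gradient that points purely along the weight vector produces no change at all, since only the tangential part of the gradient can move a unit-norm vector.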
Deep Asymmetric Multi-task Feature Learning
Lee, Hae Beom, Yang, Eunho, Hwang, Sung Ju
We propose Deep Asymmetric Multitask Feature Learning (Deep-AMTFL) which can learn deep representations shared across multiple tasks while effectively preventing negative transfer that may happen in the feature sharing process. Specifically, we introduce an asymmetric autoencoder term that allows reliable predictors for the easy tasks to have high contribution to the feature learning while suppressing the influences of unreliable predictors for more difficult tasks. This allows the learning of less noisy representations, and enables unreliable predictors to exploit knowledge from the reliable predictors via the shared latent features. Such asymmetric knowledge transfer through shared features is also more scalable and efficient than inter-task asymmetric transfer. We validate our Deep-AMTFL model on multiple benchmark datasets for multitask learning and image classification, on which it significantly outperforms existing symmetric and asymmetric multitask learning models, by effectively preventing negative transfer in deep feature learning.
SAM: Semantic Attribute Modulation for Language Modeling and Style Variation
Hu, Wenbo, Hua, Lifeng, Li, Lei, Su, Hang, Wang, Tian, Chen, Ning, Zhang, Bo
This paper presents Semantic Attribute Modulation (SAM) for language modeling and style variation. The semantic attribute modulation includes various document attributes, such as titles, authors, and document categories. We consider two types of attributes (title attributes and category attributes) and a flexible attribute selection scheme that automatically scores them via an attribute attention mechanism. The semantic attributes are embedded into the hidden semantic space as the generation inputs. With the attributes properly harnessed, our proposed SAM can generate interpretable texts with regard to the input attributes. Qualitative analysis, including word semantic analysis and attention values, shows the interpretability of SAM. On several typical text datasets, we empirically demonstrate the superiority of the Semantic Attribute Modulated language model with different combinations of document attributes. Moreover, we present a style variation for lyric generation using SAM, which shows a strong connection between the style variation and the semantic attributes.
New Breakthroughs from DeepMind – Relational Networks and Visual Interaction Networks
Given enough GPUs, distributed machine learning systems (such as the one Facebook published earlier this week) excel at recognizing and labeling images. These systems can quickly and accurately determine whether a dog is in the image, but struggle to answer relational questions. For example, computer vision software cannot determine whether the dog in the picture is bigger than the ball it is playing with or the couch it is sitting on. While humans can reason about physical relationships between objects, computers had yet to make that connection until now. DeepMind, the creators of AlphaGo, quietly published two groundbreaking research papers in this area, demonstrating a way to train relational reasoning using deep neural networks.
Named Entity Recognition and the Road to Deep Learning
Not so very long ago, Natural Language Processing looked very different. In sequence labelling tasks such as Named Entity Recognition, Conditional Random Fields were the go-to model. The main challenge for NLP engineers consisted in finding good features that captured their data well. Today, deep learning has replaced CRFs at the forefront of sequence labelling, and the focus has shifted from feature engineering to designing and implementing effective neural network architectures. Still, the old and the new-style NLP are not diametrically opposed: just as it is possible (and useful!) to incorporate neural-network features into a CRF, CRFs have influenced some of the best deep learning models for sequence labelling.
Artificial Intelligence with Python: Prateek Joshi: 9781786464392: Amazon.com: Books
Prateek Joshi is an artificial intelligence researcher, published author of five books, and TEDx speaker. He is the founder of Pluto AI, a venture-funded Silicon Valley startup building an analytics platform for smart water management powered by deep learning. His work in this field has led to patents, tech demos, and research papers at major IEEE conferences. He has been an invited speaker at technology and entrepreneurship conferences including TEDx, AT&T Foundry, Silicon Valley Deep Learning, and Open Silicon Valley. Prateek has also been featured as a guest author in prominent tech magazines.