AITopics | Deep Learning

Deep Learning

New computational algorithms make it possible to build neural networks with many input nodes and many layers, and distinguish "deep learning" of these networks from previous work on artificial neural nets.

Intel, Apple Add to Artificial-Intelligence Deal Wave

WSJ.com: WSJD - Technology, Aug-14-2016, 00:37:39 GMT

Technology companies are hurriedly snapping up startups in the field of artificial intelligence, and Intel Corp. is the latest to join a buying spree fueled by one of the hottest trends in the tech sector. The chip maker on Tuesday announced plans to pay an undisclosed amount for Nervana Systems, a 48-employee company working on semiconductors, software and services to exploit a popular AI technique called deep learning. Intel's move follows a deal disclosed Friday by Apple Inc. to purchase Turi Inc., a Seattle-based specialist in the field. The two acquisitions add to a string of 31 purchases since 2011 of AI startups by large companies, according to venture-capital research firm CB Insights. Factoring in smaller acquirers, PricewaterhouseCoopers LLP counts 29 related acquisitions so far this year, suggesting the total deal count for 2016 will top the 37 deals announced last year.

artificial intelligence, deep learning, machine learning, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.16)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > California > San Diego County > San Diego (0.05)
(2 more...)

Industry: Information Technology > Services (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.44)

Reinforcement Renaissance

Communications of the ACM, Aug-14-2016, 00:35:35 GMT

Based in San Francisco, Marina Krakovsky is the author of The Middleman Economy: How Brokers, Agents, Dealers, and Everyday Matchmakers Create Value and Profit (Palgrave Macmillan, 2015).

artificial intelligence, machine learning, reinforcement learning, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.25)
North America > Canada > Alberta (0.15)
North America > United States > Tennessee (0.05)
(5 more...)

Industry:

Information Technology (0.71)
Leisure & Entertainment > Games > Computer Games (0.48)
Leisure & Entertainment > Games > Go (0.30)
Leisure & Entertainment > Games > Backgammon (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Games (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Recurrent Fully Convolutional Neural Networks for Multi-slice MRI Cardiac Segmentation

Poudel, Rudra P K, Lamata, Pablo, Montana, Giovanni

arXiv.org Machine Learning, Aug-13-2016

In cardiac magnetic resonance imaging, fully automatic segmentation of the heart enables precise structural and functional measurements to be taken, e.g. from short-axis MR images of the left ventricle. In this work we propose a recurrent fully-convolutional network (RFCN) that learns image representations from the full stack of 2D slices and has the ability to leverage inter-slice spatial dependencies through internal memory units. RFCN combines anatomical detection and segmentation into a single architecture that is trained end-to-end, thus significantly reducing computational time, simplifying the segmentation pipeline, and potentially enabling real-time applications. We report on an investigation of RFCN using two datasets, including the publicly available MICCAI 2009 Challenge dataset. Comparisons have been carried out between fully convolutional networks and deep restricted Boltzmann machines, including a recurrent version that leverages inter-slice spatial correlation. Our studies suggest that RFCN produces state-of-the-art results and can substantially improve the delineation of contours near the apex of the heart.

artificial intelligence, machine learning, segmentation, (18 more...)

arXiv:1608.03974

Country: Europe (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Tutorial on Variational Autoencoders

Doersch, Carl

arXiv.org Machine Learning, Aug-13-2016

In just three years, Variational Autoencoders (VAEs) have emerged as one of the most popular approaches to unsupervised learning of complicated distributions. VAEs are appealing because they are built on top of standard function approximators (neural networks), and can be trained with stochastic gradient descent. VAEs have already shown promise in generating many kinds of complicated data, including handwritten digits, faces, house numbers, CIFAR images, physical models of scenes, segmentation, and predicting the future from static images. This tutorial introduces the intuitions behind VAEs, explains the mathematics behind them, and describes some empirical behavior. No prior knowledge of variational Bayesian methods is assumed.

artificial intelligence, autoencoder, machine learning, (18 more...)

arXiv:1606.05908

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Faster Training of Very Deep Networks Via p-Norm Gates

Pham, Trang, Tran, Truyen, Phung, Dinh, Venkatesh, Svetha

arXiv.org Machine Learning, Aug-11-2016

A major contributing factor to the recent advances in deep neural networks is structural units that let sensory information and gradients propagate easily. Gating is one such structure that acts as a flow control. Gates are employed in many recent state-of-the-art recurrent models such as LSTM and GRU, and feedforward models such as Residual Nets and Highway Networks. This enables learning in very deep networks with a hundred layers and helps achieve record-breaking results in vision (e.g., ImageNet with Residual Nets) and NLP (e.g., machine translation with GRU). However, there is limited work on analysing the role of gating in the learning process. In this paper, we propose a flexible $p$-norm gating scheme, which allows user-controllable flow and, as a consequence, improves the learning speed. This scheme subsumes other existing gating schemes, including those in GRU, Highway Networks and Residual Nets, as special cases. Experiments on large sequence and vector datasets demonstrate that the proposed gating scheme helps improve the learning speed significantly without extra overhead.
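
To make the idea of a gate as flow control concrete, here is a minimal sketch of a single scalar Highway-style unit, y = t * h + (1 - t) * x, where the transform gate t decides how much of the transformed signal h passes versus the carried input x. This is the standard Highway Network form, not the p-norm scheme the paper proposes; the function names and scalar weights are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def highway_unit(x, w_h, b_h, w_t, b_t):
    """One scalar Highway-style unit (illustrative; all weights are hypothetical)."""
    h = math.tanh(w_h * x + b_h)   # candidate transformation of the input
    t = sigmoid(w_t * x + b_t)     # transform gate in (0, 1)
    return t * h + (1.0 - t) * x   # gate mixes transformed and carried signal


def stack(x, depth, **params):
    """Apply the same unit `depth` times, as a stand-in for a deep network."""
    for _ in range(depth):
        x = highway_unit(x, **params)
    return x
```

With a strongly negative gate bias the unit behaves almost like an identity map, which is one way to see why signals (and gradients) can survive many stacked layers; with a strongly positive bias it behaves like an ordinary tanh layer.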

artificial intelligence, highway network, machine learning, (18 more...)

arXiv:1608.03639

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Temporal Learning and Sequence Modeling for a Job Recommender System

Liu, Kuan, Shi, Xing, Kumar, Anoop, Zhu, Linhong, Natarajan, Prem

arXiv.org Machine Learning, Aug-10-2016

We present our solution to the job recommendation task for RecSys Challenge 2016. The main contribution of our work is to combine temporal learning with sequence modeling to capture complex user-item activity patterns to improve job recommendations. First, we propose a time-based ranking model applied to historical observations and a hybrid matrix factorization over time re-weighted interactions. Second, we exploit sequence properties in user-item activities and develop an RNN-based recommendation model. Our solution achieved 5th place in the challenge among more than 100 participants. Notably, the strong performance of our RNN approach shows a promising new direction in employing sequence modeling for recommendation systems.

artificial intelligence, deep learning, machine learning, (14 more...)

doi: 10.1145/2987538.2987540

arXiv:1608.03333

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A deep language model for software code

Dam, Hoa Khanh, Tran, Truyen, Pham, Trang

arXiv.org Machine Learning, Aug-9-2016

Existing language models for software code, such as n-grams, often fail to capture a long context where dependent code elements are scattered far apart. In this paper, we propose a novel approach to build a language model for software code to address this particular issue. Our language model, partly inspired by human memory, is built upon the powerful deep learning-based Long Short-Term Memory architecture that is capable of learning long-term dependencies, which occur frequently in software code. Results from our intrinsic evaluation on a corpus of Java projects have demonstrated the effectiveness of our language model. This work contributes to realizing our vision for DeepSoft, an end-to-end, generic deep learning-based framework for modeling software and its development process.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv:1608.02715

Country: North America > United States (0.30)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Towards Representation Learning with Tractable Probabilistic Models

Vergari, Antonio, Di Mauro, Nicola, Esposito, Floriana

arXiv.org Machine Learning, Aug-8-2016

Probabilistic models learned as density estimators can be exploited for representation learning, besides serving as toolboxes for answering inference queries. However, how to extract useful representations depends heavily on the particular model involved. We argue that tractable inference, i.e. inference that can be computed in polynomial time, can enable general schemes to extract features from black box models. We plan to investigate how Tractable Probabilistic Models (TPMs) can be exploited to generate embeddings by random query evaluations. We devise two experimental designs to assess and compare different TPMs as feature extractors in an unsupervised representation learning framework. We show some experimental results on standard image datasets by applying such a method to Sum-Product Networks and Mixtures of Trees as tractable models generating embeddings.

artificial intelligence, evaluation, machine learning, (17 more...)

arXiv:1608.02341

Country:

Europe (0.46)
North America > United States (0.29)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)

Bridging the Gap between Stochastic Gradient MCMC and Stochastic Optimization

Chen, Changyou, Carlson, David, Gan, Zhe, Li, Chunyuan, Carin, Lawrence

arXiv.org Machine Learning, Aug-5-2016

Stochastic gradient Markov chain Monte Carlo (SG-MCMC) methods are Bayesian analogs to popular stochastic optimization methods; however, this connection is not well studied. We explore this relationship by applying simulated annealing to an SG-MCMC algorithm. Furthermore, we extend recent SG-MCMC methods with two key components: i) adaptive preconditioners (as in AdaGrad or RMSprop), and ii) adaptive element-wise momentum weights. The zero-temperature limit gives a novel stochastic optimization method with adaptive element-wise momentum weights, while conventional optimization methods only have a shared, static momentum weight. Under certain assumptions, our theoretical analysis suggests the proposed simulated annealing approach converges close to the global optima. Experiments on several deep neural network models show state-of-the-art results compared to related stochastic optimization algorithms.
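
The link between sampling and optimization described above can be illustrated with a minimal sketch, assuming plain Langevin updates with a temperature schedule rather than the paper's preconditioned, momentum-based algorithm. Each step moves against the gradient and adds Gaussian noise scaled by sqrt(2 * step_size * T); as the temperature T is annealed to zero the noise vanishes and the update reduces to ordinary gradient descent. The function names, the toy quadratic target, and the geometric schedule are all assumptions made for illustration.

```python
import math
import random


def annealed_langevin(grad, theta, steps, step_size, temperature):
    """Langevin updates with a temperature schedule (illustrative sketch).

    grad: function returning a (possibly noisy) gradient of U at theta
    temperature: function mapping the step index to a temperature T >= 0
    """
    for t in range(steps):
        T = temperature(t)
        # Injected noise shrinks with the temperature; T = 0 gives plain descent.
        noise = random.gauss(0.0, 1.0) * math.sqrt(2.0 * step_size * T)
        theta = theta - step_size * grad(theta) + noise
    return theta


def toy_grad(theta):
    """Gradient of the toy potential U(theta) = (theta - 3)^2 / 2."""
    return theta - 3.0


random.seed(0)
# Hypothetical schedule: temperature decays geometrically toward zero.
result = annealed_langevin(toy_grad, theta=-5.0, steps=2000, step_size=0.05,
                           temperature=lambda t: 0.99 ** t)
```

Early on, the high temperature lets the iterate explore; by the end of the schedule the iterate settles near the minimizer of the toy potential at 3.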

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv:1512.07962

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Programmer: Inducing Latent Programs with Gradient Descent

Neelakantan, Arvind, Le, Quoc V., Sutskever, Ilya

arXiv.org Machine Learning, Aug-4-2016

Deep neural networks have achieved impressive supervised classification performance in many tasks including image recognition, speech recognition, and sequence to sequence learning. However, this success has not been translated to applications like question answering that may involve complex arithmetic and logic reasoning. A major limitation of these models is in their inability to learn even simple arithmetic and logic operations. For example, it has been shown that neural networks fail to learn to add two binary numbers reliably. In this work, we propose Neural Programmer, an end-to-end differentiable neural network augmented with a small set of basic arithmetic and logic operations. Neural Programmer can call these augmented operations over several steps, thereby inducing compositional programs that are more complex than the built-in operations. The model learns from a weak supervision signal which is the result of execution of the correct program, hence it does not require expensive annotation of the correct program itself. The decisions of what operations to call, and what data segments to apply to are inferred by Neural Programmer. Such decisions, during training, are done in a differentiable fashion so that the entire network can be trained jointly by gradient descent. We find that training the model is difficult, but it can be greatly improved by adding random noise to the gradient. On a fairly complex synthetic table-comprehension dataset, traditional recurrent networks and attentional models perform poorly while Neural Programmer typically obtains nearly perfect accuracy.
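
The abstract's remark about adding random noise to the gradient can be sketched as follows. This is a minimal illustration on a toy quadratic loss, not the Neural Programmer training procedure; the function name, learning rate, and the decaying noise schedule are assumptions chosen only to show the mechanics.

```python
import random


def noisy_sgd_step(params, grads, lr, noise_std):
    """One SGD step with zero-mean Gaussian noise added to each gradient entry."""
    return [p - lr * (g + random.gauss(0.0, noise_std))
            for p, g in zip(params, grads)]


# Toy loss: sum((p - target)^2), whose gradient is 2 * (p - target).
target = [1.0, -2.0]
params = [0.0, 0.0]
random.seed(0)
for step in range(500):
    grads = [2.0 * (p - t) for p, t in zip(params, target)]
    noise_std = 0.5 / (1.0 + step)  # hypothetical schedule: noise decays over time
    params = noisy_sgd_step(params, grads, lr=0.1, noise_std=noise_std)
```

Because the noise is zero-mean and shrinks over time in this sketch, the parameters still converge close to the target while early steps are perturbed.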

artificial intelligence, machine learning, operation, (16 more...)

arXiv:1511.04834

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.70)
