 Deep Learning


maciejkula/spotlight

#artificialintelligence

Large embedding layers are a performance problem for fitting models: even though the gradients are sparse (only a handful of user and item vectors need parameter updates in every minibatch), PyTorch updates the entire embedding layer at every backward pass. Computation time is then wasted on applying zero gradient steps to the whole embedding matrix. To alleviate this problem, we can use a smaller underlying embedding layer, and probabilistically hash users and items into that smaller space. With good hash functions, collisions should be rare, and we should observe fitting speedups without a decrease in accuracy. The implementation in Spotlight follows the RecSys 2017 paper "Getting deep recommenders fit: Bloom embeddings for sparse binary input/output networks."
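
The hashing idea described above can be illustrated with a minimal, framework-free sketch. It is not Spotlight's implementation: the names `hashed_indices` and `HashedEmbedding`, the use of SHA-256 as the hash, and the choice to sum the selected rows are all assumptions made for illustration.

```python
import hashlib
import random


def hashed_indices(item_id, num_buckets, num_hashes=2):
    """Map one id to `num_hashes` rows of a compressed table (hypothetical helper)."""
    indices = []
    for seed in range(num_hashes):
        # Salting the id with the seed simulates independent hash functions.
        digest = hashlib.sha256(f"{seed}:{item_id}".encode()).digest()
        indices.append(int.from_bytes(digest[:8], "big") % num_buckets)
    return indices


class HashedEmbedding:
    """Embedding table with far fewer rows than there are ids (illustrative only)."""

    def __init__(self, num_buckets, dim, num_hashes=2, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        self.num_hashes = num_hashes
        self.table = [
            [rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(num_buckets)
        ]

    def lookup(self, item_id):
        # Assumption: combine the hashed rows by summing them, so two ids
        # only share a vector if all of their hashes collide.
        rows = [
            self.table[i]
            for i in hashed_indices(item_id, self.num_buckets, self.num_hashes)
        ]
        return [sum(values) for values in zip(*rows)]


# One million possible ids share a table of only 1,000 rows.
embedding = HashedEmbedding(num_buckets=1000, dim=8)
vector = embedding.lookup(item_id=987_654)
```

Only the few rows selected by the hashes would receive gradient updates for a given id, and the table itself stays small, which is where the fitting speedup is expected to come from.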


What Are The Differences Between AI, Machine Learning, NLP, And Deep Learning?

#artificialintelligence

What is the difference between AI, Machine Learning, NLP, and Deep Learning? AI (Artificial intelligence) is a subfield of computer science that was created in the 1960s, and it was (and is) concerned with solving tasks that are easy for humans but hard for computers. In particular, a so-called Strong AI would be a system that can do anything a human can (perhaps excluding purely physical tasks). This is fairly generic and includes all kinds of tasks such as planning, moving around in the world, recognizing objects and sounds, speaking, translating, performing social or business transactions, creative work (making art or poetry), etc. NLP (Natural language processing) is simply the part of AI that has to do with language (usually written). Machine learning is concerned with one aspect of this: given some AI problem that can be described in discrete terms (e.g.


Top 5 Deep Learning and AI Stories - September 8 2017

#artificialintelligence

MASS GEN HOSPITAL GETS WORLD'S FIRST VOLTA AI SUPERCOMPUTERS: The research team at the Center for Clinical Data Science founded by Massachusetts General Hospital has just received the world's first AI supercomputer: "CCDS will build upon its groundbreaking research to develop a host of new training algorithms and bring the power of AI directly to doctors. The trained neural networks residing on DGX-1 systems in CCDS's data center are in a constant state of learning, continually ingesting countless medical images worldwide." UC BERKELEY'S SERGEY LEVINE EXPLAINS HOW DEEP LEARNING WILL UNLEASH ROBOTICS: Sergey Levine tries to answer the question, "How do you teach a robot to learn?" "One of the most important things is that you have to somehow communicate to the robot what it means to succeed… It's not easy, though." Teaching a robot to learn, instead of to just recognize images, is more difficult than building a deep learning system that can recognize images. The top stories: 1. Artificial Intelligence Advances to Improve Construction 2. How to Regulate Artificial Intelligence 3. Massachusetts General Hospital Gets World's First Volta AI Supercomputers 4. GE Venture Avitas Systems Uses AI to Help Robots Spot Defects 5. UC Berkeley's Sergey Levine Explains How Deep Learning Will Unleash Robotics


Classifying Unordered Feature Sets with Convolutional Deep Averaging Networks

arXiv.org Machine Learning

We propose convolutional deep averaging networks (CDANs) for classifying and learning feature representations of datasets containing instances with unordered features, where each feature is considered a tuple composed of one or more values. CDANs accept variable-size input and are invariant to permutations of the input's order. In addition, as a side-effect of the training process, CDANs learn discriminative, nonlinear embeddings of individual input elements into a space of chosen dimensionality. Contrary to their name, which is inspired by the work of Iyyer et al. [11], CDANs could perhaps be more accurately termed convolutional deep pooling networks as we also consider the effects of functions other than averaging such as taking element-wise maximums or sums. A. Contributions: We propose CDANs for classifying unordered feature sets. We show that a CDAN with nonlinear embeddings is competitive with and perhaps even superior to recurrent neural networks (RNNs) and known permutation-invariant architectures for classifying instances containing variable-size sets of unordered features. We also find that the type of pooling plays a significant role in determining the efficacy of the network, with sum-pooling clearly outperforming max- and average-pooling.
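
The permutation invariance described in this abstract comes from applying one shared embedding to every element and then pooling element-wise. The following is a minimal sketch of that general idea, not the authors' architecture: the function names, the single tanh layer, and the toy weights are assumptions for illustration.

```python
import math


def embed(feature, weights, bias):
    """Shared nonlinear embedding of one input element: tanh(W x + b)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, feature)) + b)
        for row, b in zip(weights, bias)
    ]


def pool(embedded, mode="sum"):
    """Element-wise pooling over a variable-size collection of embeddings."""
    columns = list(zip(*embedded))
    if mode == "sum":
        # math.fsum returns a correctly rounded sum, so the result does not
        # depend on the order of the elements.
        return [math.fsum(column) for column in columns]
    if mode == "max":
        return [max(column) for column in columns]
    if mode == "mean":
        return [math.fsum(column) / len(column) for column in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


def encode_set(features, weights, bias, mode="sum"):
    """Embed every element with the same weights, then pool to a fixed-size vector."""
    return pool([embed(f, weights, bias) for f in features], mode)


# Toy weights mapping 2-value tuples into a 3-dimensional embedding space.
weights = [[0.5, -1.0], [1.5, 0.25], [-0.75, 0.1]]
bias = [0.0, 0.1, -0.2]
items = [[1.0, 2.0], [0.5, -0.5], [3.0, 0.0]]

forward = encode_set(items, weights, bias, mode="sum")
shuffled = encode_set(list(reversed(items)), weights, bias, mode="sum")
```

Because the pooled vector has a fixed size regardless of how many elements are in the set, a standard classifier can be placed on top of it.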


Complex spectrogram enhancement by convolutional neural network with multi-metrics learning

arXiv.org Machine Learning

This paper aims to address two issues in current speech enhancement methods: 1) the difficulty of phase estimation; 2) a single objective function cannot consider multiple metrics simultaneously. To solve the first problem, we propose a novel convolutional neural network (CNN) model for complex spectrogram enhancement, namely estimating clean real and imaginary (RI) spectrograms from noisy ones. The reconstructed RI spectrograms are directly used to synthesize enhanced speech waveforms. In addition, since the log-power spectrogram (LPS) can be represented as a function of RI spectrograms, its reconstruction is also considered as another target. Thus a unified objective function, which combines these two targets (reconstruction of RI spectrograms and LPS), is equivalent to simultaneously optimizing two commonly used objective metrics: segmental signal-to-noise ratio (SSNR) and log-spectral distortion (LSD). Therefore, the learning process is called multi-metrics learning (MML). Experimental results confirm the effectiveness of the proposed CNN with RI spectrograms and MML in terms of improved standardized evaluation metrics on a speech enhancement task.
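
The relationship between the two targets can be sketched in a few lines. The log-power spectrogram follows from the real and imaginary parts as log(R^2 + I^2); everything else below is an assumption for illustration, including the mean-squared-error form of each term, the `lps_weight` parameter, and the function names. The paper's exact objective is not given in this abstract.

```python
import math


def log_power_spectrogram(real, imag, eps=1e-12):
    """LPS computed from real/imaginary spectrograms: log(R^2 + I^2)."""
    return [
        [math.log(r * r + i * i + eps) for r, i in zip(real_row, imag_row)]
        for real_row, imag_row in zip(real, imag)
    ]


def mse(a, b):
    """Mean squared error between two equally shaped 2-D lists."""
    diffs = [(x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def multi_target_loss(pred_real, pred_imag, clean_real, clean_imag, lps_weight=1.0):
    """Combine RI reconstruction error with LPS reconstruction error.

    Assumption: a weighted sum of two MSE terms; the weighting is hypothetical.
    """
    ri_loss = mse(pred_real, clean_real) + mse(pred_imag, clean_imag)
    lps_loss = mse(
        log_power_spectrogram(pred_real, pred_imag),
        log_power_spectrogram(clean_real, clean_imag),
    )
    return ri_loss + lps_weight * lps_loss


# Toy 2x2 "spectrograms" (frames x frequency bins).
clean_real = [[3.0, 1.0], [0.5, 2.0]]
clean_imag = [[4.0, 0.0], [0.5, 1.0]]
noisy_real = [[3.5, 1.0], [0.5, 2.0]]

loss = multi_target_loss(noisy_real, clean_imag, clean_real, clean_imag)
```

Since the LPS term is computed from the predicted RI values, a single network output feeds both error terms, which is the sense in which one objective covers two targets.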


Deep Residual Networks and Weight Initialization

arXiv.org Machine Learning

The Residual Network (ResNet) is the state-of-the-art architecture that enables successful training of very deep neural networks. It is also known that good weight initialization of a neural network avoids the problem of vanishing/exploding gradients. In this paper, simplified models of ResNets are analyzed. We argue that the quality of a ResNet is correlated with the fact that ResNets are relatively insensitive to the choice of initial weights. We also demonstrate how batch normalization improves backpropagation in deep ResNets without tuning the initial values of the weights.


tensorflow/agents

@machinelearnbot

This project provides optimized infrastructure for reinforcement learning. It extends the OpenAI gym interface to multiple parallel environments and allows agents to be implemented in TensorFlow and perform batched computation. As a starting point, we provide BatchPPO, an optimized implementation of Proximal Policy Optimization. The algorithm to use is defined in the configuration, and the pendulum configuration uses the included PPO implementation. Check out more pre-defined configurations in agents/scripts/configs.py.


Cognitive Toolkit Model Evaluation in UWP - Building Apps for Windows

#artificialintelligence

We are excited to share with you that Microsoft Cognitive Toolkit (CNTK) 2.1 has added support for model evaluation on UWP applications. This means you can harness the power of deep learning in your Windows apps delivered via the Windows Store! Read on to find out how you can infuse your apps with the power of AI. Cloud-connected devices can perform operations locally or delegate them to the cloud. The virtually unlimited compute power of the cloud makes it a good choice for running tasks that need significant compute power but don't require low latency.


Top /r/MachineLearning Posts, August: Andrew Ng is back at it; Reinforcement Learning makes a splash; Fixing your ANN

#artificialintelligence

No doubt you have heard about it by now. Above is the link to the Reddit discussion, while this is the link to the Coursera specialization. So much to study, so little time!! Testing our agents in games that are not specifically designed for AI research, and where humans play well, is crucial to benchmark agent performance. That is why we, along with our partner Blizzard Entertainment, are excited to announce the release of SC2LE, a set of tools that we hope will accelerate AI research in the real-time strategy game StarCraft II. This includes an API for machine learning which hooks into a given game, a dataset of anonymized game replays (increasing to 500K in the coming weeks), and an open source version of PySC2, DeepMind's toolset.


A practical guide to machine learning in business

@machinelearnbot

Machine learning is transforming business. But even as the technology advances, companies still struggle to take advantage of it, largely because they don't understand how to strategically implement machine learning in service of business goals. Hype hasn't helped, sowing confusion over what exactly machine learning is, how well it works and what it can do for your company. Here, we provide a clear-eyed look at what machine learning is and how it can be used today. Machine learning is a subset of artificial intelligence that enables systems to learn and predict outcomes without explicit programming.