 Deep Learning


You Should Know These 20 Technology Leaders Driving China's A.I. Revolution - TOPBOTS

#artificialintelligence

China's leading technology companies are on fire, heavily investing in artificial intelligence and building true global presences. McKinsey recently reported that academic and research institutions in the country publish more cited research papers than the US, UK, or any other global leader in AI, producing nearly 10,000 papers in 2015 alone. Backed by strong government mandates and billions of dollars of both private and public investments, China is challenging the US for the position of global AI leader. Fearful of competition, the US government is considering placing restrictions on Chinese investments in AI and technology in the United States. In many sectors, such as healthcare, China may already be ahead of America in applying AI to critical public issues.


[R] [1707.02968] Revisiting Unreasonable Effectiveness of Data in Deep Learning Era • r/MachineLearning

@machinelearnbot

The improvement chart looks nice. And they note the slope is probably steeper than it looks because they didn't train the models to convergence nor did a hyperparameter search. But on the other hand, that in a way answers their question. This paper used 50 K80 GPUs for 2 months and they still couldn't train a 101-layer Resnet model to convergence, much less do hyperparameter search or experiment with the 1000-layer Resnets or Densenets or attention or all the other fun things you can do with cutting edge CNNs. If a Google/CMU team with that much computational resources can't make good use of 300M images, why does anyone anywhere need that dataset?


The Era of AI Computing - Fedscoop

#artificialintelligence

At GTC, we unveiled Volta, our greatest generational leap since the invention of CUDA. It incorporates 21 billion transistors. It includes the fastest HBM memories from Samsung. Volta features a new numeric format and CUDA instruction that perform 4×4 matrix operations, an elemental deep learning operation, at super-high speeds.
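As a rough illustration of the 4×4 matrix operation mentioned above, the sketch below computes D = A·B + C on 4×4 matrices in plain Python. The multiply-accumulate form and the name `mma_4x4` are assumptions made for illustration; the text only says that Volta performs 4×4 matrix operations, and the hardware does this in a dedicated instruction rather than in nested loops.

```python
def mma_4x4(a, b, c):
    """Return D = A @ B + C for 4x4 matrices given as lists of rows.

    Illustrative only: a software model of a 4x4 multiply-accumulate,
    not the actual CUDA instruction.
    """
    n = 4
    for m in (a, b, c):
        if len(m) != n or any(len(row) != n for row in m):
            raise ValueError("all inputs must be 4x4")

    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = c[i][j]  # start from the accumulator matrix C
            for k in range(n):
                acc += a[i][k] * b[k][j]  # dot product of row i and column j
            d[i][j] = acc
    return d


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ones = [[1.0] * 4 for _ in range(4)]

# identity @ ones + ones: every entry is 1 + 1
result = mma_4x4(identity, ones, ones)
```

Deep learning layers reduce largely to large matrix products, which can be tiled into many small blocks of this kind; that is presumably why the text calls it an elemental operation.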


DeepCodec: Adaptive Sensing and Recovery via Deep Convolutional Neural Networks

arXiv.org Machine Learning

In this paper we develop a novel computational sensing framework for sensing and recovering structured signals. When trained on a set of representative signals, our framework learns to take undersampled measurements and recover signals from them using a deep convolutional neural network. In other words, it learns a transformation from the original signals to a near-optimal number of undersampled measurements and the inverse transformation from measurements to signals. This is in contrast to traditional compressive sensing (CS) systems that use random linear measurements and convex optimization or iterative algorithms for signal recovery. We compare our new framework with $\ell_1$-minimization from the phase transition point of view and demonstrate that it outperforms $\ell_1$-minimization in the regions of phase transition plot where $\ell_1$-minimization cannot recover the exact solution. In addition, we experimentally demonstrate how learning measurements enhances the overall recovery performance, speeds up training of recovery framework, and leads to having fewer parameters to learn.
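For context, the traditional baseline that the abstract contrasts against can be sketched as follows. This is a minimal pure-Python version of ISTA (iterative soft-thresholding), one common iterative algorithm for $\ell_1$-regularised recovery. It is not the DeepCodec method, and the unconstrained formulation, the function names, and the parameters `lam` and `step` are assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, y, lam, step, iters=100):
    """Approximately minimise 0.5 * ||A x - y||^2 + lam * ||x||_1.

    A is a list of rows (m x n), y has length m. The caller must pick
    step <= 1 / (largest eigenvalue of A^T A) for convergence.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # residual r = y - A x
        residual = [y[i] - sum(A[i][j] * x[j] for j in range(n))
                    for i in range(m)]
        # gradient step on the quadratic term: x + step * A^T r
        stepped = [x[j] + step * sum(A[i][j] * residual[i] for i in range(m))
                   for j in range(n)]
        # proximal step on the l1 term
        x = [soft_threshold(s, step * lam) for s in stepped]
    return x


# Toy check with an identity measurement matrix, where the minimiser
# is simply the soft-thresholded measurement vector.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -0.5, 1.5]
x_hat = ista(A, y, lam=1.0, step=1.0)
```

In a real compressive sensing setting `A` would be a random matrix with fewer rows than columns; the learned framework described above replaces both that matrix and this iterative solver with trained networks.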


Spectral Ergodicity in Deep Learning Architectures via Surrogate Random Matrices

arXiv.org Machine Learning

In this work a novel method to quantify spectral ergodicity for random matrices is presented. The new methodology combines approaches rooted in the metrics of Thirumalai-Mountain (TM) and Kullback-Leibler (KL) divergence. The method is applied to a general study of deep and recurrent neural networks via the analysis of random matrix ensembles mimicking typical weight matrices of those systems. In particular, we examine circular random matrix ensembles: circular unitary ensemble (CUE), circular orthogonal ensemble (COE), and circular symplectic ensemble (CSE). Eigenvalue spectra and spectral ergodicity are computed for those ensembles as a function of network size. It is observed that as the matrix size increases the level of spectral ergodicity of the ensemble rises, i.e., the eigenvalue spectrum obtained for a single realisation drawn at random from the ensemble is closer to the spectrum obtained by averaging over the whole ensemble. Based on previous results we conjecture that success of deep learning architectures is strongly bound to the concept of spectral ergodicity. The method to compute spectral ergodicity proposed in this work could be used to optimise the size and architecture of deep as well as recurrent neural networks.
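One ingredient named in the abstract, the Kullback-Leibler divergence, can be sketched for discrete distributions as below. The abstract does not give the paper's exact definition of spectral ergodicity, so this only shows how two normalised eigenvalue histograms (one from a single realisation, one averaged over the ensemble) might be compared; the function name and the toy histograms are hypothetical.

```python
import math


def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    p and q are sequences of probabilities over the same bins.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Hypothetical two-bin histograms of eigenvalue densities.
single_realisation = [0.5, 0.5]
ensemble_average = [0.25, 0.75]
distance = kl_divergence(single_realisation, ensemble_average)
```

A divergence near zero would indicate that a single matrix already looks like the ensemble average, which is the behaviour the abstract reports for growing matrix size.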


Deep Learning-Based Communication Over the Air

arXiv.org Machine Learning

End-to-end learning of communications systems is a fascinating novel concept that has so far only been validated by simulations for block-based transmissions. It allows learning of transmitter and receiver implementations as deep neural networks (NNs) that are optimized for an arbitrary differentiable end-to-end performance metric, e.g., block error rate (BLER). In this paper, we demonstrate that over-the-air transmissions are possible: We build, train, and run a complete communications system solely composed of NNs using unsynchronized off-the-shelf software-defined radios (SDRs) and open-source deep learning (DL) software libraries. We extend the existing ideas towards continuous data transmission which eases their current restriction to short block lengths but also entails the issue of receiver synchronization. We overcome this problem by introducing a frame synchronization module based on another NN. A comparison of the BLER performance of the "learned" system with that of a practical baseline shows competitive performance close to 1 dB, even without extensive hyperparameter tuning. We identify several practical challenges of training such a system over actual channels, in particular the missing channel gradient, and propose a two-step learning procedure based on the idea of transfer learning that circumvents this issue.


Google's DeepMind uses reinforcement learning to master parkour

#artificialintelligence

Google has taught its DeepMind AI to navigate a parkour course by using reinforcement learning. Reinforcement learning is the practice of rewarding desirable behaviour. The faster the AI could navigate the virtual parkour course, the greater the reward. Further incentives and penalties were added for various other metrics. "We train several simulated bodies on a diverse set of challenging terrains and obstacles, using a simple reward function based on forward progress," explains Nicolas Heess, a researcher on the project.
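The idea of a reward based on forward progress can be sketched in a few lines. This is a toy illustration, not DeepMind's reward function: the names, the use of velocity along a single axis, and the `fall_penalty` term are all assumptions standing in for the unspecified incentives and penalties mentioned above.

```python
def forward_progress_reward(prev_x, curr_x, dt, fell=False, fall_penalty=1.0):
    """Reward for one simulation step: forward velocity, minus a penalty
    if the body fell (a hypothetical example of an extra penalty)."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    reward = (curr_x - prev_x) / dt  # faster forward motion earns more
    if fell:
        reward -= fall_penalty
    return reward


def episode_return(positions, dt):
    """Sum of per-step rewards along a trajectory of x positions."""
    return sum(forward_progress_reward(a, b, dt)
               for a, b in zip(positions, positions[1:]))


step_reward = forward_progress_reward(0.0, 1.0, dt=0.5)
total = episode_return([0.0, 1.0, 3.0], dt=1.0)
```

With a reward this simple, behaviours such as jumping or crouching are not specified anywhere; the agent has to discover them because they lead to more forward progress.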


Big Data Analytics Methods: Modern Analytics Techniques for the 21st Century: The Data Scientist's Manual to Data Mining, Deep Learning & Natural Language Processing: Peter Ghavami: 9781530414833: Amazon.com: Books

@machinelearnbot

Big Data Analytics from Dr. Peter Ghavami is one of the most comprehensive and best manuals I've read so far. It's from a practitioner for us practitioners who take our advancement in big data analytics seriously. It's a manual that helped me solve urgent analytical problems and generate the impact needed to convince senior decision-makers to correct their course of action. Peter leads us through the chapters with simple yet compelling words, with the clear goal of helping us improve in practice. He teaches effectively by following the path that most studies are likely to take.


Google, IBM look to mimic the human brain

#artificialintelligence

Several years ago, there were reports that an IBM artificial intelligence (AI) project had mimicked the brain of a cat. Being the smartass that I am, I responded on Twitter with, "You mean it spends 18 hours a day in sleep mode?" That report was later debunked, but the effort to simulate the brain continues, using new types of processors far faster and more brain-like than your standard x86 processor. IBM and the U.S. Air Force have announced one such project, while Google has its own. Researchers from Google and the University of Toronto last month released an academic paper titled "One Model To Learn Them All," and they were pretty quiet about it.


Benchmarking TensorFlow on Cloud CPUs: Cheaper Deep Learning than Cloud GPUs

@machinelearnbot

I've been working on a few personal deep learning projects with Keras and TensorFlow. However, training models for deep learning with cloud services such as Amazon EC2 and Google Compute Engine isn't free, and as someone who is currently unemployed, I have to keep an eye on extraneous spending and be as cost-efficient as possible (please support my work on Patreon!). I tried deep learning on the cheaper CPU instances instead of GPU instances to save money, and to my surprise, my model training was only slightly slower. As a result, I took a deeper look at the pricing mechanisms of these two types of instances to see if CPUs are more useful for my needs. The pricing of GPU instances on Google Compute Engine starts at $0.745/hr (by attaching a $0.700/hr