"Many researchers … speculate that the information-processing abilities of biological neural systems must follow from highly parallel processes operating on representations that are distributed over many neurons. [Artificial neural networks] capture this kind of highly parallel computation based on distributed representations"
– from Machine Learning (Section 4.1.1; page 82) by Tom M. Mitchell, McGraw Hill Companies, Inc. (1997).
Even those new to IT have probably heard that everyone is "moving to the cloud." This transition from standard infrastructure is thanks in large part to Amazon Web Services. Currently, AWS offers "over 90 fully featured services for computing, storage, networking, analytics, application services, d...
We decompose the evidence lower bound to show the existence of a term measuring the total correlation between latent variables. We use this to motivate our $\beta$-TCVAE (Total Correlation Variational Autoencoder), a refinement of the state-of-the-art $\beta$-VAE objective for learning disentangled representations, requiring no additional hyperparameters during training. We further propose a principled classifier-free measure of disentanglement called the mutual information gap (MIG). We perform extensive quantitative and qualitative experiments, in both restricted and non-restricted settings, and show a strong relation between total correlation and disentanglement when the latent variable model is trained using our framework.
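For readers who want to see the shape of the decomposition this abstract refers to, here is a sketch under stated assumptions. The notation is not given in the excerpt and is assumed here: $n$ indexes training samples with uniform $p(n)$, $q(z|n)$ is the encoder, $q(z) = \sum_n q(z|n)\,p(n)$ is the aggregated posterior, and the prior factorizes as $p(z) = \prod_j p(z_j)$. Under those assumptions, the averaged KL term of the evidence lower bound splits into three parts:

$$
\mathbb{E}_{p(n)}\big[\mathrm{KL}\big(q(z|n)\,\|\,p(z)\big)\big]
= \underbrace{\mathrm{KL}\big(q(z,n)\,\|\,q(z)\,p(n)\big)}_{\text{mutual information between } z \text{ and } n}
+ \underbrace{\mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)}_{\text{total correlation}}
+ \underbrace{\sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)}_{\text{dimension-wise KL}}
$$

The middle term is the total correlation of the latent variables. Roughly, a $\beta$-TCVAE-style objective places the weight $\beta$ on that term alone rather than on the whole KL, which is consistent with the abstract's claim that no additional hyperparameters are needed.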
The receptive field is perhaps one of the most important concepts in Convolutional Neural Networks (CNNs), and it deserves more attention in the literature. All of the state-of-the-art object recognition methods design their model architectures around this idea. However, to the best of my knowledge, current...
Deeplearning4j's GitHub repository has many examples that cover its functionality. The Quick Start Guide shows you how to set up IntelliJ and clone the repository. This page provides an overview of some of those examples. Most of the examples make use of DataVec, a toolkit for preprocessing and cleaning data through normalization, standardization, search and replace, column shuffles and vectorization. Reading raw data and transforming it into a DataSet object for your Neural Network is often the first step toward training that network.
I recently joined Google to edit this blog, and to explore the value of machine learning and big data in an intuitive and hands-on manner. Over the past couple of years, I've been fortunate to work with engineers who design and tune ML algorithms, and I've even trained my own models on a couple of occasions. But since joining Google, I've been truly humbled by the techniques, code, and expertise of the software engineers, product managers, customer engineers, solutions architects, and developer advocates within Google Cloud. Not to mention the venerable researchers who sit on the DeepMind and Google AI teams. Some of the most capable minds in the world dedicate every working moment to machine learning: the art and science of enabling computers to make increasingly sophisticated analyses.
Deep Learning ultimately is about finding a minimum that generalizes well -- with bonus points for finding one fast and reliably. Our workhorse, stochastic gradient descent (SGD), is a 60-year-old algorithm (Robbins and Monro, 1951) that is as essential to the current generation of Deep Learning algorithms as back-propagation. Different optimization algorithms have been proposed in recent years, which use different equations to update a model's parameters. Adam (Kingma and Ba, 2015) was introduced in 2015 and is arguably still the most commonly used of these algorithms today. This indicates that, from the Machine Learning practitioner's perspective, best practices for optimization in Deep Learning have largely remained the same.
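To make "different equations to update a model's parameters" concrete, here is a minimal pure-Python sketch of a plain gradient step next to an Adam step. The function names (`sgd_step`, `adam_step`, `init_adam_state`, `grad_quadratic`) and the toy objective are illustrative assumptions, not code from the original post. The Adam defaults follow the values commonly quoted from Kingma and Ba. The gradient here is exact rather than a minibatch estimate.

```python
import math


def sgd_step(theta, grad, lr=0.1):
    """Gradient descent update: theta <- theta - lr * grad."""
    return [p - lr * g for p, g in zip(theta, grad)]


def init_adam_state(n):
    """Hypothetical helper: zeroed moment estimates and step counter."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n}


def adam_step(theta, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam update with bias-corrected first and second moment estimates."""
    state["t"] += 1
    t = state["t"]
    new_theta = []
    for i, (p, g) in enumerate(zip(theta, grad)):
        # Exponential moving averages of the gradient and squared gradient.
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction compensates for the zero initialisation.
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_theta


def grad_quadratic(theta):
    """Gradient of the toy objective f(theta) = sum(theta_i ** 2)."""
    return [2.0 * p for p in theta]


if __name__ == "__main__":
    theta_sgd = [1.0, -2.0]
    theta_adam = [1.0, -2.0]
    state = init_adam_state(2)
    for _ in range(200):
        theta_sgd = sgd_step(theta_sgd, grad_quadratic(theta_sgd), lr=0.1)
        theta_adam = adam_step(theta_adam, grad_quadratic(theta_adam), state, lr=0.05)
    print("SGD :", theta_sgd)
    print("Adam:", theta_adam)
```

The two rules differ only in how the raw gradient is rescaled before it is subtracted. The plain step uses one global learning rate, while Adam divides a smoothed gradient by a running estimate of its magnitude for each parameter.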
In this article, I'll talk about Generative Adversarial Networks, or GANs for short. GANs are one of the very few machine learning techniques that have performed well on generative tasks or, more broadly, unsupervised learning. In particular, they have given splendid performance on a variety of image generation tasks. Yann LeCun, one of the forefathers of deep learning, has called them "the best idea in machine learning in the last 10 years". Most importantly, the core conceptual ideas associated with a GAN are quite simple to understand (and in fact, you should have a good idea about them by the time you finish reading this article).
The increased availability of data and recent advancements in artificial intelligence present unprecedented opportunities in healthcare and major challenges for patients, developers, providers and regulators. Novel deep learning and transfer learning techniques are turning any data about a person into medical data, transforming simple facial pictures and videos into powerful sources of data for predictive analytics. Presently, patients do not have control over the access privileges to their medical records and remain unaware of the true value of the data they have. In this paper, we provide an overview of next-generation artificial intelligence and blockchain technologies and present innovative solutions that may be used to accelerate biomedical research and provide patients with new tools to control and profit from their personal data, as well as with incentives to undergo constant health monitoring. We introduce new concepts to appraise and evaluate personal records, including the combination-, time- and relationship-value of the data.
Machine learning solutions, in particular those based on deep learning methods, form an underpinning of the current revolution in "artificial intelligence" that has dominated popular press headlines and is having a significant influence on the wider tech agenda. In this talk I will give an overview of where we are now with machine learning solutions, and what challenges we face in both the near and far future. These include the practical application of existing algorithms in the face of the need to explain decision making, mechanisms for improving the quality and availability of data, and dealing with large unstructured datasets.
"I am still working on this, but now many more people are interested. Because the methods we've created on the way to this goal are now permeating the modern world--available to half of humankind, used billions of times per day." "As of August 2017, the five most valuable public companies in existence are Apple, Google, Microsoft, Facebook and Amazon. All of them are heavily using the deep-learning neural networks developed in my labs in Germany and Switzerland since the early 1990s--in particular, the Long Short-Term Memory network, or LSTM, described in several papers with my colleagues Sepp Hochreiter, Felix Gers, Alex Graves and other brilliant students and postdocs funded by European taxpayers. In the beginning, such an LSTM is stupid. But it can learn through experience.