Deep Learning
6 Deep Learning Techniques They Never Taught You In School
We all remember Maxim Gorky, Rabindranath Tagore, Ernest Hemingway, James Watt, Thomas Alva Edison, Leonardo da Vinci, and the Wright Brothers as some of the few people who left their mark on the world, but did you know that all of them were partially or wholly self-taught? They did not rely completely on the education that school had to offer. Such people are called autodidacts. Here is a list of such people; see how many you already know. The American writer, humorist, and entrepreneur Mark Twain put it beautifully: "I have never let my schooling interfere with my education."
[D] Implementing a Fuzzy Restricted Boltzmann Machine • r/MachineLearning
Hello, I suspect this isn't the right subreddit for this kind of thing, but MLQuestions is really quiet. I'm trying to implement an FRBM based on these papers: A Fuzzy Restricted Boltzmann Machine: Novel Learning Algorithms Based on Crisp Possibilistic Mean Value of Fuzzy Numbers (Transactions on Fuzzy Systems), pages 5 and 7, and Fuzzy Restricted Boltzmann Machine for the Enhancement of Deep Learning, page 6. I am looking for a recommendation for a good starting implementation of an RBM that can be modified in order to accomplish this. Is there a framework with an implementation that can be adapted, or any easy-to-read code in Python or MATLAB?
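As a possible starting point for the question above, here is a minimal sketch of a standard (crisp) binary RBM trained with one step of contrastive divergence (CD-1) in NumPy. It is not the fuzzy variant from the cited papers; the class name `RBM`, the method names, and the default learning rate are illustrative assumptions, chosen so that the parameter updates are easy to find and modify.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class RBM:
    """Plain binary RBM (hypothetical helper, not taken from the cited papers)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b)

    def cd1_step(self, v0, lr=0.1):
        """One CD-1 update on a batch v0 of shape (batch, n_visible)."""
        # Positive phase: hidden activations driven by the data.
        ph0 = self.hidden_probs(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        pv1 = self.visible_probs(h0)
        v1 = (self.rng.random(pv1.shape) < pv1).astype(float)
        ph1 = self.hidden_probs(v1)
        # Approximate log-likelihood gradient: data statistics minus model statistics.
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (ph0 - ph1).mean(axis=0)
        # Mean squared reconstruction error, useful only as a rough progress signal.
        return float(np.mean((v0 - pv1) ** 2))


# Toy usage: two repeated binary patterns.
toy = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 8, dtype=float)
model = RBM(n_visible=4, n_hidden=3)
for _ in range(100):
    model.cd1_step(toy)
```

Keeping the weights and biases as plain arrays is a deliberate choice in this sketch: a fuzzy extension would presumably need to replace them with fuzzy parameters, and that is easier when the update rule is written out explicitly rather than hidden inside a framework.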
Deconstructing Deep Meta Learning – Intuition Machine – Medium
This article explores in more detail the idea of Meta Learning that was previously introduced in a post "The Meta Model and Meta Meta Model of Deep Learning". In this post, I explore "Learning to Learn" as a Meta Learning approach. We have to be very careful to distinguish between Learning to Learn and Hyper Parameter Optimization (HPO). HPO, and more generally architecture search, differs from "learning to learn" in that HPO explores the space of architectures while meta-learning explores the space of learning algorithms. Meta-learning is all the rage in research these days.
Human and Computer Partnership: Challenging Artificial Intelligence Myths
Along with movies, the big hype in AI technology that we currently see in the media, perpetuated by marketing campaigns, is responsible for people's fears that AI will one day make them functionally redundant. For many companies, the attention generated by this hype is too appealing to avoid using the 'artificial intelligence' buzzword. We can see that the number of businesses that have added '.ai' to their URL is on the increase, showing that they are taking advantage of the hype. If we take a look at some of the language used by the biggest AI companies, it is easy to see where this hype comes from. For example, Google Deepmind describes AlphaGo as having 'overturned hundreds of years of received wisdom' for the game of Go and, apparently, since the match it has continued to 'surprise and amaze'.
Learning Machine Learning
Machine learning is a hot topic for developers, but where can one learn about how to use the technology? A lot depends on your current background and your long-term goals. I have already written about the basic differences between machine-learning techniques, but this was done at a relatively high level. Getting into the details can range from learning about machine-learning methodologies at an abstract level to examining deep-learning frameworks used to develop applications. Here, we'll take a more detailed look at some of the online resources available to you, and include links to websites with much more information about machine-learning classes, frameworks, and resources.
Predicting multicellular function through multi-layer tissue networks
Zitnik, Marinka, Leskovec, Jure
Motivation: Understanding functions of proteins in specific human tissues is essential for insights into disease diagnostics and therapeutics, yet prediction of tissue-specific cellular function remains a critical challenge for biomedicine. Results: Here we present OhmNet, a hierarchy-aware unsupervised node feature learning approach for multi-layer networks. We build a multi-layer network, where each layer represents molecular interactions in a different human tissue. OhmNet then automatically learns a mapping of proteins, represented as nodes, to a neural embedding based low-dimensional space of features. OhmNet encourages sharing of similar features among proteins with similar network neighborhoods and among proteins activated in similar tissues. The algorithm generalizes prior work, which generally ignores relationships between tissues, by modeling tissue organization with a rich multiscale tissue hierarchy. We use OhmNet to study multicellular function in a multi-layer protein interaction network of 107 human tissues. In 48 tissues with known tissue-specific cellular functions, OhmNet provides more accurate predictions of cellular function than alternative approaches, and also generates more accurate hypotheses about tissue-specific protein actions. We show that taking into account the tissue hierarchy leads to improved predictive power. Remarkably, we also demonstrate that it is possible to leverage the tissue hierarchy in order to effectively transfer cellular functions to a functionally uncharacterized tissue. Overall, OhmNet moves from flat networks to multiscale models able to predict a range of phenotypes spanning cellular subsystems
GLSR-VAE: Geodesic Latent Space Regularization for Variational AutoEncoder Architectures
Hadjeres, Gaëtan, Nielsen, Frank, Pachet, François
VAEs (Variational AutoEncoders) have proved to be powerful in the context of density modeling and have been used in a variety of contexts for creative purposes. In many settings, the data we model possesses continuous attributes that we would like to take into account at generation time. We propose in this paper GLSR-VAE, a Geodesic Latent Space Regularization for the Variational AutoEncoder architecture and its generalizations, which allows fine control over the embedding of the data into the latent space. When augmenting the VAE loss with this regularization, changes in the learned latent space reflect changes of the attributes of the data. This deeper understanding of the VAE latent space structure offers the possibility to modulate the attributes of the generated data in a continuous way. We demonstrate its efficiency on a monophonic music generation task where we manage to generate variations of discrete sequences in an intended and playful way.
f-GANs in an Information Geometric Nutshell
Nock, Richard, Cranko, Zac, Menon, Aditya Krishna, Qu, Lizhen, Williamson, Robert C.
Nowozin et al. showed last year how to extend the GAN principle to all $f$-divergences. The approach is elegant but falls short of a full description of the supervised game, and says little about the key player, the generator: for example, what does the generator actually converge to if solving the GAN game means convergence in some space of parameters? How does that provide hints on the generator's design and compare to the flourishing but almost exclusively experimental literature on the subject? In this paper, we unveil a broad class of distributions for which such convergence happens (namely, deformed exponential families, a wide superset of exponential families) and show tight connections with the three other key GAN parameters: loss, game and architecture. In particular, we show that current deep architectures are able to factorize a very large number of such densities using an especially compact design, hence displaying the power of deep architectures and their concinnity in the $f$-GAN game. This result holds given a sufficient condition on activation functions, which turns out to be satisfied by popular choices. The key to our results is a variational generalization of an old theorem that relates the KL divergence between regular exponential families and divergences between their natural parameters. We complete this picture with additional results and experimental insights on how these results may be used to ground further improvements of GAN architectures, via (i) a principled design of the activation functions in the generator and (ii) an explicit integration of proper composite losses' link function in the discriminator.
Smart Content Recognition from Images Using a Mixture of Convolutional Neural Networks
Connie, Tee, Al-Shabi, Mundher, Goh, Michael
With the rapid development of the Internet, the volume of web content has become huge. Most websites are publicly available, and anyone can access their content from anywhere, such as the workplace, home, and even schools. Nevertheless, not all web content is appropriate for all users, especially children. One example is pornographic images, which should be restricted to certain age groups. Moreover, these images are not safe for work (NSFW): employees should not be seen accessing such content at work. Recently, convolutional neural networks have been successfully applied to many computer vision problems. Inspired by these successes, we propose a mixture of convolutional neural networks for adult content recognition. Unlike other works, our method is formulated as a weighted sum of multiple deep neural network models. The weights of the CNN models are obtained by solving a linear regression problem using Ordinary Least Squares (OLS). Experimental results demonstrate that the proposed model outperforms both a single CNN model and the average sum of CNN models in adult content recognition.
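The weighted-sum idea in this abstract can be illustrated with a short NumPy sketch. It assumes each CNN has already produced one score per image and treats the task as binary; the function names, the absence of an intercept term, and the 0.5 decision threshold are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def fit_mixture_weights(scores, labels):
    """Fit one weight per model with Ordinary Least Squares.

    scores: (n_samples, n_models) array, column j holds model j's score per image.
    labels: (n_samples,) array of 0/1 ground-truth labels.
    """
    # Solve min_w ||scores @ w - labels||^2 (no intercept, an assumption here).
    weights, *_ = np.linalg.lstsq(scores, labels, rcond=None)
    return weights


def mixture_predict(scores, weights, threshold=0.5):
    """Weighted sum of model scores followed by a threshold (threshold is assumed)."""
    return (scores @ weights >= threshold).astype(int)


# Toy usage with three simulated "models" of different noise levels.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200).astype(float)
model_scores = np.column_stack(
    [np.clip(y + rng.normal(0.0, s, size=200), 0.0, 1.0) for s in (0.2, 0.4, 0.8)]
)
w = fit_mixture_weights(model_scores, y)
predictions = mixture_predict(model_scores, w)
```

In this setup the plain average corresponds to fixing every weight at 1 / n_models, so the OLS fit can only match or lower the squared error on the data used to fit it; whether that carries over to held-out images is an empirical question.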
Bilateral Multi-Perspective Matching for Natural Language Sentences
Wang, Zhiguo, Hamza, Wael, Florian, Radu
Natural language sentence matching is a fundamental technology for a variety of tasks. Previous approaches either match sentences from a single direction or apply matching at only a single granularity (word-by-word or sentence-by-sentence). In this work, we propose a bilateral multi-perspective matching (BiMPM) model under the "matching-aggregation" framework. Given two sentences $P$ and $Q$, our model first encodes them with a BiLSTM encoder. Next, we match the two encoded sentences in two directions $P \rightarrow Q$ and $P \leftarrow Q$. In each matching direction, each time step of one sentence is matched against all time steps of the other sentence from multiple perspectives. Then, another BiLSTM layer is utilized to aggregate the matching results into a fixed-length matching vector. Finally, based on the matching vector, the decision is made through a fully connected layer. We evaluate our model on three tasks: paraphrase identification, natural language inference and answer sentence selection. Experimental results on standard benchmark datasets show that our model achieves state-of-the-art performance on all tasks.
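The matching step described above can be sketched roughly in NumPy. The abstract does not spell out the matching function, so this sketch assumes a weighted cosine similarity, where perspective $k$ rescales both vectors by a weight row $W_k$ before comparing them, and a max-pooling over the time steps of the other sentence. The function names and the random stand-ins for BiLSTM outputs are hypothetical.

```python
import numpy as np


def multi_perspective_cosine(v1, v2, W, eps=1e-8):
    """Compare two d-dimensional vectors under l perspectives.

    W has shape (l, d); perspective k computes cosine(W[k] * v1, W[k] * v2).
    Returns a vector of l similarity values.
    """
    a = W * v1  # (l, d): v1 rescaled by every perspective
    b = W * v2
    numerator = (a * b).sum(axis=1)
    denominator = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return numerator / np.maximum(denominator, eps)


def maxpool_matching(P, Q, W):
    """Match each time step of P against every time step of Q.

    P: (T_p, d), Q: (T_q, d). For each step of P, keep the element-wise maximum
    over all steps of Q, giving a (T_p, l) matrix of matching vectors.
    """
    rows = []
    for p in P:
        sims = np.stack([multi_perspective_cosine(p, q, W) for q in Q])  # (T_q, l)
        rows.append(sims.max(axis=0))
    return np.stack(rows)


# Toy usage: random vectors standing in for BiLSTM encoder outputs.
rng = np.random.default_rng(0)
P_enc = rng.standard_normal((5, 8))    # sentence P: 5 time steps, 8 dims
Q_enc = rng.standard_normal((7, 8))    # sentence Q: 7 time steps, 8 dims
W_persp = rng.standard_normal((4, 8))  # 4 perspectives
p_to_q = maxpool_matching(P_enc, Q_enc, W_persp)  # shape (5, 4)
q_to_p = maxpool_matching(Q_enc, P_enc, W_persp)  # shape (7, 4)
```

Calling the function once in each direction mirrors the bilateral part of the model; in the described architecture the two resulting sequences would then go to the aggregation BiLSTM, which is left out of this sketch.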