 Deep Learning


Recurrent Latent Variable Networks for Session-Based Recommendation

arXiv.org Machine Learning

In this work, we attempt to ameliorate the impact of data sparsity in the context of session-based recommendation. Specifically, we seek to devise a machine learning mechanism capable of extracting subtle and complex underlying temporal dynamics in the observed session data, so as to inform the recommendation algorithm. To this end, we improve upon systems that utilize deep learning techniques with recurrently connected units; we do so by adopting concepts from the field of Bayesian statistics, namely variational inference. Our proposed approach consists in treating the network recurrent units as stochastic latent variables with a prior distribution imposed over them. On this basis, we proceed to infer corresponding posteriors; these can be used for prediction and recommendation generation, in a way that accounts for the uncertainty in the available sparse training data. To allow for our approach to easily scale to large real-world datasets, we perform inference under an approximate amortized variational inference (AVI) setup, whereby the learned posteriors are parameterized via (conventional) neural networks. We perform an extensive experimental evaluation of our approach using challenging benchmark datasets, and illustrate its superiority over existing state-of-the-art techniques.
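As a rough illustration of the amortized variational setup this abstract describes, the sketch below shows the two ingredients such models usually rely on: a reparameterized sample from a diagonal Gaussian posterior, and the closed-form KL divergence to a prior. The standard-normal prior, the diagonal Gaussian posterior, and the function names are assumptions made for illustration; the abstract does not specify them.

```python
import math
import random

def sample_latent(mu, log_var, rng):
    """Reparameterized draw z = mu + sigma * eps with eps ~ N(0, 1).

    mu and log_var would come from an inference network (assumed here).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

rng = random.Random(0)
mu = [0.3, -1.2]
log_var = [-100.0, -100.0]   # near-zero variance, so the sample stays at mu
z = sample_latent(mu, log_var, rng)

kl_zero = kl_to_standard_normal([0.0], [0.0])   # posterior equals the prior
kl_shifted = kl_to_standard_normal([1.0], [0.0])  # unit variance, mean 1
```

In a training loop the KL term would be added to the reconstruction (here, next-item prediction) loss to form the negative evidence lower bound.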


A Supervised Approach to Extractive Summarisation of Scientific Papers

arXiv.org Machine Learning

Automatic summarisation is a popular approach to reduce a document to its main arguments. Recent research in the area has focused on neural approaches to summarisation, which can be very data-hungry. However, few large datasets exist and none for the traditionally popular domain of scientific publications, which opens up challenging research avenues centered on encoding large, complex documents. In this paper, we introduce a new dataset for summarisation of computer science publications by exploiting a large resource of author provided summaries and show straightforward ways of extending it further. We develop models on the dataset making use of both neural sentence encoding and traditionally used summarisation features and show that models which encode sentences as well as their local and global context perform best, significantly outperforming well-established baseline methods.


Practical Gauss-Newton Optimisation for Deep Learning

arXiv.org Machine Learning

We present an efficient block-diagonal approximation to the Gauss-Newton matrix for feedforward neural networks. Our resulting algorithm is competitive against state-of-the-art first-order optimisation methods, with sometimes significant improvement in optimisation performance. Unlike first-order methods, for which hyperparameter tuning of the optimisation parameters is often a laborious process, our approach can provide good performance even when used with default settings. A side result of our work is that for piecewise linear transfer functions, the network objective function can have no differentiable local maxima, which may partially explain why such transfer functions facilitate effective optimisation.
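To make the phrase "block-diagonal approximation to the Gauss-Newton matrix" concrete, here is a minimal sketch. It assumes a squared-error loss (so the Gauss-Newton matrix reduces to J^T J for the output Jacobian J) and simply zeroes the entries that couple parameters from different blocks. The Jacobian values and block sizes are made up, and this is not the paper's actual recursion, only the generic idea.

```python
def gauss_newton(J):
    """G = J^T J, the Gauss-Newton matrix under a squared-error loss.

    J has one row per network output and one column per parameter.
    """
    n = len(J[0])
    return [[sum(row[i] * row[j] for row in J) for j in range(n)]
            for i in range(n)]

def block_diagonal(G, block_sizes):
    """Keep only entries whose row and column fall in the same block."""
    owner = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    n = len(G)
    return [[G[i][j] if owner[i] == owner[j] else 0.0 for j in range(n)]
            for i in range(n)]

# Hypothetical Jacobian: 2 outputs, 3 parameters (2 in "layer 1", 1 in "layer 2").
J = [[1.0, 2.0, 0.5],
     [0.0, 1.0, -1.0]]
G = gauss_newton(J)
G_block = block_diagonal(G, block_sizes=[2, 1])
```

Each diagonal block can then be inverted on its own, which is far cheaper than inverting the full matrix.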


Equivariance Through Parameter-Sharing

arXiv.org Machine Learning

We propose to study equivariance in deep neural networks through parameter symmetries. In particular, given a group $\mathcal{G}$ that acts discretely on the input and output of a standard neural network layer $\phi_{W}: \Re^{M} \to \Re^{N}$, we show that $\phi_{W}$ is equivariant with respect to $\mathcal{G}$-action iff $\mathcal{G}$ explains the symmetries of the network parameters $W$. Inspired by this observation, we then propose two parameter-sharing schemes to induce the desirable symmetry on $W$. Our procedures for tying the parameters achieve $\mathcal{G}$-equivariance and, under some conditions on the action of $\mathcal{G}$, they guarantee sensitivity to all other permutation groups outside $\mathcal{G}$.
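The link between tied parameters and equivariance can be checked numerically in the simplest case, where the group is the full permutation group acting on the coordinates. The sketch below is an illustration under that assumption, not one of the paper's two schemes: a weight matrix with one shared diagonal value and one shared off-diagonal value commutes with a permutation, while breaking a single tie destroys the property. All names and values are hypothetical.

```python
def tied_weights(n, a, b):
    """n x n matrix with two free parameters: a on the diagonal, b elsewhere."""
    return [[a if i == j else b for j in range(n)] for i in range(n)]

def layer(W, x):
    """Linear map followed by an elementwise ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W]

def permute(x, perm):
    return [x[p] for p in perm]

def equivariance_gap(W, x, perm):
    """Largest difference between layer(perm(x)) and perm(layer(x))."""
    lhs = layer(W, permute(x, perm))
    rhs = permute(layer(W, x), perm)
    return max(abs(u - v) for u, v in zip(lhs, rhs))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
perm = [1, 2, 3, 4, 0]                 # cyclic shift of the coordinates

W_tied = tied_weights(5, a=0.7, b=-0.2)
W_untied = [row[:] for row in W_tied]
W_untied[0][1] = 5.0                   # break one parameter tie

gap_tied = equivariance_gap(W_tied, x, perm)
gap_untied = equivariance_gap(W_untied, x, perm)
```

The tied layer gives a gap at the level of floating-point rounding, while the untied one does not.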


Variational Dropout Sparsifies Deep Neural Networks

arXiv.org Machine Learning

We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation to Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator and report first experimental results with individual dropout rates per weight. Interestingly, it leads to extremely sparse solutions both in fully-connected and convolutional layers. This effect is similar to the automatic relevance determination effect in empirical Bayes but has a number of advantages. We reduce the number of parameters by up to 280 times on LeNet architectures and up to 68 times on VGG-like networks with a negligible decrease in accuracy.
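A hedged sketch of how per-weight dropout rates can translate into sparsity: each weight is assumed to have a learned mean theta and noise standard deviation sigma, with alpha = sigma^2 / theta^2, and weights whose log alpha exceeds a threshold are pruned. The threshold of 3 and all numeric values below are assumptions for illustration and are not taken from the abstract.

```python
import math

def log_alpha(theta, sigma, eps=1e-8):
    """log(sigma^2 / theta^2); large values mean the weight is mostly noise."""
    return 2.0 * math.log(sigma + eps) - 2.0 * math.log(abs(theta) + eps)

def keep_mask(thetas, sigmas, threshold=3.0):
    """True for weights that survive pruning (assumed threshold)."""
    return [log_alpha(t, s) < threshold for t, s in zip(thetas, sigmas)]

# Made-up learned parameters for four weights.
thetas = [1.5, 0.01, -0.8, 0.002]
sigmas = [0.1, 0.5, 0.2, 0.3]

mask = keep_mask(thetas, sigmas)
pruned_weights = [t if keep else 0.0 for t, keep in zip(thetas, mask)]
compression = len(thetas) / sum(mask)   # parameter-count reduction factor
```

Here two of the four weights have noise that dwarfs their mean, so they are zeroed and the parameter count halves.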


Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis

arXiv.org Machine Learning

We provide a bridge between generative modeling in the Machine Learning community and simulated physical processes in High Energy Particle Physics by applying a novel Generative Adversarial Network (GAN) architecture to the production of jet images -- 2D representations of energy depositions from particles interacting with a calorimeter. We propose a simple architecture, the Location-Aware Generative Adversarial Network, that learns to produce realistic radiation patterns from simulated high energy particle collisions. The pixel intensities of GAN-generated images faithfully span over many orders of magnitude and exhibit the desired low-dimensional physical properties (i.e., jet mass, n-subjettiness, etc.). We shed light on limitations, and provide a novel empirical validation of image quality and validity of GAN-produced simulations of the natural world. This work provides a base for further explorations of GANs for use in faster simulation in High Energy Particle Physics.


5 Free Courses for Getting Started in Artificial Intelligence

@machinelearnbot

Don't know where or how to start learning? Learning more about artificial intelligence and its myriad overlapping and related fields and application domains does not require a PhD. Getting started can be intimidating, but don't be discouraged; check out this motivating and inspirational post, the author of which went from little understanding of machine learning to actively and effectively utilizing techniques in their job within a year. With more and more institutes of higher learning today making the decision to allow course materials to be openly accessible to non-students via the magic of the web, all of a sudden a pseudo-university course experience can be had by almost anyone, anywhere. Have a look at the following free course materials, all of which are appropriate for an introductory level of AI understanding, some of which also cover niche application concepts and material.


DeepMind Shows AI Has Trouble Seeing Homer Simpson's Actions

#artificialintelligence

Those findings from DeepMind, the pioneering London-based AI lab, also suggest the motive behind why DeepMind has created a huge new dataset of YouTube clips to help train AI on identifying human actions in videos that go well beyond "Mmm, doughnuts" or "Doh!" To help improve AI's capability to recognize human actions in motion, DeepMind has unveiled its Kinetics dataset consisting of 300,000 video clips and 400 human action classes. Past cases have shown how imbalanced training datasets can lead to deep learning algorithms performing worse at recognizing the faces of certain ethnic groups. This means that even the Kinetics action classes featuring mostly male participants--such as "playing poker" or "hammer throw"--did not seem to bias AI to the point where the deep learning algorithms had trouble recognizing female participants performing the same actions.


Nvidia's Next Big Thing: The HGX-1 AI Platform

#artificialintelligence

Over the past three months, Nvidia's (NASDAQ:NVDA) stock has been upgraded by several financial services firms including Goldman Sachs (NYSE:GS), Citigroup (NYSE:C), and Bernstein, while some others, such as Pacific Crest, have downgraded the stock. In an article published last December, I said Nvidia's stock could scale new highs if the company's revenue continued to grow at a CAGR of 20% plus in the foreseeable future. At that time, the stock reached a new high around $120, before correcting almost 20% afterwards. I also cautioned investors that the stock could go through spine-chilling volatility, and that's exactly what is happening now. The commentaries of several sell-side analyst firms amplified the volatility beyond what happens under normal circumstances.


The top 10 deep learning frameworks PACKT Books

#artificialintelligence

This is the age of artificial intelligence. Machine Learning and predictive analytics are now established and integral to just about every modern business, but artificial intelligence expands the scale of what's possible within those fields. It's what makes deep learning possible. Systems with greater ostensible autonomy and complexity can solve similarly complex problems. If Deep Learning systems are able to solve more complex problems and perform tasks of greater sophistication, building them is naturally a bigger challenge for data scientists and engineers.