 Perceptrons


Reinforcement Learning using Augmented Neural Networks

arXiv.org Machine Learning

Neural networks allow Q-learning reinforcement learning agents such as deep Q-networks (DQN) to approximate complex mappings from state spaces to value functions. However, this also brings drawbacks when compared to other function approximators such as tile coding or its generalisation, radial basis functions (RBF), because neural networks introduce instability as a side effect of their globalised updates. This instability does not vanish even in neural networks that have no hidden layers. In this paper, we show that simple modifications to the structure of the neural network can improve the stability of DQN learning when a multi-layer perceptron is used for function approximation.


Built-in Vulnerabilities to Imperceptible Adversarial Perturbations

arXiv.org Machine Learning

Designing models that are robust to small adversarial perturbations of their inputs has proven remarkably difficult. In this work we show that the reverse problem---making models more vulnerable---is surprisingly easy. After presenting some proofs of concept on MNIST, we introduce a generic tilting attack that injects vulnerabilities into the linear layers of pre-trained networks without affecting their performance on natural data. We illustrate this attack on a multilayer perceptron trained on SVHN and use it to design a stand-alone adversarial module which we call a steganogram decoder. Finally, we show on CIFAR-10 that a state-of-the-art network can be trained to misclassify images in the presence of imperceptible backdoor signals. These different results suggest that adversarial perturbations are not always informative of the true features used by a model.


Predicting Switching Graph Labelings with Cluster Specialists

arXiv.org Machine Learning

We address the problem of predicting the labeling of a graph in an online setting when the labeling is changing over time. We provide three mistake-bounded algorithms based on three paradigmatic methods for online algorithm design. The algorithm with the strongest guarantee is a quasi-Bayesian classifier which requires $\mathcal{O}(t \log n)$ time to predict at trial $t$ on an $n$-vertex graph. The fastest algorithm (with the weakest guarantee) is based on a specialist [10] approach and surprisingly only requires $\mathcal{O}(\log n)$ time on any trial $t$. We also give an algorithm based on a kernelized Perceptron with an intermediate per-trial time complexity of $\mathcal{O}(n)$ and a mistake bound which is not strictly comparable. Finally, we provide experiments on simulated data comparing these methods.


On the Perceptron's Compression

arXiv.org Machine Learning

We study and provide exposition to several phenomena that are related to the perceptron's compression. One theme concerns modifications of the perceptron algorithm that yield better guarantees on the margin of the hyperplane it outputs. These modifications can be useful in training neural networks as well, and we demonstrate them with some experimental data. In a second theme, we deduce conclusions from the perceptron's compression in various contexts.
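
As background for the abstract above, here is a minimal sketch of the classic perceptron update and of the sense in which its output can be "compressed": the final weight vector is determined entirely by the examples on which the algorithm made a mistake. The function names `perceptron` and `rebuild_from_mistakes` and the toy data are illustrative assumptions, not taken from the paper, and the margin-improving modifications the paper studies are not shown.

```python
def perceptron(samples, max_epochs=100):
    """Classic perceptron on (x, y) pairs, with y in {-1, +1}.

    Returns the weight vector and the indices of the examples that
    triggered an update (the "mistake" examples).
    """
    dim = len(samples[0][0])
    w = [0.0] * dim
    mistakes = []
    for _ in range(max_epochs):
        updated = False
        for i, (x, y) in enumerate(samples):
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified (or on the boundary)
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes.append(i)
                updated = True
        if not updated:  # a full pass without mistakes: converged
            break
    return w, mistakes


def rebuild_from_mistakes(samples, mistakes):
    """Recompute the weights using only the recorded mistake examples."""
    w = [0.0] * len(samples[0][0])
    for i in mistakes:
        x, y = samples[i]
        w = [wi + y * xi for wi, xi in zip(w, x)]
    return w


# Hypothetical toy data; the last coordinate is a constant bias feature.
samples = [
    ((1.0, 2.0, 1.0), 1),
    ((2.0, 1.5, 1.0), 1),
    ((-1.0, -1.5, 1.0), -1),
    ((-2.0, -0.5, 1.0), -1),
]
weights, mistake_indices = perceptron(samples)
```

Replaying only the mistake examples through `rebuild_from_mistakes` reproduces the same hyperplane, so the full training set is not needed to describe the output.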


Recurrent Relational Networks

arXiv.org Artificial Intelligence

This paper is concerned with learning to solve tasks that require a chain of interdependent steps of relational inference, like answering complex questions about the relationships between objects, or solving puzzles where the smaller elements of a solution mutually constrain each other. We introduce the recurrent relational network, a general purpose module that operates on a graph representation of objects. As a generalization of Santoro et al. [2017]'s relational network, it can augment any neural network model with the capacity to do many-step relational reasoning. We achieve state of the art results on the bAbI textual question-answering dataset with the recurrent relational network, consistently solving 20/20 tasks. As bAbI is not particularly challenging from a relational reasoning point of view, we introduce Pretty-CLEVR, a new diagnostic dataset for relational reasoning. In the Pretty-CLEVR set-up, we can vary the question to control for the number of relational reasoning steps that are required to obtain the answer. Using Pretty-CLEVR, we probe the limitations of multi-layer perceptrons, relational and recurrent relational networks. Finally, we show how recurrent relational networks can learn to solve Sudoku puzzles from supervised training data, a challenging task requiring upwards of 64 steps of relational reasoning. We achieve state-of-the-art results amongst comparable methods by solving 96.6% of the hardest Sudoku puzzles.


Machine Learning Optimization Using Genetic Algorithm

@machinelearnbot

In this course, you will learn what hyperparameters are, what a Genetic Algorithm is, and what hyperparameter optimization is. You will apply a Genetic Algorithm to optimize the performance of Support Vector Machines and Multilayer Perceptron Neural Networks. Hyperparameter optimization will be done on two datasets: a regression dataset for predicting the cooling and heating loads of buildings, and a classification dataset for classifying emails as spam or non-spam. The SVM and MLP will be applied to the datasets without optimization, and their results will be compared with those obtained after optimization. By the end of this course, you will have learnt how to code a Genetic Algorithm in Python and how to optimize your Machine Learning algorithms for maximal performance.
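
To give a rough idea of what a genetic hyperparameter search looks like, here is a minimal sketch using only the standard library. It is not the course's code: `toy_score` is a hypothetical stand-in for a real validation score (for example, cross-validated accuracy of an SVM), and the two searched values, `log_c` and `log_gamma`, along with their bounds, are assumptions chosen for illustration.

```python
import random


def toy_score(params):
    """Hypothetical stand-in for a validation score (higher is better).

    Peaks at log10(C) = 1 and log10(gamma) = -1.
    """
    log_c, log_gamma = params
    return -((log_c - 1.0) ** 2 + (log_gamma + 1.0) ** 2)


# Assumed search ranges for (log_c, log_gamma).
BOUNDS = [(-3.0, 3.0), (-4.0, 1.0)]


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   mutation_rate=0.2, seed=0):
    rng = random.Random(seed)

    def random_individual():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def crossover(a, b):
        # Uniform crossover: each gene comes from one of the two parents.
        return [rng.choice(pair) for pair in zip(a, b)]

    def mutate(individual):
        # Gaussian mutation, clipped back into the allowed range.
        out = []
        for value, (lo, hi) in zip(individual, bounds):
            if rng.random() < mutation_rate:
                value += rng.gauss(0.0, 0.1 * (hi - lo))
            out.append(min(hi, max(lo, value)))
        return out

    def tournament(population):
        # Pick the fittest of three randomly drawn individuals.
        return max(rng.sample(population, 3), key=fitness)

    population = [random_individual() for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        elite = max(population, key=fitness)
        history.append(fitness(elite))
        children = [elite]  # elitism: the best individual always survives
        while len(children) < pop_size:
            child = crossover(tournament(population), tournament(population))
            children.append(mutate(child))
        population = children
    return max(population, key=fitness), history


best_params, best_history = genetic_search(toy_score, BOUNDS)
```

In practice `toy_score` would be replaced by a function that trains the model with the candidate hyperparameters and returns its validation score, which makes each fitness evaluation far more expensive than in this sketch.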


[D] Hinton: Multi-layer neural networks should never been called MLPs • r/MachineLearning

@machinelearnbot

Not sure when the term Multi-Layer Perceptron was coined (in terms of multi-layer, fully-connected, feedforward neural net with non-linear activation functions and fit via backprop), but I assume it was in the 1980s around the time of Rumelhart et al.'s backprop paper. So in that context, Perceptron referred to the linear, binary classifier that uses some kind of step-function flavor to update the weights (as opposed to the delta rule or backprop). Or in short, I think around the time the term MLP was (re?)-coined, there was only one common "Rosenblatt Perceptron"


Expectation propagation: a probabilistic view of Deep Feed Forward Networks

arXiv.org Machine Learning

We present a statistical mechanics model of deep feed forward neural networks (FFN). Our energy-based approach naturally explains several known results and heuristics, providing a solid theoretical framework and new instruments for a systematic development of FFN. We infer that FFN can be understood as performing three basic steps: encoding, representation validation and propagation. We obtain a set of natural activations - such as sigmoid, tanh and ReLU - together with a state-of-the-art one, recently obtained by Ramachandran et al. [1] using an extensive search algorithm. We term this activation ESP (Expected Signal Propagation), explain its probabilistic meaning, and study the eigenvalue spectrum of the associated Hessian on classification tasks. We find that ESP allows for faster training and more consistent performance over a wide range of network architectures.
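
For reference, the activations named in the abstract can be written in a few lines. This sketch assumes that the activation attributed to Ramachandran et al. is the one commonly known as Swish, x · sigmoid(βx); the abstract does not spell out the formula, so the `swish` function and its `beta` parameter are assumptions, not the paper's definition of ESP.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    return math.tanh(x)


def relu(x):
    return max(0.0, x)


def swish(x, beta=1.0):
    # Assumed form: x * sigmoid(beta * x). Close to ReLU for large |x|,
    # but smooth and slightly negative for small negative inputs.
    return x * sigmoid(beta * x)


# Evaluate each activation at a few sample points.
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh),
                 ("relu", relu), ("swish", swish)]:
    print(name, [round(fn(v), 4) for v in (-2.0, 0.0, 2.0)])
```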


A Resampling Approach for Imbalanceness on Music Genre Classification using Spectrograms

AAAI Conferences

In real-world problems modeled as machine learning tasks, the datasets are typically unbalanced, meaning that some classes have many more instances than others. The Music Information Retrieval field is no different, and song datasets are usually very unbalanced. Considering this scenario, we propose a novel approach to the class imbalance problem applied to music genre classification. The proposed method uses vertically sliced spectrograms extracted from the songs' audio signals to apply oversampling and undersampling to the minority and majority classes, respectively. The experimental results for the F-Score measure showed that our approach was able to beat the best result of the Random Undersampling technique by 0.086, using MultiLayer Perceptrons. Moreover, compared to the baseline results, our approach significantly increased the individual results for all the minority classes.
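
The sketch below illustrates plain random resampling, the kind of baseline the abstract compares against: majority classes are randomly undersampled and minority classes randomly oversampled (with replacement) until every class has the same size. It does not implement the paper's spectrogram-slicing method; the function name `resample_to`, the `target` parameter, and the toy genre labels are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def resample_to(items, labels, target, seed=0):
    """Return a class-balanced copy of the data with `target` items per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for item, label in zip(items, labels):
        by_class[label].append(item)

    out_items, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Random undersampling: keep a subset without replacement.
            chosen = rng.sample(members, target)
        else:
            # Random oversampling: keep everything, then duplicate at random.
            extra = [rng.choice(members) for _ in range(target - len(members))]
            chosen = members + extra
        out_items.extend(chosen)
        out_labels.extend([label] * target)
    return out_items, out_labels


# Hypothetical unbalanced dataset: 50 "rock" clips, 8 "jazz", 3 "blues".
toy_labels = ["rock"] * 50 + ["jazz"] * 8 + ["blues"] * 3
toy_items = list(range(len(toy_labels)))
balanced_items, balanced_labels = resample_to(toy_items, toy_labels, target=10)
balanced_counts = Counter(balanced_labels)
```

Resampling of this kind should be applied to the training split only, so that duplicated items do not leak into the evaluation data.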


Constructive Preference Elicitation over Hybrid Combinatorial Spaces

arXiv.org Artificial Intelligence

Preference elicitation is the task of suggesting a highly preferred configuration to a decision maker. The preferences are typically learned by querying the user for choice feedback over pairs or sets of objects. In its constructive variant, new objects are synthesized "from scratch" by maximizing an estimate of the user utility over a combinatorial (possibly infinite) space of candidates. In the constructive setting, most existing elicitation techniques fail because they rely on exhaustive enumeration of the candidates. A previous solution explicitly designed for constructive tasks comes with no formal performance guarantees, and can be very expensive in (or inapplicable to) problems with non-Boolean attributes. We propose the Choice Perceptron, a Perceptron-like algorithm for learning user preferences from set-wise choice feedback over constructive domains and hybrid Boolean-numeric feature spaces. We provide a theoretical analysis of the attained regret that holds for a large class of query selection strategies, and devise a heuristic strategy that aims at optimizing the regret in practice. Finally, we demonstrate its effectiveness by empirical evaluation against existing competitors on constructive scenarios of increasing complexity.