 Perceptrons


[D] How important is it to adjust the bias in a perceptron learning algorithm? • r/MachineLearning

@machinelearnbot

The thing to realize is that for a two-dimensional problem, the decision boundary is a line, and a line can be uniquely identified by two parameters (e.g. its slope and its y-intercept, or equivalently x-intercept). If you set up the perceptron as having two weights and also a tunable bias, that's three parameters, more than you actually need. However, if you fixed the bias to zero, you'd lose the ability to represent lines that don't pass through the origin. But as long as bias is fixed to a non-zero constant (1 is as good as any other, modulo taking a different number of steps to converge depending on your initialization) your representational power is maintained.
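The zero-bias case can be checked with a small experiment. The sketch below is not from the thread: the four points, the separating line x + y = 3, and all function names are hypothetical choices made for illustration. It trains the classic perceptron rule twice, once with a tunable bias and once with the bias held at zero. Because (1, 1) and (2, 2) lie on the same ray from the origin but carry different labels, no line through the origin can separate them.

```python
# Hypothetical toy data: the classes are split by the line x + y = 3,
# which does not pass through the origin.
SAMPLES = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
LABELS = [-1, -1, 1, 1]


def predict(w, b, x):
    """Return +1 if the activation is strictly positive, else -1."""
    activation = w[0] * x[0] + w[1] * x[1] + b
    return 1 if activation > 0 else -1


def train_perceptron(samples, labels, learn_bias, max_epochs=1000):
    """Perceptron rule; the bias starts at 0 and is updated only if learn_bias is True."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            if predict(w, b, x) != y:
                # Move the boundary toward classifying this point correctly.
                w[0] += y * x[0]
                w[1] += y * x[1]
                if learn_bias:
                    b += y
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return w, b


def count_errors(w, b, samples, labels):
    """Number of misclassified training points."""
    return sum(predict(w, b, x) != y for x, y in zip(samples, labels))


w_free, b_free = train_perceptron(SAMPLES, LABELS, learn_bias=True)
w_zero, b_zero = train_perceptron(SAMPLES, LABELS, learn_bias=False)
print("errors with tunable bias:", count_errors(w_free, b_free, SAMPLES, LABELS))
print("errors with zero bias:   ", count_errors(w_zero, b_zero, SAMPLES, LABELS))
```

With the tunable bias the data are linearly separable, so the perceptron rule stops making mistakes after finitely many updates. With the bias held at zero, at least one point stays misclassified no matter how long training runs.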


Transparent Model Distillation

arXiv.org Machine Learning

Model distillation was originally designed to distill knowledge from a large, complex teacher model to a faster, simpler student model without significant loss in prediction accuracy. We investigate model distillation for another goal, transparency: whether fully-connected neural networks can be distilled into models that are transparent or interpretable in some sense. Our teacher models are multilayer perceptrons, and we try two types of student models: (1) tree-based generalized additive models (GA2Ms), a type of boosted, short tree, and (2) gradient boosted trees (GBTs). More transparent student models are forthcoming. Our results are not yet conclusive. GA2Ms show some promise for distilling binary classification teachers, but not yet regression teachers. GBTs are not "directly" interpretable but may be promising for regression teachers. GA2M models may provide a computationally viable alternative to additive decomposition methods for global function approximation.


Is Learning Rate Useful in Artificial Neural Networks?

@machinelearnbot

This article will help you understand why we need the learning rate and whether it is useful for training an artificial neural network. Using very simple Python code for a single-layer perceptron, the learning rate value is changed to illustrate its effect. The learning rate is an obstacle for newcomers to artificial neural networks. I have been asked many times about the effect of the learning rate on the training of artificial neural networks (ANNs). Why do we use a learning rate?
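As a rough illustration of the question, and not the article's own code, the sketch below trains a single linear neuron with one weight on one hypothetical training pair (input 2.0, target 4.0) by gradient descent on the squared error. The function name, the data, and the three learning rates are all assumptions chosen for the demonstration.

```python
def train_linear_neuron(learning_rate, steps=20):
    """Fit output = w * x to a single (x, target) pair; return the final absolute error."""
    x, target = 2.0, 4.0  # hypothetical training pair; the ideal weight is 2.0
    w = 0.0
    for _ in range(steps):
        error = w * x - target
        gradient = 2.0 * error * x  # derivative of (w * x - target) ** 2 w.r.t. w
        w -= learning_rate * gradient
    return abs(w * x - target)


for lr in (0.01, 0.1, 0.3):
    print(f"learning rate {lr}: final error {train_linear_neuron(lr):.6g}")
```

In this setup each update multiplies the error by (1 - 8 * learning_rate). A rate of 0.01 shrinks the error slowly, 0.1 shrinks it quickly, and 0.3 makes the factor larger than 1 in magnitude, so the error grows instead of shrinking.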


Benchmarking Decoupled Neural Interfaces with Synthetic Gradients

arXiv.org Machine Learning

Artificial Neural Networks are a particular class of learning systems modeled after biological neural functions with an interesting penchant for Hebbian learning, that is, "neurons that fire together, wire together". However, unlike their natural counterparts, artificial neural networks have a close and stringent coupling between the modules of neurons in the network. This coupling or locking imposes upon the network a strict and inflexible structure that prevents layers in the network from updating their weights until a full feed-forward and backward pass has occurred. Such a constraint, though it may have sufficed for a while, is no longer feasible in the era of very-large-scale machine learning, coupled with the increased desire for parallelization of the learning process across multiple computing infrastructures. To solve this problem, synthetic gradients (SG) with decoupled neural interfaces (DNI) are introduced as a viable alternative to the backpropagation algorithm. This paper performs a benchmark to compare the speed and accuracy of SG-DNI against a standard neural interface using a multilayer perceptron (MLP). SG-DNI shows good promise, in that it not only captures the learning problem, it is also over 3-fold faster due to its asynchronous learning capabilities.


Revisiting Perceptron: Efficient and Label-Optimal Learning of Halfspaces

Neural Information Processing Systems

It has been a long-standing problem to efficiently learn a halfspace using as few labels as possible in the presence of noise. In this work, we propose an efficient Perceptron-based algorithm for actively learning homogeneous halfspaces under the uniform distribution over the unit sphere. Under the bounded noise condition [MN06], where each label is flipped with probability at most $\eta < \frac 1 2$, our algorithm achieves a near-optimal label complexity of $\tilde{O}\left(\frac{d}{(1-2\eta)^2}\ln\frac{1}{\epsilon}\right)$ in time $\tilde{O}\left(\frac{d^2}{\epsilon(1-2\eta)^3}\right)$. Under the adversarial noise condition [ABL14, KLS09, KKMS08], where at most a $\tilde \Omega(\epsilon)$ fraction of labels can be flipped, our algorithm achieves a near-optimal label complexity of $\tilde{O}\left(d\ln\frac{1}{\epsilon}\right)$ in time $\tilde{O}\left(\frac{d^2}{\epsilon}\right)$. Furthermore, we show that our active learning algorithm can be converted to an efficient passive learning algorithm that has near-optimal sample complexities with respect to $\epsilon$ and $d$.


An Artificial Neural Network-based Stock Trading System Using Technical Analysis and Big Data Framework

arXiv.org Machine Learning

In this paper, a neural network-based stock price prediction and trading system using technical analysis indicators is presented. The model first converts the financial time series data into a series of buy-sell-hold trigger signals using the most commonly preferred technical analysis indicators. Then, a Multilayer Perceptron (MLP) artificial neural network (ANN) model is trained in the learning stage on the daily stock prices between 1997 and 2007 for all of the Dow30 stocks. The Apache Spark big data framework is used in the training stage. The trained model is then tested with data from 2007 to 2017. The results indicate that by choosing the most appropriate technical indicators, the neural network model can achieve results comparable to the Buy and Hold strategy in most of the cases. Furthermore, fine-tuning the technical indicators and/or optimization strategy can enhance the overall trading performance.


Tensor Regression Networks with various Low-Rank Tensor Approximations

arXiv.org Machine Learning

Tensor regression networks achieve a high rate of compression of model parameters in multilayer perceptrons (MLPs) while having only a slight impact on performance. The tensor regression layer, which replaces the flattening operation of a traditional MLP, is subject to low-rank constraints. We investigate tensor regression networks using various low-rank tensor approximations, aiming to leverage the multi-modal structure of high-dimensional data by enforcing efficient low-rank constraints. We provide a theoretical analysis giving insights on the choice of the rank parameters. We evaluated the performance of the proposed model against state-of-the-art deep convolutional models. For the CIFAR-10 dataset, we achieved a compression rate of 0.018 while sacrificing less than 1% of accuracy.


Autism Classification Using Brain Functional Connectivity Dynamics and Machine Learning

arXiv.org Machine Learning

The goal of the present study is to identify autism using machine learning techniques and resting-state brain imaging data, leveraging the temporal variability of the functional connections (FC) as the only information. We estimated and compared the FC variability across brain regions between typical, healthy subjects and the autistic population by analyzing brain imaging data from a world-wide multi-site database known as ABIDE (Autism Brain Imaging Data Exchange). Our analysis revealed that patients diagnosed with autism spectrum disorder (ASD) show increased FC variability in several brain regions that are associated with low FC variability in the typical brain. We then used the enhanced FC variability of brain regions as features for training machine learning models for ASD classification and achieved 65% accuracy in identification of ASD versus control subjects within the dataset. We also used node strength, estimated from the number of functional connections per node averaged over the whole scan, as features for ASD classification. The results reveal that the dynamic FC measures outperform or are comparable with the static FC measures in predicting ASD.


Comparison of Deepnet & Neuralnet

@machinelearnbot

Based on two R packages for neural networks. In this article, I compare two available R packages for using neural networks to model data: neuralnet and deepnet. Through the comparisons I highlight various challenges in finding good hyperparameter values. I show that some needed hyperparameters differ when using these two packages, even with the same underlying algorithmic approach. Both packages can be obtained via the R CRAN repository (see links at the end). I will focus on a simple time series example, composed of two predictors, and on the performance of the packages in predicting future data after being trained on past data using a simple 5-neuron network. Note that most of what you read about deep learning with neural networks concerns "classification" problems (more later); nonetheless, such networks have promise for predicting continuous data, including time series. Briefly, a neural network (also called a multilayer perceptron, etc.) is a connected network of neurons as shown here. An example neural network (generated using neuralnet). Note that except for the input layer (where the predictor values are fed in), the inputs to a neuron have weights specific to that neuron, so the output of a neuron is "re-used" as input to all neurons in the next layer, with unique weights. Before moving on to a brief description of how neural networks compute predictions, it is worth reflecting on the number of independent parameters in neural network models as compared to, for example, linear regression.
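The parameter comparison can be made concrete with a little arithmetic. The sketch below is not taken from the article: it assumes the 5-neuron network is a single hidden layer of five neurons fed by the two predictors, with one output neuron and a bias on every neuron. The function name is hypothetical.

```python
def count_parameters(layer_sizes):
    """Count weights plus one bias per neuron for a fully connected network.

    layer_sizes lists the number of units per layer, inputs first.
    """
    return sum(
        (n_in + 1) * n_out  # n_in weights and 1 bias for each of n_out neurons
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Assumed architecture: 2 predictors -> 5 hidden neurons -> 1 output.
print(count_parameters([2, 5, 1]))  # → 21
# Linear regression on the same 2 predictors: 2 slopes plus an intercept.
print(count_parameters([2, 1]))  # → 3
```

Under these assumptions the small network has 21 adjustable parameters, (2 + 1) * 5 + (5 + 1) * 1, against 3 for linear regression on the same two predictors.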


[P] My implementations of neural algorithms - multilayer perceptron, neural gas, Kohonen SOM • r/MachineLearning

@machinelearnbot

Src: github. There are dependencies such as OpenCV and Apache Spark, but they are optional. I used OpenCV to perform feature extraction with HOG, which speeds up the learning process, and Apache Spark to compare results. The utility classes support computing additional data such as a confusion matrix and add methods for working with some well-known datasets like Iris or MNIST. As it doesn't require any external libraries, maybe someone will find it helpful when studying the basics of machine learning.