Perceptrons
A Novel Representation of Neural Networks
Caterini, Anthony, Chang, Dong Eui
Deep Neural Networks (DNNs) have become very popular for prediction in many areas. Their strength is in representation with a high number of parameters that are commonly learned via gradient descent or similar optimization methods. However, the representation is non-standardized, and the gradient calculation methods are often performed using component-based approaches that break parameters down into scalar units, instead of considering the parameters as whole entities. In this work, these problems are addressed. Standard notation is used to represent DNNs in a compact framework. Gradients of DNN loss functions are calculated directly over the inner product space on which the parameters are defined. This framework is general and is applied to two common network types: the Multilayer Perceptron and the Deep Autoencoder.
The rapid evolution of open-source machine learning – Seldon -- Open Source Machine Learning
When millions of people across the world tuned in to watch DeepMind's machine beat the human Go world champion Lee Sedol, they also witnessed a historic victory for open-source. DeepMind used a scientific computing framework called Torch extensively in the development and execution of AlphaGo's neural networks. Torch was first released back in 2002 under a BSD open-source license with algorithms that are still commonly used by data scientists such as multi-layer perceptrons, support vector machines and K-nearest neighbours. Torch also supported ensembles -- a popular technique that combines the output of multiple algorithms, usually with a weighted average. It's not just open-source software that contributed to the growth of machine learning.
ashleylester/perceptron
You then need to import a training set of data. The training set is a CSV file (the format CSV is chosen so you can export from a spreadsheet). Where each delimited value is a floating-point number (or integer) that corresponds to the respective input and output units. Where path is your data file, the path being relative to the network.rb
Very Simple Classifier: a Concept Binary Classifier toInvestigate Features Based on Subsampling and Localility
Masera, Luca, Blanzieri, Enrico
We propose Very Simple Classifier (VSC) a novel method designed to incorporate the concepts of subsampling and locality in the definition of features to be used as the input of a perceptron. The rationale is that locality theoretically guarantees a bound on the generalization error. Each feature in VSC is a max-margin classifier built on randomly-selected pairs of samples. The locality in VSC is achieved by multiplying the value of the feature by a confidence measure that can be characterized in terms of the Chebichev inequality. The output of the layer is then fed in a output layer of neurons. The weights of the output layer are then determined by a regularized pseudoinverse. Extensive comparison of VSC against 9 competitors in the task of binary classification is carried out. Results on 22 benchmark datasets with fixed parameters show that VSC is competitive with the Multi Layer Perceptron (MLP) and outperforms the other competitors. An exploration of the parameter space shows VSC can outperform MLP.
Just how close are we to solving vision? – Piekniewski's blog
There is a lot of hype today about deep learning, a class of multilayer perceptrons with some 5-20 layers featuring convolutional and polling layers. Many blogs [1,2,3] discuss the structure of these networks, there is plenty code published so I won't get into much detail here. Several tech companies had invested a lot of money into this research and everyone has very high expectations on performance of these models. Indeed they've been winning image classification competitions for several years now and media are reporting superhuman performance on some visual classification tasks once in a while. Now just looking at the numbers from ImageNet competition is not really telling us much on how good these models really are, we can only maybe confirm that they are much better than whatever came before them (for that benchmark at least).
Perceptron like Algorithms for Online Learning to Rank
Chaudhuri, Sougata, Tewari, Ambuj
Perceptron is a classic online algorithm for learning a classification function. In this paper, we provide a novel extension of the perceptron algorithm to the learning to rank problem in information retrieval. We consider popular listwise performance measures such as Normalized Discounted Cumulative Gain (NDCG) and Average Precision (AP). A modern perspective on perceptron for classification is that it is simply an instance of online gradient descent (OGD), during mistake rounds, using the hinge loss function. Motivated by this interpretation, we propose a novel family of listwise, large margin ranking surrogates. Members of this family can be thought of as analogs of the hinge loss. Exploiting a certain self-bounding property of the proposed family, we provide a guarantee on the cumulative NDCG (or AP) induced loss incurred by our perceptron-like algorithm. We show that, if there exists a perfect oracle ranker which can correctly rank each instance in an online sequence of ranking data, with some margin, the cumulative loss of perceptron algorithm on that sequence is bounded by a constant, irrespective of the length of the sequence. This result is reminiscent of Novikoff's convergence theorem for the classification perceptron. Moreover, we prove a lower bound on the cumulative loss achievable by any deterministic algorithm, under the assumption of existence of perfect oracle ranker. The lower bound shows that our perceptron bound is not tight, and we propose another, \emph{purely online}, algorithm which achieves the lower bound. We provide empirical results on simulated and large commercial datasets to corroborate our theoretical results.
What is the Difference Between Deep Learning and "Regular" Machine Learning?
This time, Sebastian explains the difference between Deep Learning and "regular" machine learning. That's an interesting question, and I try to answer this is a very general way. The tl;dr version of this is: Deep learning is essentially a set of techniques that help we to parameterize deep neural network structures, neural networks with many, many layers and parameters. And if we are interested, a more concrete example: Let's start with multi-layer perceptrons (MLPs)… On a tangent: The term "perceptron" in MLPs may be a bit confusing since we don't really want only linear neurons in our network. Using MLPs, we want to learn complex functions to solve non-linear problems.
Personalized Prognostic Models for Oncology: A Machine Learning Approach
Dooling, David, Kim, Angela, McAneny, Barbara, Webster, Jennifer
We have applied a little-known data transformation to subsets of the Surveillance, Epidemiology, and End Results (SEER) publically available data of the National Cancer Institute (NCI) to make it suitable input to standard machine learning classifiers. This transformation properly treats the right-censored data in the SEER data and the resulting Random Forest and Multi-Layer Perceptron models predict full survival curves. Treating the 6, 12, and 60 months points of the resulting survival curves as 3 binary classifiers, the 18 resulting classifiers have AUC values ranging from .765 to .885. Further evidence that the models have generalized well from the training data is provided by the extremely high levels of agreement between the random forest and neural network models predictions on the 6, 12, and 60 month binary classifiers.