LeCun, Yann
Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities
Vatanen, Tommi, Raiko, Tapani, Valpola, Harri, LeCun, Yann
Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. We continue the work by firstly introducing a third transformation to normalize the scale of the outputs of each hidden neuron, and secondly by analyzing the connections to second order optimization methods. We show that the transformations make a simple stochastic gradient behave closer to second-order optimization methods and thus speed up learning. This is shown both in theory and with experiments. The experiments on the third transformation show that while it further increases the speed of learning, it can also hurt performance by converging to a worse local optimum, where both the inputs and outputs of many hidden neurons are close to zero.
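The three transformations described in this abstract can be illustrated with a minimal sketch for a single hidden unit. Several things here are assumptions, not taken from the abstract: the tanh nonlinearity, the names `transform_unit`, `alpha`, `beta` and `gamma`, and the choice of unit output standard deviation as the scale normalization (the paper's exact normalization may differ). The linear term `alpha * z` is the part that separate shortcut connections would have to model instead.

```python
import math

def transform_unit(z, normalize_scale=False):
    """Transform tanh outputs of one hidden unit over a batch of pre-activations z.

    Returns the transformed outputs, their slopes, and (alpha, beta, gamma).
    """
    n = len(z)
    f = [math.tanh(v) for v in z]
    slope = [1.0 - y * y for y in f]          # derivative of tanh at each z

    # Transformation 1 and 2: g(z) = f(z) + alpha * z + beta
    alpha = -sum(slope) / n                   # makes the average slope zero
    beta = -sum(y + alpha * v for y, v in zip(f, z)) / n  # makes the average output zero
    g = [y + alpha * v + beta for y, v in zip(f, z)]
    g_slope = [s + alpha for s in slope]

    # Transformation 3 (assumed form): rescale to unit standard deviation.
    gamma = 1.0
    if normalize_scale:
        std = math.sqrt(sum(y * y for y in g) / n)  # g already has zero mean
        if std > 0.0:
            gamma = 1.0 / std

    out = [gamma * y for y in g]
    out_slope = [gamma * s for s in g_slope]
    return out, out_slope, (alpha, beta, gamma)
```

Calling `transform_unit(z, normalize_scale=True)` on a batch of pre-activations returns outputs whose batch mean and mean slope are both zero up to rounding. In a real network the statistics would be tracked per hidden unit during training; this sketch recomputes them from a single batch.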
No More Pesky Learning Rates
Schaul, Tom, Zhang, Sixin, LeCun, Yann
The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time. We propose a method to automatically adjust multiple learning rates so as to minimize the expected error at any one time. The method relies on local gradient variations across samples. In our approach, learning rates can increase as well as decrease, making it suitable for non-stationary problems. Using a number of convex and non-convex learning tasks, we show that the resulting algorithm matches the performance of SGD or other adaptive approaches with their best settings obtained through systematic search, and effectively removes the need for learning rate tuning.
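The abstract does not state the update rule, so the following is only a sketch under stated assumptions. It assumes a per-parameter rate of the form (running mean of the gradient)^2 / (running mean of the curvature x running mean of the squared gradient), with an adaptive averaging window `tau`. The toy problem (a noisy one-dimensional quadratic), the function name `vsgd_quadratic`, and all initial values are hypothetical and chosen for illustration.

```python
import random

def vsgd_quadratic(steps=1000, h=2.0, theta=5.0, seed=0, eps=1e-12):
    """Adaptive-rate SGD on the loss 0.5 * h * (theta - x)^2 with x ~ U(-1, 1)."""
    rng = random.Random(seed)

    def sample_grad(th):
        x = rng.uniform(-1.0, 1.0)            # noisy target, centred on 0
        return h * (th - x)

    # Initialise the running averages from one sample (an assumption).
    g = sample_grad(theta)
    g_bar, v_bar, h_bar, tau = g, g * g, h, 2.0

    rates = []
    for _ in range(steps):
        g = sample_grad(theta)
        a = 1.0 / tau
        g_bar = (1.0 - a) * g_bar + a * g          # running mean of gradient
        v_bar = (1.0 - a) * v_bar + a * g * g      # running mean of squared gradient
        h_bar = (1.0 - a) * h_bar + a * h          # running mean of curvature

        # Signal-to-noise ratio of the gradient, clamped against rounding error.
        ratio = min(1.0, g_bar * g_bar / (v_bar + eps))
        lr = ratio / h_bar                         # large when gradients agree
        theta -= lr * g

        # Shorten the window when gradients agree, lengthen it when they are noisy.
        tau = (1.0 - ratio) * tau + 1.0
        rates.append(lr)
    return theta, rates
```

In this sketch the rate never exceeds `1 / h`, and it rises again whenever successive gradients start to agree, which is the behaviour the abstract describes as suitable for non-stationary problems. No rate schedule or base learning rate has to be supplied by the caller.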
Signature Verification using a "Siamese" Time Delay Neural Network
Bromley, Jane, Guyon, Isabelle, LeCun, Yann, Säckinger, Eduard, Shah, Roopak
The aim of the project was to make a signature verification system based on the NCR 5990 Signature Capture Device (a pen-input tablet) and to use 80 bytes or less for signature feature storage in order that the features can be stored on the magnetic strip of a credit card. Verification using a digitizer such as the 5990, which generates spatial coordinates as a function of time, is known as dynamic verification. Much research has been carried out on signature verification. Function-based methods, which fit a function to the pen trajectory, have been found to lead to higher performance, while parameter-based methods, which extract some number of parameters from a signature, make a lower requirement on memory space for signature storage (see Lorette and Plamondon (1990) for comments). We chose to use the complete time extent of the signature, with the preprocessing described below, as input to a neural network, and to allow the network to compress the information.
Globally Trained Handwritten Word Recognizer using Spatial Representation, Convolutional Neural Networks, and Hidden Markov Models
Bengio, Yoshua, LeCun, Yann, Henderson, Donnie
We introduce a new approach for online recognition of handwritten words written in unconstrained mixed style. The preprocessor performs a word-level normalization by fitting a model of the word structure using the EM algorithm. Words are then coded into low resolution "annotated images" where each pixel contains information about trajectory direction and curvature. The recognizer is a convolution network which can be spatially replicated. From the network output, a hidden Markov model produces word scores. The entire system is globally trained to minimize word-level errors.
1 Introduction
Natural handwriting is often a mixture of different "styles": lower case printed, upper case, and cursive.