Goto

Collaborating Authors

 alspector


A Parallel Gradient Descent Method for Learning in Analog VLSI Neural Networks

Neural Information Processing Systems

Typical methods for gradient descent in neural network learning involve calculation of derivatives based on a detailed knowledge of the network model. This requires extensive, time consuming calculations for each pattern presentation and high precision that makes it difficult to implement in VLSI. We present here a perturbation technique that measures, not calculates, the gradient. Since the technique uses the actual network as a measuring device, errors in modeling neuron activation and synaptic weights do not cause errors in gradient descent. The method is parallel in nature and easy to implement in VLSI. We describe the theory of such an algorithm, an analysis of its domain of applicability, some simulations using it and an outline of a hardware implementation.


A Parallel Gradient Descent Method for Learning in Analog VLSI Neural Networks

Neural Information Processing Systems

Typical methods for gradient descent in neural network learning involve calculation of derivatives based on a detailed knowledge of the network model. This requires extensive, time consuming calculations for each pattern presentation and high precision that makes it difficult to implement in VLSI. We present here a perturbation technique that measures, not calculates, the gradient. Since the technique uses the actual network as a measuring device, errors in modeling neuron activation and synaptic weights do not cause errors in gradient descent. The method is parallel in nature and easy to implement in VLSI. We describe the theory of such an algorithm, an analysis of its domain of applicability, some simulations using it and an outline of a hardware implementation.


A Parallel Gradient Descent Method for Learning in Analog VLSI Neural Networks

Neural Information Processing Systems

Typical methods for gradient descent in neural network learning involve calculation of derivatives based on a detailed knowledge of the network model. This requires extensive, time consuming calculations for each pattern presentationand high precision that makes it difficult to implement in VLSI. We present here a perturbation technique that measures, not calculates, the gradient. Since the technique uses the actual network as a measuring device, errors in modeling neuron activation and synaptic weights do not cause errors in gradient descent. The method is parallel in nature and easy to implement in VLSI.


Experimental Evaluation of Learning in a Neural Microsystem

Neural Information Processing Systems

Joshua Alspector Anthony Jayakumar Stephan Lunat Bellcore Morristown, NJ 07962-1910 Abstract We report learning measurements from a system composed of a cascadable learning chip, data generators and analyzers for training pattern presentation, and an X-windows based software interface. The 32 neuron learning chip has 496 adaptive synapses and can perform Boltzmann and mean-field learning using separate noise and gain controls.


Experimental Evaluation of Learning in a Neural Microsystem

Neural Information Processing Systems

Joshua Alspector Anthony Jayakumar Stephan Lunat Bellcore Morristown, NJ 07962-1910 Abstract We report learning measurements from a system composed of a cascadable learning chip, data generators and analyzers for training pattern presentation, and an X-windows based software interface. The 32 neuron learning chip has 496 adaptive synapses and can perform Boltzmann and mean-field learning using separate noise and gain controls.


Experimental Evaluation of Learning in a Neural Microsystem

Neural Information Processing Systems

Joshua Alspector Anthony Jayakumar Stephan Lunat Bellcore Morristown, NJ 07962-1910 Abstract We report learning measurements from a system composed of a cascadable learning chip, data generators and analyzers for training pattern presentation, and an X-windows based software interface. The 32 neuron learning chip has 496 adaptive synapses and can perform Boltzmann and mean-field learning using separate noise and gain controls.


Relaxation Networks for Large Supervised Learning Problems

Neural Information Processing Systems

Feedback connections are required so that the teacher signal on the output neurons can modify weights during supervised learning. Relaxation methods are needed for learning static patterns with full-time feedback connections. Feedback network learning techniques have not achieved wide popularity because of the still greater computational efficiency of back-propagation. We show by simulation that relaxation networks of the kind we are implementing in VLSI are capable of learning large problems just like back-propagation networks. A microchip incorporates deterministic mean-field theory learning as well as stochastic Boltzmann learning. A multiple-chip electronic system implementing these networks will make high-speed parallel learning in them feasible in the future.


Relaxation Networks for Large Supervised Learning Problems

Neural Information Processing Systems

Feedback connections are required so that the teacher signal on the output neurons can modify weights during supervised learning. Relaxation methods are needed for learning static patterns with full-time feedback connections. Feedback network learning techniques have not achieved wide popularity because of the still greater computational efficiency of back-propagation. We show by simulation that relaxation networks of the kind we are implementing in VLSI are capable of learning large problems just like back-propagation networks. A microchip incorporates deterministic mean-field theory learning as well as stochastic Boltzmann learning. A multiple-chip electronic system implementing these networks will make high-speed parallel learning in them feasible in the future.


Relaxation Networks for Large Supervised Learning Problems

Neural Information Processing Systems

Feedback connections are required so that the teacher signal on the output neurons can modify weights during supervised learning. Relaxation methods are needed for learning static patterns with full-time feedback connections. Feedback network learning techniques have not achieved wide popularity because of the still greater computational efficiency of back-propagation. We show by simulation that relaxation networks of the kind we are implementing in VLSI are capable of learning large problems just like back-propagation networks. A microchip incorporates deterministic mean-field theory learning as well as stochastic Boltzmann learning. A multiple-chip electronic system implementing these networks will make high-speed parallel learning in them feasible in the future.


Performance of a Stochastic Learning Microchip

Neural Information Processing Systems

We have fabricated a test chip in 2 micron CMOS technology that embodies these ideas and we report our evaluation of the microchip and our plans for improvements. Knowledge is encoded in the test chip by presenting digital patterns to it that are examples of a desired input-output Boolean mapping. This knowledge is learned and stored entirely on chip in a digitally controlled synapse-like element in the form of connection strengths between neuron-like elements. The only portion of this learning system which is off chip is the VLSI test equipment used to present the patterns. This learning system uses a modified Boltzmann machine algorithm[3] which, if simulated on a serial digital computer, takes enormous amounts of computer time. Our physical implementation is about 100,000 times faster. The test chip, if expanded to a board-level system of thousands of neurons, would be an appropriate architecture for solving artificial intelligence problems whose solutions are hard to specify using a conventional rule-based approach. Examples include speech and pattern recognition and encoding some types of expert knowledge.