Goto

Collaborating Authors

 Technology


An Analog VLSI Splining Network

Neural Information Processing Systems

We have produced a VLSI circuit capable of learning to approximate arbitrary smooth of a single variable using a technique closely related to splines. The circuit effectively has 512 knots space on a uniform grid and has full support for learning. The circuit also can be used to approximate multi-variable functions as sum of splines. An interesting, and as of yet, nearly untapped set of applications for VLSI implementation of neural network learning systems can be found in adaptive control and nonlinear signal processing. In most such applications, the learning task consists of approximating a real function of a small number of continuous variables from discrete data points.


Compact EEPROM-based Weight Functions

Neural Information Processing Systems

The recent surge of interest in neural networks and parallel analog computation has motivated the need for compact analog computing blocks. Analog weighting is an important computational function of this class. Analog weighting is the combining of two analog values, one of which is typically varying (the input) and one of which is typically fixed (the weight) or at least varying more slowly. The varying value is "weighted" by the fixed value through the "weighting function", typically multiplication. Analog weighting is most interesting when the overall computational task involves computing the "weighted sum of the inputs."


Kohonen Networks and Clustering: Comparative Performance in Color Clustering

Neural Information Processing Systems

"vector quantization", and "unsupervised learning" are all words which descn'be the same process: assigning a few exemplars to represent a large set of samples. Perfonning that process is the subject of a substantial body of literature. In this paper, we are concerned with the comparison of various clustering techniques to a particular, practical application: color clustering. The color clustering problem is as follows: an image is recorded in full color -- that is, three components, RED, GREEN, and BLUE, each of which has been measured to 8 bits of precision. Thus, each pixel is a 24 bit quantity. We must find a representation in which 2563 possible colors are represented by only 8 bits per pixel. That is, for a problem with 256000 variables (512 x 512) variables, assign each variable to one of only 256 classes. The color clustering problem is currently of major economic interest since millions of display systems are sold each year which can only store 8 bits per pixel, but on which users would like to be able to display "true" color (or at least as near true color as possible). In this study, we have approached the problem using the standard techniques from the literature (including k-means -- ISODATA clustering[1,3,61, LBG[4]), competitive learning (referred to as CL herein) [2], and Kohonen feature maps [5,7,9].


Time Trials on Second-Order and Variable-Learning-Rate Algorithms

Neural Information Processing Systems

The performance of seven minimization algorithms are compared on five neural network problems. These include a variable-step-size algorithm, conjugate gradient, and several methods with explicit analytic or numerical approximations to the Hessian.


A Comparative Study of the Practical Characteristics of Neural Network and Conventional Pattern Classifiers

Neural Information Processing Systems

Seven different neural network and conventional pattern classifiers were compared using artificial and speech recognition tasks. High order polynomial GMDH classifiers typically provided intermediate error rates and often required long training times and large amounts of memory. In addition, the decision regions formed did not generalize well to regions of the input space with little training data. Radial basis function classifiers generalized well in high dimensional spaces, and provided low error rates with training times that were much less than those of back-propagation classifiers (Lee and Lippmann, 1989). Gaussian mixture classifiers provided good performance when the numbers and types of mixtures were selected carefully to model class densities well. Linear tree classifiers were the most computationally ef- 976 Ng and Lippmann ficient but performed poorly with high dimensionality inputs and when the number of training patterns was small. KD-tree classifiers reduced classification time by a factor of four over conventional KNN classifiers for low 2-input dimension problems. They provided little or no reduction in classification time for high 22-input dimension problems. Improved condensed KNN classifiers reduced memory requirements over conventional KNN classifiers by a factor of two to fifteen for all problems, without increasing the error rate significantly.


Comparison of three classification techniques: CART, C4.5 and Multi-Layer Perceptrons

Neural Information Processing Systems

In this paper, after some introductory remarks into the classification problem as considered in various research communities, and some discussions concerning some of the reasons for ascertaining the performances of the three chosen algorithms, viz., CART (Classification and Regression Tree), C4.5 (one of the more recent versions of a popular induction tree technique known as ID3), and a multi-layer perceptron (MLP), it is proposed to compare the performances of these algorithms under two criteria: classification and generalisation. It is found that, in general, the MLP has better classification and generalisation accuracies compared with the other two algorithms. 1 Introduction Classification of data into categories has been pursued by a number of research communities, viz., applied statistics, knowledge acquisition, neural networks. In applied statistics, there are a number of techniques, e.g., clustering algorithms (see e.g., Hartigan), CART (Classification and Regression Trees, see e.g., Breiman et al). Clustering algorithms are used when the underlying data naturally fall into a number of groups, the distance among groups are measured by various metrics [Hartigan]. CART [Breiman, et all has been very popular among applied statisticians. It assumes that the underlying data can be separated into categories, the decision boundaries can either be parallel to the axis or they can be a linear combination of these axes!. Under certain assumptions on the input data and their associated lIn CART, and C4.5, the axes are the same as the input features


On the Circuit Complexity of Neural Networks

Neural Information Processing Systems

Viewing n-variable boolean functions as vectors in'R'2", we invoke tools from linear algebra and linear programming to derive new results on the realizability of boolean functions using threshold gat.es. Using this approach, one can obtain: (1) upper-bounds on the number of spurious memories in HopfielJ networks, and on the number of functions implementable by a depth-d threshold circuit; (2) a lower bound on the number of ort.hogonal input.



Remarks on Interpolation and Recognition Using Neural Nets

Neural Information Processing Systems

We consider different types of single-hidden-Iayer feedforward nets: with or without direct input to output connections, and using either threshold or sigmoidal activation functions. The main results show that direct connections in threshold nets double the recognition but not the interpolation power, while using sigmoids rather than thresholds allows (at least) doubling both. Various results are also given on VC dimension and other measures of recognition capabilities.


Asymptotic slowing down of the nearest-neighbor classifier

Neural Information Processing Systems

M2/n' for sufficiently large values of M. Here, Poo(error) denotes the probability of error in the infinite sample limit, and is at most twice the error of a Bayes classifier. Although the value of the coefficient a depends upon the underlying probability distributions, the exponent of M is largely distribution free. We thus obtain a concise relation between a classifier's ability to generalize from a finite reference sample and the dimensionality of the feature space, as well as an analytic validation of Bellman's well known "curse of dimensionality." 1 INTRODUCTION One of the primary tasks assigned to neural networks is pattern classification. Common applications include recognition problems dealing with speech, handwritten characters, DNA sequences, military targets, and (in this conference) sexual identity. Two fundamental concepts associated with pattern classification are generalization (how well does a classifier respond to input data it has never encountered before?) and scalability (how are a classifier's processing and training requirements affected by increasing the number of features that describe the input patterns?).