Collaborating Authors

Estimating the Bayes Risk from Sample Data

Neural Information Processing Systems

In this setting, each pattern, represented as an n-dimensional feature vector, is associated with a discrete pattern class, or state of nature (Duda and Hart, 1973). Using available information, (e.g., a statistically representative set of labeled feature vectors

Designing Linear Threshold Based Neural Network Pattern Classifiers

Neural Information Processing Systems

Terrence L. Fine, School of Electrical Engineering, Cornell University, Ithaca, NY 14853

Abstract: The three problems that concern us are identifying a natural domain of pattern classification applications of feedforward neural networks, selecting an appropriate feedforward network architecture, and assessing the tradeoff between network complexity, training set size, and statistical reliability as measured by the probability of incorrect classification. We close with some suggestions for improving the bounds that come from Vapnik-Chervonenkis theory; these can narrow, but not close, the chasm between theory and practice. Neural networks are appropriate as pattern classifiers when we have little understanding of the pattern sources, beyond perhaps a nonparametric statistical model, but have been provided with classified samples of features drawn from each of the pattern categories. Neural networks should be able to provide rapid and reliable computation of complex decision functions. The issue in doubt is their statistical response to new inputs.

Using Genetic Algorithms to Improve Pattern Classification Performance

Neural Information Processing Systems

Feature selection and creation are two of the most important and difficult tasks in the field of pattern classification. Good features improve the performance of both conventional and neural network pattern classifiers. Exemplar selection is another task that can reduce the memory and computation requirements of a KNN classifier. These three tasks require a search through a space which is typically so large that exhaustive search is impractical. The purpose of this research was to explore the usefulness of genetic search algorithms for these tasks.
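As a rough illustration of the idea in this abstract, the sketch below runs a small genetic search over binary feature masks and scores each mask by the leave-one-out accuracy of a 1-nearest-neighbour classifier. It is not Chang and Lippmann's procedure. The function names, the toy data, the truncation-style selection, the one-point crossover, and all parameter values are assumptions made for this example only.

```python
import random


def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-NN classifier restricted to the masked features."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        best_d, best_label = float("inf"), None
        for k, xk in enumerate(X):
            if k == i:
                continue  # leave the query point out
            d = sum((xi[j] - xk[j]) ** 2 for j in idx)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def ga_select_features(X, y, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Genetic search over feature masks (assumes >= 2 features and pop_size >= 4)."""
    rng = random.Random(seed)
    n = len(X[0])

    def fitness(mask):
        return loo_1nn_accuracy(X, y, mask)

    # Include the all-features mask so the result is never worse than that baseline.
    pop = [tuple([1] * n)]
    pop += [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size - 1)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), refill with mutated offspring.
        elite = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability p_mut.
            children.append(tuple(bit ^ (rng.random() < p_mut) for bit in child))
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best)


def make_toy_data(n_per_class=20, n_noise=4, seed=1):
    """Hypothetical data: feature 0 separates the classes, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            row = [rng.gauss(centre, 0.5)]
            row += [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, acc = ga_select_features(X, y)
    print("selected mask:", mask, "leave-one-out accuracy:", acc)
```

The same loop could in principle search over exemplar subsets by letting each bit mark a stored training point instead of a feature. Because the fitness function is re-evaluated for every candidate in every generation, a practical version would cache the scores.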

The Impact of Unlabeled Patterns in Rademacher Complexity Theory for Kernel Classifiers

Neural Information Processing Systems

We derive here new generalization bounds, based on Rademacher Complexity theory, for model selection and error estimation of linear (kernel) classifiers, which exploit the availability of unlabeled samples. In particular, two results are obtained: the first one shows that, using the unlabeled samples, the confidence term of the conventional bound can be reduced by a factor of three; the second one shows that the unlabeled samples can be used to obtain much tighter bounds, by building localized versions of the hypothesis class containing the optimal classifier.

Papers published at the Neural Information Processing Systems Conference.