 Supervised Learning


Optimizing Classifiers for Imbalanced Training Sets

Neural Information Processing Systems

Following recent results [9, 8] showing the importance of the fat-shattering dimension in explaining the beneficial effect of a large margin on generalization performance, the current paper investigates the implications of these results for the case of imbalanced datasets and develops two approaches to setting the threshold. The approaches are incorporated into ThetaBoost, a boosting algorithm for dealing with unequal loss functions. The performance of ThetaBoost and the two approaches is tested experimentally.
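To make the idea of threshold setting under unequal losses concrete, here is a minimal sketch. It is not ThetaBoost and not either of the two approaches from the paper, which the abstract does not spell out. It only shows a generic search for the decision threshold that minimizes a cost-weighted error on real-valued classifier scores. The names `weighted_loss` and `best_threshold`, the costs, and the toy data are all assumptions made for illustration.

```python
def weighted_loss(scores, labels, theta, cost_fn=5.0, cost_fp=1.0):
    """Cost-weighted error when predicting +1 for score >= theta, else -1.

    cost_fn and cost_fp are assumed misclassification costs for missed
    positives and false alarms; they are illustrative values only.
    """
    loss = 0.0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= theta else -1
        if label == 1 and predicted == -1:
            loss += cost_fn
        elif label == -1 and predicted == 1:
            loss += cost_fp
    return loss


def best_threshold(scores, labels, cost_fn=5.0, cost_fp=1.0):
    """Try every observed score (plus one value above the maximum) as theta."""
    candidates = sorted(set(scores)) + [max(scores) + 1.0]
    return min(
        candidates,
        key=lambda t: weighted_loss(scores, labels, t, cost_fn, cost_fp),
    )


# Hypothetical imbalanced sample: six negatives, two positives.
scores = [-2.0, -1.5, -1.0, -0.5, -0.2, 0.3, 0.1, -0.3]
labels = [-1, -1, -1, -1, -1, -1, 1, 1]

theta = best_threshold(scores, labels)
loss_at_zero = weighted_loss(scores, labels, 0.0)
loss_at_theta = weighted_loss(scores, labels, theta)
```

With a missed positive costing more than a false alarm, the selected threshold moves below zero, so the rare class is recovered at the price of a few extra false positives.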


3D Object Recognition: A Model of View-Tuned Neurons

Neural Information Processing Systems

Recognition of specific objects, such as recognition of a particular face, can be based on representations that are object centered, such as 3D structural models. Alternatively, a 3D object may be represented for the purpose of recognition in terms of a set of views. This latter class of models is biologically attractive because model acquisition - the learning phase - is simpler and more natural. A simple model for this strategy of object recognition was proposed by Poggio and Edelman (Poggio and Edelman, 1990). They showed that, with few views of an object used as training examples, a classification network, such as a Gaussian radial basis function network, can learn to recognize novel views of that object.

E. Bricolo, T. Poggio and N. Logothetis


Figure 1: (a) Schematic representation of the architecture of the Poggio-Edelman model. The shaded circles correspond to the view-tuned units, each tuned to a view of the object, while the open circle corresponds to the view-invariant, object-specific output unit.
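The architecture described above, with view-tuned units feeding a single view-invariant output unit, can be sketched with Gaussian radial basis functions. This is a minimal illustration and not the authors' implementation. The 2-D feature vectors, the width `sigma`, the equal weights, and the function names `rbf_unit` and `object_unit` are all assumptions.

```python
import math


def rbf_unit(x, center, sigma):
    """Gaussian response of one view-tuned unit centered on a stored view."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def object_unit(x, centers, weights, sigma=1.0):
    """View-invariant output: weighted sum of the view-tuned responses."""
    return sum(w * rbf_unit(x, c, sigma) for w, c in zip(weights, centers))


# Hypothetical feature vectors for three training views of one object.
training_views = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
weights = [1.0, 1.0, 1.0]  # assumed equal; normally these would be learned

novel_view = (0.5, 0.0)    # falls between two stored views
distractor = (10.0, 10.0)  # far from every stored view

response_novel = object_unit(novel_view, training_views, weights)
response_distractor = object_unit(distractor, training_views, weights)
```

Each unit responds most strongly to its own stored view and falls off with distance, so a novel view lying between stored views still drives the output unit, while an input far from all stored views produces almost no response.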


Induction of First-Order Decision Lists: Results on Learning the Past Tense of English Verbs

Journal of Artificial Intelligence Research

This paper presents a method for inducing logic programs from examples that learns a new class of concepts called first-order decision lists, defined as ordered lists of clauses each ending in a cut. The method, called FOIDL, is based on FOIL (Quinlan, 1990) but employs intensional background knowledge and avoids the need for explicit negative examples. It is particularly useful for problems that involve rules with specific exceptions, such as learning the past-tense of English verbs, a task widely studied in the context of the symbolic/connectionist debate. FOIDL is able to learn concise, accurate programs for this problem from significantly fewer examples than previous methods (both connectionist and symbolic).
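The notion of an ordered list of clauses, each ending in a cut, can be mimicked in a few lines: rules are tried in order, exceptions come first, and the first rule that matches decides the answer. The sketch below is hand-written for illustration. The rules are not output learned by FOIDL, they operate on plain spelling rather than whatever representation the paper uses, and the names `RULES` and `past_tense` are assumptions.

```python
# Each rule is (condition, transformation). Order matters: specific
# exceptions are listed before the general default.
RULES = [
    (lambda v: v == "go", lambda v: "went"),                  # single exception
    (lambda v: v.endswith("eep"), lambda v: v[:-3] + "ept"),  # sleep -> slept
    (lambda v: v.endswith("e"), lambda v: v + "d"),           # bake -> baked
    (lambda v: True, lambda v: v + "ed"),                     # default rule
]


def past_tense(verb):
    """Return the output of the first matching rule.

    Returning immediately plays the role of the cut: once a clause
    succeeds, later clauses are never tried.
    """
    for matches, transform in RULES:
        if matches(verb):
            return transform(verb)


examples = {v: past_tense(v) for v in ["go", "sleep", "bake", "walk"]}
```

Because the default rule only fires when every earlier rule fails, no explicit negative examples are needed to stop it from applying to the exceptions. The toy rules overgeneralize (for instance, they would turn "beep" into "bept"), which is the kind of error a learned list would have to handle with further, more specific clauses.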


Shooting Craps in Search of an Optimal Strategy for Training Connectionist Pattern Classifiers

Neural Information Processing Systems

We compare two strategies for training connectionist (as well as nonconnectionist) models for statistical pattern recognition. The probabilistic strategy is based on the notion that Bayesian discrimination (i.e., optimal classification) is achieved when the classifier learns the a posteriori class distributions of the random feature vector. The differential strategy is based on the notion that the identity of the largest class a posteriori probability of the feature vector is all that is needed to achieve Bayesian discrimination. Each strategy is directly linked to a family of objective functions that can be used in the supervised training procedure. We prove that the probabilistic strategy - linked with error measure objective functions such as mean-squared-error and cross-entropy - typically used to train classifiers necessarily requires larger training sets and more complex classifier architectures than those needed to approximate the Bayesian discriminant function.
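The difference between the two strategies can be seen in a small numerical sketch. The posterior values below are invented for illustration and the helper names are assumptions. The point is only that a classifier whose outputs are poor estimates of the posteriors can still make the Bayes-optimal decision, provided its largest output belongs to the right class.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


def mean_squared_error(estimate, target):
    """Average squared difference between two equal-length vectors."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


# Hypothetical true a posteriori class probabilities for one feature vector.
true_posteriors = [0.7, 0.2, 0.1]

# Hypothetical classifier outputs: inaccurate as probability estimates,
# but with the same ordering of the top class.
crude_outputs = [0.4, 0.35, 0.25]

bayes_decision = argmax(true_posteriors)
classifier_decision = argmax(crude_outputs)
estimation_error = mean_squared_error(crude_outputs, true_posteriors)
```

Here the estimation error is clearly nonzero, yet both decisions pick the same class. A probabilistic objective would keep penalizing the outputs until they match the posteriors, whereas a differential objective is already satisfied once the ranking of the top class is correct.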


VLSI Implementation of TInMANN

Neural Information Processing Systems

A massively parallel, all-digital, stochastic architecture - TInMANN - is described which performs competitive and Kohonen types of learning. A VLSI design is shown for a TInMANN neuron which fits within a small, inexpensive MOSIS TinyChip frame, yet which can be used to build larger networks of several hundred neurons. The neuron operates at a speed of 15 MHz which allows the network to process 290,000 training examples per second. Use of level-sensitive scan logic provides the chip with 100% fault coverage, permitting very reliable neural systems to be built.
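For readers unfamiliar with competitive learning, the following sketch shows the textbook winner-take-all update in floating point. It is not the TInMANN algorithm: the abstract describes an all-digital, stochastic design, and nothing here models that hardware. The function name `competitive_step`, the learning rate, and the toy weights are assumptions.

```python
def competitive_step(weights, x, learning_rate=0.5):
    """One winner-take-all update.

    Finds the neuron whose weight vector is closest to the input x and
    moves only that neuron's weights a fraction of the way toward x.
    Returns the index of the winning neuron.
    """
    sq_dists = [
        sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        for w in weights
    ]
    winner = sq_dists.index(min(sq_dists))
    weights[winner] = [
        wi + learning_rate * (xi - wi)
        for wi, xi in zip(weights[winner], x)
    ]
    return winner


# Two hypothetical neurons with 2-D weight vectors.
weights = [[0.0, 0.0], [10.0, 10.0]]
winner = competitive_step(weights, [2.0, 2.0])
```

A Kohonen-style variant would also move the winner's neighbors in the network topology, typically by a smaller amount. That extension is omitted here.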


