 Performance Analysis


Human Face Detection in Visual Scenes

Neural Information Processing Systems

We present a neural network-based face detection system. A retinally connected neural network examines small windows of an image, and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We use a bootstrap algorithm for training, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting non-face training examples, which must be chosen to span the entire space of non-face images.
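
The bootstrap training loop described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `bootstrap_negatives` and `train_threshold` are hypothetical, and a one-dimensional threshold classifier stands in for the neural network so that the example stays self-contained.

```python
def train_threshold(positives, negatives):
    """Toy stand-in for network training (assumption): place a threshold
    halfway between the largest negative and the smallest positive."""
    threshold = (max(negatives) + min(positives)) / 2
    return lambda x: x > threshold


def bootstrap_negatives(train, positives, negatives, background_pool, rounds=5):
    """Repeatedly train, scan examples known to contain no target, and add
    every false detection to the negative training set."""
    negatives = list(negatives)
    for _ in range(rounds):
        classify = train(positives, negatives)
        # Everything in background_pool is a non-target, so any detection is false.
        false_detections = [w for w in background_pool
                            if classify(w) and w not in negatives]
        if not false_detections:
            break
        negatives.extend(false_detections)
    return train(positives, negatives), negatives


classifier, final_negatives = bootstrap_negatives(
    train_threshold,
    positives=[8, 9, 10],
    negatives=[0],
    background_pool=[1, 3, 5, 6],
)
```

In this toy run the first classifier wrongly accepts 5 and 6, which are then added as negatives; the second round produces no new false detections, so the loop stops without anyone hand-picking the hard negatives.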


From Isolation to Cooperation: An Alternative View of a System of Experts

Neural Information Processing Systems

We introduce a constructive, incremental learning system for regression problems that models data by means of locally linear experts. In contrast to other approaches, the experts are trained independently and do not compete for data during learning. Only when a prediction for a query is required do the experts cooperate by blending their individual predictions. Each expert is trained by minimizing a penalized local cross validation error using second order methods. In this way, an expert is able to find a local distance metric by adjusting the size and shape of the receptive field in which its predictions are valid, and also to detect relevant input features by adjusting its bias on the importance of individual input dimensions. We derive asymptotic results for our method. In a variety of simulations the properties of the algorithm are demonstrated with respect to interference, learning speed, prediction accuracy, feature detection, and task oriented incremental learning.
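
One way to picture the blending step is an activation-weighted average of the experts' local linear predictions. The sketch below assumes a one-dimensional input and a Gaussian receptive field; the abstract fixes neither choice, and the names `LocalExpert` and `blend` are hypothetical.

```python
import math


class LocalExpert:
    """A locally linear model with a Gaussian receptive field (assumed shape)."""

    def __init__(self, center, slope, intercept, width):
        self.center = center
        self.slope = slope
        self.intercept = intercept
        self.width = width  # controls the size of the receptive field

    def activation(self, x):
        # How strongly this expert's receptive field responds to the query.
        return math.exp(-0.5 * ((x - self.center) / self.width) ** 2)

    def predict(self, x):
        # Linear model expressed relative to the expert's own center.
        return self.intercept + self.slope * (x - self.center)


def blend(experts, x):
    """Combine the experts' predictions, weighted by their activations."""
    weights = [e.activation(x) for e in experts]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no expert is active at this query point")
    return sum(w * e.predict(x) for w, e in zip(weights, experts)) / total


experts = [
    LocalExpert(center=0.0, slope=0.0, intercept=1.0, width=1.0),
    LocalExpert(center=2.0, slope=0.0, intercept=3.0, width=1.0),
]
midpoint_prediction = blend(experts, 1.0)
```

Because the experts only interact inside `blend`, each one can be fitted without reference to the others, which is the independence property the abstract emphasizes.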


Further Experimental Evidence against the Utility of Occam's Razor

Journal of Artificial Intelligence Research

This paper presents new experimental evidence against the utility of Occam's razor. A systematic procedure is presented for post-processing decision trees produced by C4.5. This procedure was derived by rejecting Occam's razor and instead attending to the assumption that similar objects are likely to belong to the same class. It increases a decision tree's complexity without altering the performance of that tree on the training data from which it is inferred. The resulting more complex decision trees are demonstrated to have, on average, for a variety of common learning tasks, higher predictive accuracy than the less complex original decision trees. This result raises considerable doubt about the utility of Occam's razor as it is commonly applied in modern machine learning.


Improved Use of Continuous Attributes in C4.5

Journal of Artificial Intelligence Research

A reported weakness of C4.5 in domains with continuous attributes is addressed by modifying the formation and evaluation of tests on continuous attributes. An MDL-inspired penalty is applied to such tests, eliminating some of them from consideration and altering the relative desirability of all tests. Empirical trials show that the modifications lead to smaller decision trees with higher predictive accuracies. Results also confirm that a new version of C4.5 incorporating these changes is superior to recent approaches that use global discretization and that construct small trees with multi-interval splits.


Neural Network Ensembles, Cross Validation, and Active Learning

Neural Information Processing Systems

It is well known that a combination of many different predictors can improve predictions. In the neural networks community, "ensembles" of neural networks have been investigated by several authors; see, for instance, [1, 2, 3]. Most often the networks in the ensemble are trained individually and then their predictions are combined. This combination is usually done by majority vote (in classification) or by simple averaging (in regression), but one can also use a weighted combination of the networks.
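
The three combination rules named in this abstract can be written down in a few lines. This is an illustrative sketch only; the function names are hypothetical, and the member predictions are passed in as plain lists rather than produced by trained networks.

```python
from collections import Counter


def majority_vote(labels):
    """Classification: return the label predicted by the most members.
    Ties go to the label that appears first in the list."""
    return Counter(labels).most_common(1)[0][0]


def simple_average(values):
    """Regression: unweighted mean of the members' outputs."""
    return sum(values) / len(values)


def weighted_average(values, weights):
    """Regression: weighted mean, normalized so the weights need not sum to one."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


vote = majority_vote(["face", "no-face", "face"])
mean = simple_average([1.0, 2.0, 3.0])
weighted = weighted_average([0.0, 10.0], [3.0, 1.0])
```

With equal weights, `weighted_average` reduces to `simple_average`, so the weighted form is the more general of the two regression rules.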


Predicting the Risk of Complications in Coronary Artery Bypass Operations using Neural Networks

Neural Information Processing Systems

MLP networks provided slightly better risk prediction than conventional logistic regression when used to predict the risk of death, stroke, and renal failure on 1257 patients who underwent coronary artery bypass operations. Bootstrap sampling was required to compare approaches and regularization provided by early stopping was an important component of improved performance. A simplified approach to generating confidence intervals for MLP risk predictions using an auxiliary "confidence MLP" was also developed. The confidence MLP is trained to reproduce the confidence bounds that were generated during training by 50 MLP networks trained using bootstrap samples. Current research is validating these results using larger data sets, exploring approaches to detect outlier patients who are so different from any training patient that accurate risk prediction is suspect, developing approaches to explaining which input features are important for an individual patient, and determining why MLP networks provide improved performance.
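
The idea of deriving confidence bounds from models trained on bootstrap samples can be sketched as below. Several details are assumptions made for illustration: a simple event-rate estimate stands in for an MLP, the bounds are taken as empirical percentiles of the 50 predictions, and all function names are hypothetical. The auxiliary "confidence MLP" that learns to reproduce these bounds is not shown.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def percentile_bounds(predictions, lower=0.025, upper=0.975):
    """Empirical lower and upper bounds from a list of predictions."""
    ordered = sorted(predictions)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


def bootstrap_bounds(data, fit_predict, n_models=50, seed=0):
    """Fit one model per bootstrap sample and summarize the spread of predictions."""
    rng = random.Random(seed)
    predictions = [fit_predict(bootstrap_sample(data, rng))
                   for _ in range(n_models)]
    return percentile_bounds(predictions)


# Toy outcomes: 1 marks a complication, 0 marks none (made-up data).
outcomes = [0] * 90 + [1] * 10
low, high = bootstrap_bounds(outcomes, lambda sample: sum(sample) / len(sample))
```

The width of the interval between `low` and `high` reflects how much the fitted model changes when the training set is resampled.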


Inferring Ground Truth from Subjective Labelling of Venus Images

Neural Information Processing Systems

Instead of "ground truth" one may only have the subjective opinion(s) of one or more experts. For example, medical data or image data may be collected off-line and some time later a set of experts analyze the data and produce a set of class labels. The central problem is that of trying to infer the "ground truth" given the noisy subjective estimates of the experts. When one wishes to apply a supervised learning algorithm to the data, the problem is primarily twofold: (i) how to evaluate the relative performance of experts and algorithms, and (ii) how to train a pattern recognition system in the absence of absolute ground truth. In this paper we focus on problem (i), namely the performance evaluation issue, and in particular we discuss the application of a particular modelling technique to the problem of counting volcanoes on the surface of Venus.


Coarse-to-Fine Image Search Using Neural Networks

Neural Information Processing Systems

The efficiency of image search can be greatly improved by using a coarse-to-fine search strategy with a multi-resolution image representation. However, if the resolution is so low that the objects have few distinguishing features, search becomes difficult. We show that the performance of search at such low resolutions can be improved by using context information, i.e., objects visible at low-resolution which are not the objects of interest but are associated with them. The networks can be given explicit context information as inputs, or they can learn to detect the context objects, in which case the user does not have to be aware of their existence. We also use Integrated Feature Pyramids, which represent high-frequency information at low resolutions. The use of multiresolution search techniques allows us to combine information about the appearance of the objects on many scales in an efficient way. A natural form of exemplar selection also arises from these techniques. We illustrate these ideas by training hierarchical systems of neural networks to find clusters of buildings in aerial photographs of farmland.

