Goto

Collaborating Authors

 Towell, Geoffrey G.


Using Unlabeled Data for Supervised Learning

Neural Information Processing Systems

For example, it is trivial to record hours of heartbeats from hundreds of patients. However, it is expensive to hire cardiologists to label each of the recorded beats. One response to the expense of class labels is to squeeze the most information possible out of each labeled example. Regularization and cross-validation both have this goal. A second response is to start with a small set of labeled examples and request labels of only those currently unlabeled examples that are expected to provide a significant improvement in the behavior of the classifier (Lewis & Catlett, 1994; Freund et al., 1993). A third response is to tap into a largely ignored potential source of information; namely, unlabeled examples. This response is supported by the theoretical work of Castelli and Cover (1995) which suggests that unlabeled examples have value in learning classification problems.


Using Unlabeled Data for Supervised Learning

Neural Information Processing Systems

Geoffrey Towell Siemens Corporate Research 755 College Road East Princeton, NJ 08540 Abstract Many classification problems have the property that the only costly part of obtaining examples is the class label. This paper suggests a simple method for using distribution information contained in unlabeled examples to augment labeled examples in a supervised training framework. Empirical tests show that the technique described inthis paper can significantly improve the accuracy of a supervised learner when the learner is well below its asymptotic accuracy level. 1 INTRODUCTION Supervised learning problems often have the following property: unlabeled examples have little or no cost while class labels have a high cost. For example, it is trivial to record hours of heartbeats from hundreds of patients. However, it is expensive to hire cardiologists to label each of the recorded beats.


Training Knowledge-Based Neural Networks to Recognize Genes in DNA Sequences

Neural Information Processing Systems

Shavlik Computer Sciences University of Wisconsin Madison, WI 53706 We describe the application of a hybrid symbolic/connectionist machine learning algorithm to the task of recognizing important genetic sequences.