Statistical Learning
On Kernel-Target Alignment
Cristianini, Nello, Shawe-Taylor, John, Elisseeff, André, Kandola, Jaz S.
We introduce the notion of kernel-alignment, a measure of similarity between two kernel functions or between a kernel and a target function. This quantity captures the degree of agreement between a kernel and a given learning task, and has very natural interpretations in machine learning, leading also to simple algorithms for model selection and learning. We analyse its theoretical properties, proving that it is sharply concentrated around its expected value, and we discuss its relation with other standard measures of performance. Finally we describe some of the algorithms that can be obtained within this framework, giving experimental results showing that adapting the kernel to improve alignment on the labelled data significantly increases the alignment on the test set, giving improved classification accuracy. Hence, the approach provides a principled method of performing transduction.
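As a rough illustration of the quantity this abstract describes, the sketch below computes the empirical alignment between two Gram matrices as their Frobenius inner product normalised by the two Frobenius norms, with the target represented by the outer product of the label vector. The abstract itself does not state the formula, so this form is the commonly cited definition, assumed here. The function names and the toy data are hypothetical.

```python
import math

def frobenius_inner(a, b):
    """Frobenius inner product of two equally sized square matrices (lists of rows)."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))

def alignment(k1, k2):
    """Empirical alignment: <K1, K2>_F / sqrt(<K1, K1>_F * <K2, K2>_F)."""
    return frobenius_inner(k1, k2) / math.sqrt(
        frobenius_inner(k1, k1) * frobenius_inner(k2, k2)
    )

def target_kernel(labels):
    """Target matrix y y^T built from +1/-1 labels."""
    return [[yi * yj for yj in labels] for yi in labels]

# Toy data (assumed for illustration): four points, two per class.
labels = [1, 1, -1, -1]
ideal = target_kernel(labels)
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

print(alignment(ideal, ideal))     # a kernel equal to the target is perfectly aligned
print(alignment(identity, ideal))  # the identity kernel carries no label structure
```

On this toy data the first value is 1.0 and the second is 0.5, which matches the intuition that a kernel agreeing with the labels scores higher than one that treats every point as unrelated to every other.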
The Noisy Euclidean Traveling Salesman Problem and Learning
Braun, Mikio L., Buhmann, Joachim M.
We consider noisy Euclidean traveling salesman problems in the plane, which are random combinatorial problems with underlying structure. Gibbs sampling is used to compute average trajectories, which estimate the underlying structure common to all instances. This procedure requires identifying the exact relationship between permutations and tours. In a learning setting, the average trajectory is used as a model to construct solutions to new instances sampled from the same source. Experimental results show that the average trajectory can in fact estimate the underlying structure and that overfitting effects occur if the trajectory adapts too closely to a single instance.
Classifying Single Trial EEG: Towards Brain Computer Interfacing
Blankertz, Benjamin, Curio, Gabriel, Müller, Klaus-Robert
Driven by the progress in the field of single-trial analysis of EEG, there is a growing interest in brain computer interfaces (BCIs), i.e., systems that enable human subjects to control a computer only by means of their brain signals. In a pseudo-online simulation our BCI detects upcoming finger movements in a natural keyboard typing condition and predicts their laterality. This can be done on average 100-230 ms before the respective key is actually pressed, i.e., long before the onset of EMG. Our approach is appealing for its short response time and high classification accuracy (96%) in a binary decision where no human training is involved. We compare discriminative classifiers like Support Vector Machines (SVMs) and different variants of Fisher Discriminant that possess favorable regularization properties for dealing with high noise cases (inter-trial variability).
Constructing Distributed Representations Using Additive Clustering
If the promise of computational modeling is to be fully realized in higher-level cognitive domains such as language processing, principled methods must be developed to construct the semantic representations used in such models. In this paper, we propose the use of an established formalism from mathematical psychology, additive clustering, as a means of automatically constructing binary representations for objects using only pairwise similarity data. However, existing methods for the unsupervised learning of additive clustering models do not scale well to large problems. We present a new algorithm for additive clustering, based on a novel heuristic technique for combinatorial optimization. The algorithm is simpler than previous formulations and makes fewer independence assumptions. Extensive empirical tests on both human and synthetic data suggest that it is more effective than previous methods and that it also scales better to larger problems. By making additive clustering practical, we take a significant step toward scaling connectionist models beyond hand-coded examples.
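To make the additive clustering formalism concrete, the sketch below shows only the model side: each object has a binary feature vector, each feature (cluster) has a non-negative weight, and the predicted similarity of two objects is the summed weight of the features they share. This is the standard additive clustering model as usually stated, assumed here because the abstract does not spell it out. It is not the fitting algorithm the abstract proposes, and the names and toy values are hypothetical.

```python
def predicted_similarity(features, weights, i, j):
    """Sum the weights of the binary features shared by objects i and j."""
    return sum(w * fi * fj for w, fi, fj in zip(weights, features[i], features[j]))

# Toy representation (assumed for illustration): three objects, two clusters.
features = [
    [1, 0],  # object 0 belongs to cluster A only
    [1, 1],  # object 1 belongs to clusters A and B
    [0, 1],  # object 2 belongs to cluster B only
]
weights = [0.6, 0.3]  # salience of cluster A and cluster B

for i, j in [(0, 1), (1, 2), (0, 2)]:
    print(i, j, predicted_similarity(features, weights, i, j))
```

Learning such a model means searching for the binary matrix and weights whose predicted similarities best match the observed pairwise data, which is the combinatorial optimization problem the abstract refers to.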
Information Self-Service with a Knowledge Base That Learns
Durbin, Stephen D., Warner, Doug, Richter, J. Neal, Gedeon, Zuzana
Delivering effective customer service over the internet requires attention to many aspects of knowledge management if it is to be both satisfying for customers and economical for the company or other organization. In RightNow ESERVICE CENTER, such management is built into the architecture and supported by automatically gathering metainformation about the documents held in the core knowledge base. A variety of AI techniques are used to facilitate the construction, maintenance, and navigation of the knowledge base. These techniques include collaborative filtering, swarm intelligence, fuzzy logic, natural language processing, text clustering, and classification rule learning. Customers using ESERVICE CENTER report dramatic decreases in support costs and increases in customer satisfaction because of the ease of use provided by the self-learning features of the knowledge base.
Specific-to-General Learning for Temporal Events with Application to Learning Event Definitions from Video
Fern, A., Givan, R., Siskind, J. M.
We develop, analyze, and evaluate a novel, supervised, specific-to-general learner for a simple temporal logic and use the resulting algorithm to learn visual event definitions from video sequences. First, we introduce a simple, propositional, temporal, event-description language called AMA that is sufficiently expressive to represent many events yet sufficiently restrictive to support learning. We then give algorithms, along with lower and upper complexity bounds, for the subsumption and generalization problems for AMA formulas. We present a positive-examples-only specific-to-general learning method based on these algorithms. We also present a polynomial-time-computable "syntactic" subsumption test that implies semantic subsumption without being equivalent to it. A generalization algorithm based on syntactic subsumption can be used in place of semantic generalization to improve the asymptotic complexity of the resulting learning algorithm. Finally, we apply this algorithm to the task of learning relational event definitions from video and show that it yields definitions that are competitive with hand-coded ones.
Support Vector Machines and Kernel Methods: The New Generation of Learning Machines
Cristianini, Nello, Schölkopf, Bernhard
Kernel methods, a new generation of learning algorithms, utilize techniques from optimization, statistics, and functional analysis to achieve maximal generality, flexibility, and performance. These algorithms are different from earlier techniques used in machine learning in many respects: For example, they are explicitly based on a theoretical model of learning rather than on loose analogies with natural learning systems or other heuristics. They come with theoretical guarantees about their performance and have a modular design that makes it possible to separately implement and analyze their components. They are not affected by the problem of local minima because their training amounts to convex optimization. In the last decade, a sizable community of theoreticians and practitioners has formed around these methods, and a number of practical applications have been realized. Although the research is not concluded, kernel methods are already considered the state of the art in several machine learning tasks. Their ease of use, theoretical appeal, and remarkable performance have made them the system of choice for many learning problems. Successful applications range from text categorization to handwriting recognition to classification of gene expression data.
Support Vector Novelty Detection Applied to Jet Engine Vibration Spectra
Hayton, Paul M., Schölkopf, Bernhard, Tarassenko, Lionel, Anuzis, Paul
A system has been developed to extract diagnostic information from jet engine carcass vibration data. Support Vector Machines applied to novelty detection provide a measure of how unusual the shape of a vibration signature is, by learning a representation of normality. We describe a novel method for including information from a second class in Support Vector Machine novelty detection and give results from its application to jet engine vibration analysis.