Goto

Collaborating Authors

 Education


Online Passive-Aggressive Algorithms

Neural Information Processing Systems

We present a unified view for online classification, regression, and uniclass problems.This view leads to a single algorithmic framework for the three problems. We prove worst case loss bounds for various algorithms for both the realizable case and the non-realizable case. A conversion of our main online algorithm to the setting of batch learning is also discussed. Theend result is new algorithms and accompanying loss bounds for the hinge-loss.


Unsupervised Context Sensitive Language Acquisition from a Large Corpus

Neural Information Processing Systems

We describe a pattern acquisition algorithm that learns, in an unsupervised fashion,a streamlined representation of linguistic structures from a plain natural-language corpus. This paper addresses the issues of learning structuredknowledge from a large-scale natural language data set, and of generalization to unseen text. The implemented algorithm represents sentencesas paths on a graph whose vertices are words (or parts of words). Significant patterns, determined by recursive context-sensitive statistical inference, form new vertices. Linguistic constructions are represented bytrees composed of significant patterns and their associated equivalence classes. An input module allows the algorithm to be subjected toa standard test of English as a Second Language (ESL) proficiency. Theresults are encouraging: the model attains a level of performance consideredto be "intermediate" for 9th-grade students, despite having been trained on a corpus (CHILDES) containing transcribed speech of parents directed to small children.


Online Classification on a Budget

Neural Information Processing Systems

Online algorithms for classification often require vast amounts of memory andcomputation time when employed in conjunction with kernel functions. In this paper we describe and analyze a simple approach for an on-the-fly reduction of the number of past examples used for prediction. Experiments performed with real datasets show that using the proposed algorithmic approach with a single epoch is competitive with the support vectormachine (SVM) although the latter, being a batch algorithm, accesses each training example multiple times.


Learning Curves for Stochastic Gradient Descent in Linear Feedforward Networks

Neural Information Processing Systems

Gradient-following learning methods can encounter problems of implementation inmany applications, and stochastic variants are frequently used to overcome these difficulties. We derive quantitative learning curves for three online training methods used with a linear perceptron: direct gradient descent, node perturbation, and weight perturbation. The maximum learning rate for the stochastic methods scales inversely with the first power of the dimensionality of the noise injected into the system; withsufficiently small learning rate, all three methods give identical learning curves. These results suggest guidelines for when these stochastic methods will be limited in their utility, and considerations for architectures in which they will be effective.


Learning a Rare Event Detection Cascade by Direct Feature Selection

Neural Information Processing Systems

Face detection is a canonical example of a rare event detection problem, inwhich target patterns occur with much lower frequency than nontargets. Outof millions of face-sized windows in an input image, for example, only a few will typically contain a face. Viola and Jones recently proposed a cascade architecture for face detection which successfully addresses therare event nature of the task. A central part of their method is a feature selection algorithm based on AdaBoost. We present a novel cascade learning algorithm based on forward feature selection which is two orders of magnitude faster than the Viola-Jones approach and yields classifiers of equivalent quality. This faster method could be used for more demanding classification tasks, such as online learning.


Online Learning of Non-stationary Sequences

Neural Information Processing Systems

We consider an online learning scenario in which the learner can make predictions on the basis of a fixed set of experts. We derive upper and lower relative loss bounds for a class of universal learning algorithms involving aswitching dynamics over the choice of the experts. On the basis of the performance bounds we provide the optimal a priori discretization forlearning the parameter that governs the switching dynamics. We demonstrate the new algorithm in the context of wireless networks.


Link Prediction in Relational Data

Neural Information Processing Systems

Many real-world domains are relational in nature, consisting of a set of objects related to each other in complex ways. This paper focuses on predicting the existence and the type of links between entities in such domains. We apply the relational Markov network framework of Taskar et al. to define a joint probabilistic modelover the entire link graph -- entity attributes and links. The application of the RMN algorithm to this task requires the definition of probabilistic patterns over subgraph structures. We apply this method to two new relational datasets, one involving university webpages, and the other a social network. We show that the collective classification approach of RMNs, and the introduction of subgraph patterns over link labels, provide significant improvements in accuracy over flat classification, which attempts to predict each link in isolation.



Large Scale Online Learning

Neural Information Processing Systems

We consider situations where training data is abundant and computing resources are comparatively scarce. We argue that suitably designed online learningalgorithms asymptotically outperform any batch learning algorithm. Both theoretical and experimental evidences are presented.


Calendar of Events

AI Magazine

Cognition: The Mathematics of Mind. (CORES 2005). Stefanie Bruninghaus, University of Pittsburgh The ICCBR'05 Program Committee invites submissions of original theoretical research, Industry Day Chair: applied research and deployed application (MDAI 2005). Must have a Masters Degree in Data Mining.