AITopics

Local linear regression performs very well in many low-dimensional forecasting problems. In high-dimensional spaces, its performance typically decays due to the well-known "curse-of-dimensionality". A possible way to approach this problem is by varying the "shape" of the weighting kernel. In this work we suggest a new, data-driven method to estimating the optimal kernel shape. Experiments using anartificially generated data set and data from the UC Irvine repository show the benefits of kernel shaping. 1 Introduction Local linear regression has attracted considerable attention in both statistical and machine learning literature as a flexible tool for nonparametric regression analysis [Cle79, FG96, AMS97]. Like most statistical smoothing approaches, local modeling suffers from the so-called "curse-of-dimensionality", the well-known fact that the proportion of the training data that lie in a fixed-radius neighborhood of a point decreases to zero at an exponential rate with increasing dimension of the input space.

artificial intelligence, kernel, machine learning, (13 more...)

Genre: Research Report > New Finding (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Platt, John C., Cristianini, Nello, Shawe-Taylor, John

Large Margin DAGs for Multiclass Classification

We present a new learning architecture: the Decision Directed Acyclic Graph (DDAG), which is used to combine many two-class classifiers into a multiclass classifier. For an N -class problem, the DDAG contains N(N-1)/2 classifiers, one for each pair of classes. We present a VC analysis of the case when the node classifiers are hyperplanes; the resulting boundon the test error depends on N and on the margin achieved at the nodes, but not on the dimension of the space. This motivates an algorithm, DAGSVM, which operates in a kernel-induced feature space and uses two-class maximal margin hyperplanes at each decision-node of the DDAG. The DAGSVM is substantially faster to train and evaluate thaneither the standard algorithm or Max Wins, while maintaining comparable accuracy to both of these algorithms. 1 Introduction The problem of multiclass classificatIon, especially for systems like SVMs, doesn't present an easy solution. It is generally simpler to construct classifier theory and algorithms for two mutually-exclusive classes than for N mutually-exclusive classes.

algorithm, artificial intelligence, machine learning, (18 more...)

Country: North America > United States (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

Jaakkola, Tommi, Meila, Marina, Jebara, Tony

Maximum Entropy Discrimination

We present a general framework for discriminative estimation based on the maximum entropy principle and its extensions. All calculations involvedistributions over structures and/or parameters rather than specific settings and reduce to relative entropy projections. This holds even when the data is not separable within the chosen parametric class, in the context of anomaly detection rather than classification, or when the labels in the training set are uncertain or incomplete. Support vector machines are naturally subsumed under thisclass and we provide several extensions. We are also able to estimate exactly and efficiently discriminative distributions over tree structures of class-conditional models within this framework.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.16)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.59)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Crisp, David J., Burges, Christopher J. C.

A Geometric Interpretation of v-SVM Classifiers

We show that the recently proposed variant of the Support Vector machine (SVM) algorithm, known as v-SVM, can be interpreted as a maximal separation between subsets of the convex hulls of the data, which we call soft convex hulls. The soft convex hulls are controlled by choice of the parameter v. If the intersection of the convex hulls is empty, the hyperplane is positioned halfway between them such that the distance between convex hulls, measured along the normal, is maximized; and if it is not, the hyperplane's normal is similarly determined by the soft convex hulls, but its position (perpendicular distance from the origin) is adjusted to minimize the error sum. The proposed geometric interpretation of v-SVM also leads to necessary and sufficient conditions for the existence of a choice of v for which the v-SVM solution is nontrivial. 1 Introduction Recently, SchOlkopf et al. [I) introduced a new class of SVM algorithms, called v-SVM, for both regression estimation and pattern recognition. The basic idea is to remove the user-chosen error penalty factor C that appears in SVM algorithms by introducing a new variable p which, in the pattern recognition case, adds another degree of freedom to the margin.

artificial intelligence, convex hull, machine learning, (13 more...)

Country: Oceania > Australia > South Australia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Schölkopf, Bernhard, Williamson, Robert C., Smola, Alex J., Shawe-Taylor, John, Platt, John C.

Support Vector Method for Novelty Detection

Suppose you are given some dataset drawn from an underlying probability distributionP and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified

artificial intelligence, data mining, machine learning, (18 more...)

Country: North America > United States (0.94)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.66)

Vaithyanathan, Shivakumar, Dom, Byron

Generalized Model Selection for Unsupervised Learning in High Dimensions

We then evaluate this scheme (based on marginal likelihood) and one based on cross-validated likelihood.

artificial intelligence, machine learning, selection, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Liao, Yuansong, Moody, John E.

Constructing Heterogeneous Committees Using Input Feature Grouping: Application to Economic Forecasting

Yuansong Liao and John Moody Department of Computer Science, Oregon Graduate Institute, P.O.Box 91000, Portland, OR 97291-1000 Abstract The committee approach has been proposed for reducing model uncertainty and improving generalization performance. The advantage ofcommittees depends on (1) the performance of individual members and (2) the correlational structure of errors between members. This paper presents an input grouping technique for designing aheterogeneous committee. With this technique, all input variables are first grouped based on their mutual information. Statistically similarvariables are assigned to the same group.

artificial intelligence, committee member, machine learning, (17 more...)

Country: North America > United States > Oregon > Multnomah County > Portland (0.24)

Genre: Research Report (0.69)

Industry: Banking & Finance > Economy (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Forecasting (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Learning the Similarity of Documents: An Information-Geometric Approach to Document Retrieval and Categorization

Hofmann, Thomas

The project pursued in this paper is to develop from first information-geometric principles a general method for learning the similarity between text documents.

machine learning, natural language, similarity function, (16 more...)

Country: North America > United States (0.69)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Bartlett, Marian Stewart, Donato, Gianluca, Movellan, Javier R., Hager, Joseph C., Ekman, Paul, Sejnowski, Terrence J.

Image Representations for Facial Expression Coding

The Facial Action Coding System (FACS) (9) is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding ispresently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facialactions in sequences of images. These methods include unsupervised learning techniques for finding basis images such as principal component analysis, independent component analysis and local feature analysis, and supervised learning techniques such as Fisher's linear discriminants.

artificial intelligence, machine learning, representation, (17 more...)

Country:

North America > United States > California > San Diego County (0.15)
North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.46)

Industry: Health & Medicine (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Yang, Ming-Hsuan, Roth, Dan, Ahuja, Narendra

A SNoW-Based Face Detector

A novel learning approach for human face detection using a network of linear units is presented. The SNoW learning architecture is a sparse network of linear functions over a predefined or incrementally learnedfeature space and is specifically tailored for learning in the presence of a very large number of features. A wide range of face images in different poses, with different expressions and under different lighting conditions are used as a training set to capture the variations of human faces. Experimental results on commonly used benchmark data sets of a wide range of face images show that the SNoW-based approach outperforms methods that use neural networks, Bayesian methods, support vector machines and others. Furthermore,learning and evaluation using the SNoW-based method are significantly more efficient than with other methods. 1 Introduction Growing interest in intelligent human computer interactions has motivated a recent surge in research on problems such as face tracking, pose estimation, face expression and gesture recognition. Most methods, however, assume human faces in their input images have been detected and localized.

artificial intelligence, detection, machine learning, (14 more...)

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)