AITopics | Technology

Nearest neighbor classification assumes locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with finite samples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. We propose a locally adaptive nearest neighbor classification method to try to minimize bias. We use a Chi-squared distance analysis to compute a flexible metric for producing neighborhoods that are elongated along less relevant feature dimensions and constricted along the most influential ones. As a result, the class conditional probabilities tend to be smoother in the modified neighborhoods, whereby better classification performance can be achieved. The efficacy of our method is validated and compared against other techniques using a variety of real world data.

error rate, neighborhood, probability, (15 more...)

Country:

North America > United States > Oklahoma > Payne County > Stillwater (0.14)
North America > United States > California > Riverside County > Riverside (0.14)
North America > United States > South Carolina > Beaufort County > Hilton Head Island (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Dayan, Peter, Kakade, Sham

Explaining Away in Weight Space

Explaining away has mostly been considered in terms of inference of states in belief networks. We show how it can also arise in a Bayesian context in inference about the weights governing relationships such as those between stimuli and reinforcers in conditioning experiments such as backward blocking. We show how explaining away in weight space can be accounted for using an extension of a Kalman filter model; provide a new approximate way of looking at the Kalman gain matrix as a whitener for the correlation matrix of the observation process; suggest a network implementation of this whitener using an architecture due to Goodall; and show that the resulting model exhibits backward blocking.
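The backward blocking effect described in this abstract can be illustrated with a minimal sketch, assuming a plain Kalman filter over two static stimulus weights with no weight drift. That is a simplification and not the extended model the authors describe. The function names, the unit prior covariance, the noise variance, and the trial counts are all illustrative assumptions.

```python
def kalman_update(w, cov, x, r, noise_var):
    """One Kalman filter step for the linear model r = w . x + noise."""
    n = len(w)
    # cov_x = Sigma x, reused for the gain and the covariance update.
    cov_x = [sum(cov[i][j] * x[j] for j in range(n)) for i in range(n)]
    innovation_var = sum(x[i] * cov_x[i] for i in range(n)) + noise_var
    gain = [c / innovation_var for c in cov_x]
    error = r - sum(w[i] * x[i] for i in range(n))
    new_w = [w[i] + gain[i] * error for i in range(n)]
    new_cov = [[cov[i][j] - gain[i] * cov_x[j] for j in range(n)]
               for i in range(n)]
    return new_w, new_cov


def run_backward_blocking(trials_per_phase=10, noise_var=0.1):
    """Phase 1 pairs stimuli A and B with reward; phase 2 presents A alone."""
    w = [0.0, 0.0]                      # weights for stimuli A and B
    cov = [[1.0, 0.0], [0.0, 1.0]]      # assumed prior covariance
    for _ in range(trials_per_phase):   # phase 1: A + B -> reward
        w, cov = kalman_update(w, cov, [1.0, 1.0], 1.0, noise_var)
    after_phase1 = list(w)
    for _ in range(trials_per_phase):   # phase 2: A alone -> reward
        w, cov = kalman_update(w, cov, [1.0, 0.0], 1.0, noise_var)
    return after_phase1, w


if __name__ == "__main__":
    phase1, phase2 = run_backward_blocking()
    print("after phase 1:", phase1)
    print("after phase 2:", phase2)
```

Phase 1 leaves the two weights negatively correlated in the posterior covariance, so the positive prediction error on A-alone trials in phase 2 pushes the weight for B down even though B is never presented.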

conditioning, covariance matrix, kalman filter, (14 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Cohn, David A., Hofmann, Thomas

The Missing Link - A Probabilistic Model of Document Content and Hypertext Connectivity

We describe a joint probabilistic model for modeling the contents and inter-connectivity of document collections such as sets of web pages or research paper archives. The model is based on a probabilistic factor decomposition and allows identifying principal topics of the collection as well as authoritative documents within those topics. Furthermore, the relationships between topics are mapped out in order to build a predictive model of link content. Among the many applications of this approach are information retrieval and search, topic identification, query disambiguation, focused web crawling, web authoring, and bibliometric analysis.

joint model, probability, reference flow, (14 more...)

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Chen, Scott Saobing, Gopinath, Ramesh A.

Gaussianization

High dimensional data modeling is difficult mainly because of the so-called "curse of dimensionality". We propose a technique called "Gaussianization" for high dimensional density estimation, which alleviates the curse of dimensionality by exploiting the independence structures in the data. Gaussianization is motivated by recent developments in the statistics literature: projection pursuit, independent component analysis and Gaussian mixture models with semi-tied covariances. We propose an iterative Gaussianization procedure which converges weakly: at each iteration, the data is first transformed to the least dependent coordinates and then each coordinate is marginally Gaussianized by univariate techniques. Gaussianization offers density estimation sharper than traditional kernel methods and radial basis function methods.
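The second half of each iteration, marginally Gaussianizing one coordinate, can be sketched as follows. This sketch assumes a rank-based empirical CDF followed by the inverse standard normal CDF. The abstract says only "univariate techniques", so this choice and the name `marginal_gaussianize` are illustrative assumptions rather than the authors' procedure. The first half of the iteration (finding the least dependent coordinates) is not shown.

```python
from statistics import NormalDist


def marginal_gaussianize(values):
    """Map a 1-D sample to approximately standard-normal scores.

    Assumption: ranks stand in for the unknown marginal CDF. Ties are
    broken by input order, which is arbitrary.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the empirical CDF strictly inside (0, 1),
        # so the inverse CDF is always finite.
        scores[i] = std_normal.inv_cdf((rank + 0.5) / n)
    return scores


if __name__ == "__main__":
    skewed = [10.0, 1.0, 100.0, 3.0, 2.0]
    print(marginal_gaussianize(skewed))
```

The transform preserves the ordering of the sample while replacing heavily skewed values with scores spread like a standard normal.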

algorithm, gaussianization, iteration, (13 more...)

Country: North America > United States > New York > Suffolk County > East Setauket (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)

Chapelle, Olivier, Weston, Jason, Bottou, Léon, Vapnik, Vladimir

Vicinal Risk Minimization

The Vicinal Risk Minimization principle establishes a bridge between generative models and methods derived from the Structural Risk Minimization Principle such as Support Vector Machines or Statistical Regularization. We explain how VRM provides a framework which integrates a number of existing algorithms, such as Parzen windows, Support Vector Machines, Ridge Regression, Constrained Logistic Classifiers and Tangent-Prop. We then show how the approach implies new algorithms for solving problems usually associated with generative models. New algorithms are described for dealing with pattern recognition problems with very different pattern distributions and dealing with unlabeled data. Preliminary empirical results are presented.

algorithm, logistic classifier, vrm, (11 more...)

Country:

North America > United States > Wisconsin (0.05)
North America > United States > New York (0.04)
North America > United States > Georgia > Chatham County > Savannah (0.04)
North America > United States > Colorado > Denver County > Denver (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Cauwenberghs, Gert, Poggio, Tomaso

Incremental and Decremental Support Vector Machine Learning

An online recursive algorithm for training support vector machines, one vector at a time, is presented. Adiabatic increments retain the Kuhn-Tucker conditions on all previously seen training data, in a number of steps each computed analytically. The incremental procedure is reversible, and decremental "unlearning" offers an efficient method to exactly evaluate leave-one-out generalization performance.

error vector, generalization performance, vector, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.16)
North America > United States > New York (0.04)
North America > United States > Maryland > Baltimore (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Campbell, Colin, Bennett, Kristin P.

A Linear Programming Approach to Novelty Detection

Novelty detection involves modeling the normal behaviour of a system, hence enabling detection of any divergence from normality. It has potential applications in many areas such as detection of machine damage or highlighting abnormal features in medical data. One approach is to build a hypothesis estimating the support of the normal data, i.e., constructing a function which is positive in the region where the data is located and negative elsewhere. Recently kernel methods have been proposed for estimating the support of a distribution and they have performed well in practice; training involves solving a quadratic programming problem. In this paper we propose a simpler kernel method for estimating the support based on linear programming. The method is easy to implement and can learn large datasets rapidly. We demonstrate the method on medical and fault detection datasets.

detection, feature space, input space, (11 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.31)

Cadez, Igor V., Smyth, Padhraic

Model Complexity, Goodness of Fit and Diminishing Returns

Such learning tasks can typically be characterized by the existence of a model and a loss function. A fitted model of complexity k is a function of the data points D and depends on a specific set of fitted parameters B. The loss function (goodness-of-fit) is a functional of the model and maps each specific model to a scalar used to evaluate the model, e.g., likelihood for density estimation or sum-of-squares for regression. Figure 1 illustrates a typical empirical curve for loss function versus complexity, for mixtures of Markov models fitted to a large data set of 900,000 sequences. The complexity k is the number of Markov models being used in the mixture (see Cadez et al. (2000) for further details on the model and the data set). The empirical curve has a distinctly concave appearance, with large relative gains in fit for low complexity models and much more modest relative gains for high complexity models.
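The "diminishing returns" shape described here can be checked numerically with a small sketch. It assumes the goodness-of-fit score is oriented so that larger is better (for example, a log-likelihood) and is indexed by complexity k = 1, 2, .... The function names and the synthetic logarithmic curve are illustrative assumptions and are not the data behind Figure 1.

```python
import math


def marginal_gains(fit_by_complexity):
    """Gain in fit from each unit increase in complexity."""
    return [b - a for a, b in zip(fit_by_complexity, fit_by_complexity[1:])]


def has_diminishing_returns(fit_by_complexity, tol=1e-12):
    """True if each successive gain is no larger than the previous one,
    i.e. the fit-versus-complexity curve is concave on the sampled points."""
    gains = marginal_gains(fit_by_complexity)
    return all(later <= earlier + tol
               for earlier, later in zip(gains, gains[1:]))


# Synthetic concave curve, used only to exercise the check.
synthetic_fit = [math.log(k) for k in range(1, 8)]

if __name__ == "__main__":
    print(marginal_gains(synthetic_fit))
    print(has_diminishing_returns(synthetic_fit))
```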

linear regression, loss function, regression, (12 more...)

Country:

North America > United States > California > Orange County > Irvine (0.14)
North America > United States > Washington > King County > Redmond (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)

Brown, Timothy X.

Direct Classification with Indirect Data

Suppose there exists an unknown real-valued property of the feature space, $p(\phi)$, that maps from the feature space, $\phi \in \mathbb{R}^n$, to $\mathbb{R}$. The property function and a positive set $A \subset$ ...

classifier, consistent estimator, property function, (12 more...)

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.05)
North America > United States > Colorado > Boulder County > Boulder (0.04)

Industry: Telecommunications (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.33)

Bhattacharyya, Chiranjib, Keerthi, S. Sathiya

A Variational Mean-Field Theory for Sigmoidal Belief Networks

In this paper we discuss a variational mean-field theory and its application to belief networks (BNs), sigmoidal BNs in particular. We present a variational derivation of the mean-field theory proposed by Plefka [2].

approximation, mean-field theory, plefka, (15 more...)

Country:

Asia > Middle East > Jordan (0.06)
Asia > Singapore (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.43)