AITopics

0811.0340

Country:

Europe > France (0.28)
North America > United States > California > San Mateo County > Menlo Park (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Jiang, Wenxin, Tanner, Martin A.

Gibbs posterior for variable selection in high-dimensional classification and data mining

arXiv.org Machine Learning Oct-31-2008

In the popular approach of "Bayesian variable selection" (BVS), one uses prior and posterior distributions to select a subset of candidate variables to enter the model. A completely new direction will be considered here to study BVS with a Gibbs posterior originating in statistical mechanics. The Gibbs posterior is constructed from a risk function of practical interest (such as the classification error) and aims at minimizing a risk function without modeling the data probabilistically. This can improve the performance over the usual Bayesian approach, which depends on a probability model which may be misspecified. Conditions will be provided to achieve good risk performance, even in the presence of high dimensionality, when the number of candidate variables "$K$" can be much larger than the sample size "$n$." In addition, we develop a convenient Markov chain Monte Carlo algorithm to implement BVS with the Gibbs posterior.
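
For orientation, a Gibbs posterior is commonly written in the following generic form. This is a sketch under assumed notation, not necessarily the paper's: the inverse-temperature parameter $\lambda > 0$, the prior $\pi$, the classifier $f_\theta$, and the empirical risk $R_n$ are symbols introduced here for illustration and do not appear in the abstract.

```latex
\[
  \pi_\lambda(\theta \mid D_n) \;\propto\; \exp\{-\lambda\, n\, R_n(\theta)\}\, \pi(\theta),
  \qquad
  R_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{\, y_i \neq f_\theta(x_i) \,\}
\]
```

Under this form, parameter values with lower empirical classification error receive exponentially more posterior weight, and no likelihood for the data is needed.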

artificial intelligence, bayesian inference, machine learning, (16 more...)

doi: 10.1214/07-AOS547

0810.5655

Country: North America > United States (0.67)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

arXiv.org Artificial Intelligence Oct-30-2008

A Novel Clustering Algorithm Based on a Modified Model of Random Walk

Li, Qiang, He, Yan, Jiang, Jing-ping

We introduce a modified model of random walk, and then develop two novel clustering algorithms based on it. In the algorithms, each data point in a dataset is considered as a particle which can move at random in space according to the preset rules in the modified model. Further, each data point may also be viewed as a local control subsystem, in which the controller adjusts its transition probability vector according to feedback from all data points, and then its transition direction is identified by an event-generating function. Finally, the positions of all data points are updated. As they move in space, data points gradually gather and separating gaps emerge among them automatically. As a consequence, data points that belong to the same class are located at the same position, whereas those that belong to different classes are far from one another. Moreover, the experimental results demonstrate that data points in the test datasets are clustered reasonably and efficiently, and the comparison with other algorithms also provides an indication of the effectiveness of the proposed algorithms.

artificial intelligence, data mining, machine learning, (14 more...)

0810.5484

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.74)

Hall, Peter, Park, Byeong U., Samworth, Richard J.

Choice of neighbor order in nearest-neighbor classification

arXiv.org Machine Learning Oct-29-2008

The $k$th-nearest neighbor rule is arguably the simplest and most intuitively appealing nonparametric classification procedure. However, application of this method is inhibited by lack of knowledge about its properties, in particular, about the manner in which it is influenced by the value of $k$; and by the absence of techniques for empirical choice of $k$. In the present paper we detail the way in which the value of $k$ determines the misclassification error. We consider two models, Poisson and Binomial, for the training samples. Under the first model, data are recorded in a Poisson stream and are "assigned" to one or other of the two populations in accordance with the prior probabilities. In particular, the total number of data in both training samples is a Poisson-distributed random variable. Under the Binomial model, however, the total number of data in the training samples is fixed, although again each data value is assigned in a random way. Although the values of risk and regret associated with the Poisson and Binomial models are different, they are asymptotically equivalent to first order, and also to the risks associated with kernel-based classifiers that are tailored to the case of two derivatives. These properties motivate new methods for choosing the value of $k$.
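
To make the role of $k$ concrete, the following is a minimal sketch of a $k$-nearest-neighbor classifier together with a generic leave-one-out selection of $k$. It is not the selection method proposed in the paper; the function names (`knn_predict`, `loo_error`, `choose_k`) and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to `query`.

    `train` is a list of (point, label) pairs; points are tuples of floats.
    Ties in the vote go to the label seen first among the nearest neighbors.
    """
    neighbors = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def loo_error(train, k):
    """Leave-one-out misclassification rate of the k-NN rule."""
    errors = 0
    for i, (x, y) in enumerate(train):
        rest = train[:i] + train[i + 1:]  # hold out the i-th point
        if knn_predict(rest, x, k) != y:
            errors += 1
    return errors / len(train)


def choose_k(train, candidates):
    """Pick the candidate k with the smallest leave-one-out error."""
    return min(candidates, key=lambda k: loo_error(train, k))


# Hypothetical two-class toy data: one cluster near (0, 0), one near (1, 1).
toy_train = [
    ((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
    ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1),
]

if __name__ == "__main__":
    print("chosen k:", choose_k(toy_train, [1, 3]))
```

Using odd candidate values of $k$ avoids tied votes in the two-class setting.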

artificial intelligence, classifier, machine learning, (18 more...)

doi: 10.1214/07-AOS537

0810.5276

Country:

North America > United States (0.46)
Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (0.65)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.31)

von Luxburg, Ulrike, Schoelkopf, Bernhard

Statistical Learning Theory: Models, Concepts, and Results

arXiv.org Machine Learning Oct-27-2008

Statistical learning theory provides the theoretical basis for many of today's machine learning algorithms and is arguably one of the most beautifully developed branches of artificial intelligence in general. It originated in Russia in the 1960s and gained wide popularity in the 1990s following the development of the so-called Support Vector Machine (SVM), which has become a standard tool for pattern recognition in a variety of domains ranging from computer vision to computational biology. Providing the basis of new learning algorithms, however, was not the only motivation for developing statistical learning theory. It was just as much a philosophical one, attempting to answer the question of what it is that allows us to draw valid conclusions from empirical data. In this article we attempt to give a gentle, nontechnical overview of the key ideas and insights of statistical learning theory. We do not assume that the reader has a deep background in mathematics, statistics, or computer science. Given the nature of the subject matter, however, some familiarity with mathematical concepts and notations and some intuitive understanding of basic probability is required. There exist many excellent references to more technical surveys of the mathematics of statistical learning theory: the monographs by one of the founders of statistical learning theory (Vapnik, 1995; Vapnik, 1998), a brief overview of statistical learning theory in Section 5 of Schölkopf and Smola (2002), more technical overview papers such as Bousquet et al. (2003), Mendelson (2003), Boucheron et al. (2005), Herbrich and Williamson (2002), and the monograph Devroye et al. (1996).

artificial intelligence, classifier, machine learning, (19 more...)

0810.4752

Country:

Europe (0.87)
North America > United States > Massachusetts (0.28)

Genre:

Research Report (0.63)
Overview (0.47)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Pelossof, Raphael, Jones, Michael, Vovsha, Ilia, Rudin, Cynthia

Online Coordinate Boosting

arXiv.org Machine Learning Oct-24-2008

We present a new online boosting algorithm for adapting the weights of a boosted classifier, which yields a closer approximation to Freund and Schapire's AdaBoost algorithm than previous online boosting algorithms. We also contribute a new way of deriving the online algorithm that ties together previous online boosting work. We assume that the weak hypotheses were selected beforehand, and only their weights are updated during online boosting. The update rule is derived by minimizing AdaBoost's loss when viewed in an incremental form. The equations show that optimization is computationally expensive. However, a fast online approximation is possible. We compare approximation error to batch AdaBoost on synthetic datasets and generalization error on face datasets and the MNIST dataset.
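
As background for the setting described above (fixed weak hypotheses, only their weights updated), here is a minimal sketch of the batch reference point: one coordinate-descent step on AdaBoost's exponential loss. It is not the online update derived in the paper. The names `exp_loss` and `coordinate_step`, the prediction matrix `H`, and the smoothing constant `eps` are assumptions introduced for illustration.

```python
import math


def margins(alphas, H, y):
    """y_i * sum_j alpha_j * h_j(x_i) for every example i."""
    return [y_i * sum(a * h for a, h in zip(alphas, row)) for row, y_i in zip(H, y)]


def exp_loss(alphas, H, y):
    """AdaBoost's exponential loss for fixed weak hypotheses.

    H[i][j] in {-1, +1} is the prediction of weak hypothesis j on example i;
    y[i] in {-1, +1} is the label of example i.
    """
    return sum(math.exp(-m) for m in margins(alphas, H, y))


def coordinate_step(alphas, H, y, j, eps=1e-12):
    """Exactly minimize the exponential loss along coordinate j.

    With example weights w_i = exp(-margin_i), the minimizer adds
    0.5 * ln(W_plus / W_minus) to alpha_j, where W_plus (W_minus) is the total
    weight of examples that hypothesis j classifies correctly (incorrectly).
    `eps` only guards against division by zero.
    """
    w = [math.exp(-m) for m in margins(alphas, H, y)]
    w_plus = sum(w_i for w_i, row, y_i in zip(w, H, y) if row[j] == y_i)
    w_minus = sum(w_i for w_i, row, y_i in zip(w, H, y) if row[j] != y_i)
    updated = list(alphas)
    updated[j] += 0.5 * math.log((w_plus + eps) / (w_minus + eps))
    return updated


# Hypothetical toy problem: 4 examples, 2 weak hypotheses, each wrong once.
toy_y = [1, 1, -1, -1]
toy_H = [[1, 1], [1, -1], [-1, -1], [1, -1]]

if __name__ == "__main__":
    a = coordinate_step([0.0, 0.0], toy_H, toy_y, j=0)
    print(a, exp_loss(a, toy_H, toy_y))
```

Cycling `j` over all hypotheses and repeating until the loss stops decreasing gives a batch baseline against which an online approximation could be compared.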

artificial intelligence, inductive learning, machine learning, (17 more...)

0810.4553

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.30)

Journal of Artificial Intelligence Research Oct-23-2008

On Similarities between Inference in Game Theory and Machine Learning

Rezek, I., Leslie, D. S., Reece, S., Roberts, S. J., Rogers, A., Dash, R. K., Jennings, N. R.

In this paper, we elucidate the equivalence between inference in game theory and machine learning. Our aim in so doing is to establish an equivalent vocabulary between the two domains so as to facilitate developments at the intersection of both fields, and as proof of the usefulness of this approach, we use recent developments in each field to make useful improvements to the other. More specifically, we consider the analogies between smooth best responses in fictitious play and Bayesian inference methods. Initially, we use these insights to develop and demonstrate an improved algorithm for learning in games based on probabilistic moderation. That is, by integrating over the distribution of opponent strategies (a Bayesian approach within machine learning) rather than taking a simple empirical average (the approach used in standard fictitious play) we derive a novel moderated fictitious play algorithm and show that it is more likely than standard fictitious play to converge to a payoff-dominant but risk-dominated Nash equilibrium in a simple coordination game. Furthermore we consider the converse case, and show how insights from game theory can be used to derive two improved mean field variational learning algorithms. We first show that the standard update rule of mean field variational learning is analogous to a Cournot adjustment within game theory. By analogy with fictitious play, we then suggest an improved update rule, and show that this results in fictitious variational play, an improved mean field variational learning algorithm that exhibits better convergence in highly or strongly connected graphical models. Second, we use a recent advance in fictitious play, namely dynamic fictitious play, to derive a derivative action variational learning algorithm, that exhibits superior convergence properties on a canonical machine learning problem (clustering a mixture distribution).

algorithm, fictitious play, variational algorithm, (14 more...)

doi: 10.1613/jair.2523

AI Access Foundation

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Luss, Ronny, d'Aspremont, Alexandre

Clustering and Feature Selection using Sparse Principal Component Analysis

arXiv.org Artificial Intelligence Oct-8-2008

This paper focuses on applications of sparse principal component analysis to clustering and feature selection problems, with a particular focus on gene expression data analysis. Sparse methods have had a significant impact in many areas of statistics, in particular regression and classification (see [CT05], [DT05] and [Vap95] among others). As in these areas, our motivation for developing sparse multivariate visualization tools is the potential of these methods for yielding statistical results that are both more interpretable and more robust than classical analyses, while giving up little statistical efficiency. Principal component analysis (PCA) is a classic tool for analyzing large scale multivariate data. It seeks linear combinations of the data variables (often called factors or principal components) that capture a maximum amount of variance.
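
The variance-maximizing linear combination mentioned above can be illustrated with classical (non-sparse) PCA. The sketch below finds the first principal component by power iteration on the sample covariance matrix. It does not implement the sparse variant studied in the paper, and the function names, the seeded random start, and the toy data are assumptions made for illustration.

```python
import math
import random


def covariance(rows):
    """Sample covariance matrix (n - 1 denominator) of a list of data rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def mat_vec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(m_ab * v_b for m_ab, v_b in zip(row, v)) for row in m]


def first_principal_component(rows, iters=200, seed=0):
    """Return (unit direction, captured variance) of the leading component.

    Power iteration: repeatedly multiply by the covariance matrix and
    renormalize. The random start is almost surely not orthogonal to the
    leading eigenvector; the sign of the returned direction is arbitrary.
    """
    cov = covariance(rows)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in cov]
    for _ in range(iters):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # degenerate case: no variance along v
            break
        v = [x / norm for x in w]
    variance = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return v, variance


# Hypothetical toy data lying exactly on the line x2 = x1.
toy_rows = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]

if __name__ == "__main__":
    print(first_principal_component(toy_rows))
```

A sparse variant would additionally limit how many entries of the direction vector may be nonzero, which is what makes the resulting factors easier to interpret.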

artificial intelligence, eigenvalue, machine learning, (17 more...)

0707.0701

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)

Carter, Kevin M., Raich, Raviv, Hero, Alfred O. III

An Information Geometric Framework for Dimensionality Reduction

arXiv.org Machine Learning Sep-29-2008

This report concerns the problem of dimensionality reduction through information geometric methods on statistical manifolds. While there has been considerable work recently presented regarding dimensionality reduction for the purposes of learning tasks such as classification, clustering, and visualization, these methods have focused primarily on Riemannian manifolds in Euclidean space. While sufficient for many applications, there are many high-dimensional signals which have no straightforward and meaningful Euclidean representation. In these cases, signals may be more appropriately represented as a realization of some distribution lying on a statistical manifold, or a manifold of probability density functions (PDFs). We present a framework for dimensionality reduction that uses information geometry for both statistical manifold reconstruction as well as dimensionality reduction in the data domain.

data mining, machine learning, manifold, (18 more...)

0809.4866

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (1.00)

Xu, Yi-Chun, Xiao, Ren-Bin, Amos, Martyn

Simulated annealing for weighted polygon packing

arXiv.org Artificial Intelligence Sep-29-2008

In this paper we present a new algorithm for a layout optimization problem: the placement of weighted polygons inside a circular container, the two objectives being to minimize imbalance of mass and to minimize the radius of the container. This problem carries real practical significance in industrial applications (such as the design of satellites), as well as being of significant theoretical interest. Previous work has dealt with circular or rectangular objects, but here we deal with the more realistic case where objects may be represented as polygons and the polygons are allowed to rotate. We present a solution based on simulated annealing and first test it on instances with known optima. Our results show that the algorithm obtains container radii that are close to optimal. We also compare our method with existing algorithms for the (special) rectangular case. Experimental results show that our approach outperforms these methods in terms of solution quality.
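
For readers unfamiliar with simulated annealing, the following is a minimal generic sketch of the technique with Metropolis acceptance and geometric cooling. It is not the paper's polygon-packing algorithm: the one-dimensional toy cost, the Gaussian neighbor move, and all parameter values (`t0`, `cooling`, `steps`, `seed`) are assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(cost, neighbor, x0, t0=1.0, cooling=0.999,
                        steps=5000, seed=0):
    """Minimize `cost` starting from `x0`; return (best state, best cost).

    A candidate that lowers the cost is always accepted; a worse candidate is
    accepted with probability exp(-delta / t), where the temperature t shrinks
    by the factor `cooling` after every step.
    """
    rng = random.Random(seed)
    x, fx = x0, cost(x0)
    best_x, best_f = x, fx
    t = t0
    for _ in range(steps):
        cand = neighbor(x, rng)
        fc = cost(cand)
        delta = fc - fx
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
        t *= cooling
    return best_x, best_f


def toy_cost(x):
    """Hypothetical one-dimensional objective with its minimum at x = 3."""
    return (x - 3.0) ** 2


def toy_neighbor(x, rng):
    """Propose a nearby state by adding Gaussian noise."""
    return x + rng.gauss(0.0, 0.5)


if __name__ == "__main__":
    print(simulated_annealing(toy_cost, toy_neighbor, 0.0))
```

In a packing problem the state would hold polygon positions and rotations, and the cost would combine container radius and mass imbalance; only `cost` and `neighbor` would need to change.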

artificial intelligence, atan, machine learning, (18 more...)

0809.5005

Genre: Research Report > New Finding (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.62)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.49)