AITopics | arXiv.org Machine Learning

Plotting

arXiv.org Machine Learning

Classification by Set Cover: The Prototype Vector Machine

arXiv.org Machine LearningAug-17-2009

We introduce a new nearest-prototype classifier, the prototype vector machine (PVM). It arises from a combinatorial optimization problem which we cast as a variant of the set cover problem. We propose two algorithms for approximating its solution. The PVM selects a relatively small number of representative points which can then be used for classification. It contains 1-NN as a special case. The method is compatible with any dissimilarity measure, making it amenable to situations in which the data are not embedded in an underlying feature space or in which using a non-Euclidean metric is desirable. Indeed, we demonstrate on the much studied ZIP code data how the PVM can reap the benefits of a problem-specific metric. In this example, the PVM outperforms the highly successful 1-NN with tangent distance, and does so retaining fewer than half of the data points. This example highlights the strengths of the PVM in yielding a low-error, highly interpretable model. Additionally, we apply the PVM to a protein classification problem in which a kernel-based distance is used.

artificial intelligence, health & medicine, prototype, (17 more...)

arXiv.org Machine Learning

0908.2284

Country: North America > United States > California > Santa Clara County (0.14)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Statistical ranking and combinatorial Hodge theory

Jiang, Xiaoye, Lim, Lek-Heng, Yao, Yuan, Ye, Yinyu

arXiv.org Machine LearningAug-10-2009

We propose a number of techniques for obtaining a global ranking from data that may be incomplete and imbalanced -- characteristics almost universal to modern datasets coming from e-commerce and internet applications. We are primarily interested in score or rating-based cardinal data. From raw ranking data, we construct pairwise rankings, represented as edge flows on an appropriate graph. Our statistical ranking method uses the graph Helmholtzian, the graph theoretic analogue of the Helmholtz operator or vector Laplacian, in much the same way the graph Laplacian is an analogue of the Laplace operator or scalar Laplacian. We study the graph Helmholtzian using combinatorial Hodge theory: we show that every edge flow representing pairwise ranking can be resolved into two orthogonal components, a gradient flow that represents the L2-optimal global ranking and a divergence-free flow (cyclic) that measures the validity of the global ranking obtained -- if this is large, then the data does not have a meaningful global ranking. This divergence-free flow can be further decomposed orthogonally into a curl flow (locally cyclic) and a harmonic flow (locally acyclic but globally cyclic); these provides information on whether inconsistency arises locally or globally. An obvious advantage over the NP-hard Kemeny optimization is that discrete Hodge decomposition may be computed via a linear least squares regression. We also investigated the L1-projection of edge flows, showing that this is dual to correlation maximization over bounded divergence-free flows, and the L1-approximate sparse cyclic ranking, showing that this is dual to correlation maximization over bounded curl-free flows. We discuss relations with Kemeny optimization, Borda count, and Kendall-Smith consistency index from social choice theory and statistics.

artificial intelligence, banking & finance, ranking, (20 more...)

arXiv.org Machine Learning

0811.1067

Country:

North America > United States > California > Santa Clara County (0.14)
North America > United States > California > Alameda County > Berkeley (0.14)

Genre: Research Report > New Finding (0.92)

Industry:

Banking & Finance (1.00)
Leisure & Entertainment (0.94)
Media > Film (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.34)

Add feedback

Discrete Temporal Models of Social Networks

Hanneke, Steve, Fu, Wenjie, Xing, Eric

arXiv.org Machine LearningAug-9-2009

We propose a family of statistical models for social network evolution over time, which represents an extension of Exponential Random Graph Models (ERGMs). Many of the methods for ERGMs are readily adapted for these models, including maximum likelihood estimation algorithms. We discuss models of this type and their properties, and give examples, as well as a demonstration of their use for hypothesis testing and classification. We believe our temporal ERG models represent a useful new framework for modeling time-evolving social networks, and rewiring networks from other domains such as gene regulation circuitry, and communication networks.

artificial intelligence, bayesian inference, statistics, (21 more...)

arXiv.org Machine Learning

0908.1258

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Services (0.82)
Government (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Add feedback

Streamed Learning: One-Pass SVMs

Rai, Piyush, Daumé, Hal III, Venkatasubramanian, Suresh

arXiv.org Machine LearningAug-4-2009

We present a streaming model for large-scale classification (in the context of $\ell_2$-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The $\ell_2$-SVM is known to have an equivalent formulation in terms of the minimum enclosing ball (MEB) problem, and an efficient algorithm based on the idea of \emph{core sets} exists (Core Vector Machine, CVM). CVM learns a $(1+\varepsilon)$-approximate MEB for a set of points and yields an approximate solution to corresponding SVM instance. However CVM works in batch mode requiring multiple passes over the data. This paper presents a single-pass SVM which is based on the minimum enclosing ball of streaming data. We show that the MEB updates for the streaming case can be easily adapted to learn the SVM weight vector in a way similar to using online stochastic gradient updates. Our algorithm performs polylogarithmic computation at each example, and requires very small and constant storage. Experimental results show that, even in such restrictive settings, we can learn efficiently in just one pass and get accuracies comparable to other state-of-the-art SVM solvers (batch and online). We also give an analysis of the algorithm, and discuss some open issues and possible extensions.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Machine Learning

0908.0572

Country: North America > United States (0.49)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.87)

Add feedback

The Infinite Hierarchical Factor Regression Model

Rai, Piyush, Daumé, Hal III

arXiv.org Machine LearningAug-4-2009

We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors, and the relationship between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman's coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.

health & medicine, matrix, oncology, (16 more...)

arXiv.org Machine Learning

0908.0570

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area (0.50)
Health & Medicine > Pharmaceuticals & Biotechnology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.85)

Add feedback

How the initialization affects the stability of the k-means algorithm

Bubeck, Sebastien, Meila, Marina, von Luxburg, Ulrike

arXiv.org Machine LearningJul-31-2009

We investigate the role of the initialization for the stability of the k-means clustering algorithm. As opposed to other papers, we consider the actual k-means algorithm and do not ignore its property of getting stuck in local optima. We are interested in the actual clustering, not only in the costs of the solution. We analyze when different initializations lead to the same local optimum, and when they lead to different local optima. This enables us to prove that it is reasonable to select the number of clusters based on stability scores.

artificial intelligence, k-means algorithm, machine learning, (18 more...)

arXiv.org Machine Learning

0907.5494

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.89)

Add feedback

PDE-Foam - a probability-density estimation method using self-adapting phase-space binning

Dannheim, Dominik, Carli, Tancredi, Grahn, Karl-Johan, Speckmayer, Peter, Voigt, Alexander

arXiv.org Machine LearningJul-22-2009

Probability Density Estimation (PDE) is a multivariate discrimination technique based on sampling signal and background densities defined by event samples from data or Monte-Carlo (MC) simulations in a multi-dimensional phase space. In this paper, we present a modification of the PDE method that uses a self-adapting binning method to divide the multi-dimensional phase space in a finite number of hyper-rectangles (cells). The binning algorithm adjusts the size and position of a predefined number of cells inside the multi-dimensional phase space, minimising the variance of the signal and background densities inside the cells. The implementation of the binning algorithm PDE-Foam is based on the MC event-generation package Foam. We present performance results for representative examples (toy models) and discuss the dependence of the obtained results on the choice of parameters. The new PDE-Foam shows improved classification capability for small training samples and reduced classification time compared to the original PDE method based on range searching.

artificial intelligence, foam, training sample, (16 more...)

arXiv.org Machine Learning

doi: 10.1016/j.nima.2009.05.028

0812.0922

Country:

North America > United States (0.14)
Europe > Switzerland (0.14)
Europe > Sweden (0.14)
Europe > Germany (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.70)

Add feedback

Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks

Hausser, Jean, Strimmer, Korbinian

arXiv.org Machine LearningJul-22-2009

We present a procedure for effective estimation of entropy and mutual information from small-sample data, and apply it to the problem of inferring high-dimensional gene association networks. Specifically, we develop a James-Stein-type shrinkage estimator, resulting in a procedure that is highly efficient statistically as well as computationally. Despite its simplicity, we show that it outperforms eight other entropy estimation procedures across a diverse range of sampling scenarios and data-generating models, even in cases of severe undersampling. We illustrate the approach by analyzing E. coli gene expression data and computing an entropy-based gene-association network from gene expression data. A computer program is available that implements the proposed shrinkage estimator.

bayesian inference, estimator, health & medicine, (18 more...)

arXiv.org Machine Learning

0811.3579

Country:

North America > United States (0.68)
Europe > Germany (0.46)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Add feedback

Empirical Bernstein Bounds and Sample Variance Penalization

Maurer, Andreas, Pontil, Massimiliano

arXiv.org Machine LearningJul-21-2009

We give improved constants for data dependent and variance sensitive confidence bounds, called empirical Bernstein bounds, and extend these inequalities to hold uniformly over classes of functionswhose growth function is polynomial in the sample size n. The bounds lead us to consider sample variance penalization, a novel learning method which takes into account the empirical variance of the loss function. We give conditions under which sample variance penalization is effective. In particular, we present a bound on the excess risk incurred by the method. Using this, we argue that there are situations in which the excess risk of our method is of order 1/n, while the excess risk of empirical risk minimization is of order 1/sqrt/{n}. We show some experimental results, which confirm the theory. Finally, we discuss the potential application of our results to sample compression schemes.

artificial intelligence, hypothesis, machine learning, (14 more...)

arXiv.org Machine Learning

0907.3740

Country: North America > United States > California > Santa Cruz County > Santa Cruz (0.14)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Inter Genre Similarity Modelling For Automatic Music Genre Classification

Bagci, Ulas, Erzin, Engin

arXiv.org Machine LearningJul-18-2009

Music genre classification is an essential tool for music information retrieval systems and it has been finding critical applications in various media platforms. Two important problems of the automatic music genre classification are feature extraction and classifier design. This paper investigates inter-genre similarity modelling (IGS) to improve the performance of automatic music genre classification. Inter-genre similarity information is extracted over the mis-classified feature population. Once the inter-genre similarity is modelled, elimination of the inter-genre similarity reduces the inter-genre confusion and improves the identification rates. Inter-genre similarity modelling is further improved with iterative IGS modelling(IIGS) and score modelling for IGS elimination(SMIGS). Experimental results with promising classification improvements are provided.

artificial intelligence, genre classification, natural language, (19 more...)

arXiv.org Machine Learning

0907.3220

Country: Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.90)

Add feedback