arXiv.org Machine Learning
Online Coordinate Boosting
Pelossof, Raphael, Jones, Michael, Vovsha, Ilia, Rudin, Cynthia
We present a new online boosting algorithm for adapting the weights of a boosted classifier, which yields a closer approximation to Freund and Schapire's AdaBoost algorithm than previous online boosting algorithms. We also contribute a new way of deriving the online algorithm that ties together previous online boosting work. We assume that the weak hypotheses were selected beforehand, and only their weights are updated during online boosting. The update rule is derived by minimizing AdaBoost's loss when viewed in an incremental form. The resulting equations show that exact optimization is computationally expensive; however, a fast online approximation is possible. We compare the algorithm's approximation error to batch AdaBoost on synthetic datasets, and its generalization error on face datasets and the MNIST dataset.
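As a rough illustration of the setting (not the authors' exact update rule), the sketch below keeps a fixed set of weak hypotheses and adapts their weights one example at a time, in the spirit of minimizing AdaBoost's exponential loss incrementally; all class and function names are our own stand-ins.

    import numpy as np

    def make_stumps(thresholds, feature=0):
        """Decision stumps h(x) in {-1, +1}; stand-ins for pre-selected weak hypotheses."""
        return [lambda x, th=th: 1.0 if x[feature] > th else -1.0 for th in thresholds]

    class OnlineBoostedWeights:
        """Maintains AdaBoost-style weights alpha_t for fixed weak hypotheses,
        updated one example at a time (illustrative only)."""
        def __init__(self, weak_hyps, eps=1e-10):
            self.h = weak_hyps
            self.W_right = np.full(len(weak_hyps), eps)   # weighted correct mass
            self.W_wrong = np.full(len(weak_hyps), eps)   # weighted error mass
            self.alpha = np.zeros(len(weak_hyps))

        def update(self, x, y):
            d = 1.0                                       # example weight at stage 0
            for t, h in enumerate(self.h):
                pred = h(x)
                if pred == y:
                    self.W_right[t] += d
                else:
                    self.W_wrong[t] += d
                self.alpha[t] = 0.5 * np.log(self.W_right[t] / self.W_wrong[t])
                d *= np.exp(-self.alpha[t] * y * pred)    # reweight example for next stage

        def predict(self, x):
            score = sum(a * h(x) for a, h in zip(self.alpha, self.h))
            return 1.0 if score >= 0 else -1.0

    # Streaming usage: labelled examples (x, y) with y in {-1, +1} arrive one at a time.
    booster = OnlineBoostedWeights(make_stumps(np.linspace(-1, 1, 5)))
    rng = np.random.default_rng(0)
    for _ in range(1000):
        x = rng.uniform(-1, 1, size=1)
        y = 1.0 if x[0] > 0.1 else -1.0
        booster.update(x, y)
    print(np.round(booster.alpha, 2))   # the stump nearest the true boundary gets the largest weight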
An Information Geometric Framework for Dimensionality Reduction
Carter, Kevin M., Raich, Raviv, Hero, Alfred O. III
This report concerns the problem of dimensionality reduction through information geometric methods on statistical manifolds. While there has been considerable recent work on dimensionality reduction for learning tasks such as classification, clustering, and visualization, these methods have focused primarily on Riemannian manifolds in Euclidean space. While sufficient for many applications, there are many high-dimensional signals that have no straightforward and meaningful Euclidean representation. In these cases, signals may be more appropriately represented as realizations of some distribution lying on a statistical manifold, that is, a manifold of probability density functions (PDFs). We present a framework for dimensionality reduction that uses information geometry both for statistical manifold reconstruction and for dimensionality reduction in the data domain.
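A minimal sketch of the general idea, not the authors' specific algorithm: treat each observation as a PDF, measure pairwise divergences between the estimated densities, and embed the resulting dissimilarity matrix in a low-dimensional Euclidean space. The histogram estimator, the Hellinger distance, and classical MDS used below are our own simplifying assumptions.

    import numpy as np

    def hellinger(p, q):
        """Hellinger distance between two discrete PDFs (histograms summing to 1)."""
        return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

    def classical_mds(D, dim=2):
        """Embed a dissimilarity matrix D into `dim` Euclidean coordinates."""
        n = D.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (D ** 2) @ J                 # double-centred squared distances
        vals, vecs = np.linalg.eigh(B)
        idx = np.argsort(vals)[::-1][:dim]
        return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

    # Each "signal" is summarised by a histogram estimate of its PDF (a point on a
    # statistical manifold); the embedding is computed from pairwise distances.
    rng = np.random.default_rng(0)
    signals = [rng.normal(loc=mu, size=500) for mu in np.linspace(0.0, 3.0, 8)]
    bins = np.linspace(-4.0, 7.0, 40)
    pdfs = []
    for s in signals:
        h = np.histogram(s, bins=bins)[0].astype(float)
        pdfs.append(h / h.sum())
    D = np.array([[hellinger(p, q) for q in pdfs] for p in pdfs])
    coords = classical_mds(D, dim=2)
    print(coords.shape)   # (8, 2)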
Clustering of discretely observed diffusion processes
De Gregorio, Alessandro, Iacus, Stefano Maria
In this paper a new dissimilarity measure for identifying groups of assets with similar dynamics is proposed. The underlying generating process is assumed to be a diffusion process, i.e. the solution of a stochastic differential equation, observed at discrete times. The mesh of observations is not required to shrink to zero. As the distance between two observed paths, the quadratic distance between the corresponding estimated Markov operators is used. Analysis of both synthetic data and real financial data from NYSE/NASDAQ stocks gives evidence that this distance captures differences in both the drift and diffusion coefficients, unlike other commonly used metrics.
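A rough sketch of the flavour of this approach, not the authors' estimator: for each observed path, estimate a discretised one-step transition kernel from consecutive observations, take the quadratic (L2) distance between these estimated operators as the dissimilarity, and feed the resulting matrix to an off-the-shelf hierarchical clustering routine. The state binning, the simulated dynamics, and the linkage choice below are our own assumptions.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from scipy.spatial.distance import squareform

    def transition_matrix(path, edges):
        """Empirical one-step transition matrix of a discretely observed path."""
        k = len(edges) - 1
        states = np.clip(np.digitize(path, edges) - 1, 0, k - 1)
        P = np.zeros((k, k))
        for a, b in zip(states[:-1], states[1:]):
            P[a, b] += 1.0
        row = P.sum(axis=1, keepdims=True)
        return np.divide(P, row, out=np.full_like(P, 1.0 / k), where=row > 0)

    def markov_distance(P, Q):
        """Quadratic (Frobenius) distance between two estimated operators."""
        return np.linalg.norm(P - Q)

    # Simulated mean-reverting paths with two different reversion speeds.
    rng = np.random.default_rng(1)
    def simulate(theta, n=2000, dt=0.1):
        x = np.zeros(n)
        for i in range(1, n):
            x[i] = x[i-1] - theta * x[i-1] * dt + np.sqrt(dt) * rng.normal()
        return x

    paths = [simulate(0.2) for _ in range(5)] + [simulate(2.0) for _ in range(5)]
    edges = np.linspace(-3, 3, 21)
    ops = [transition_matrix(p, edges) for p in paths]
    D = np.array([[markov_distance(P, Q) for Q in ops] for P in ops])
    labels = fcluster(linkage(squareform(D, checks=False), method="average"),
                      2, criterion="maxclust")
    print(labels)   # the two groups of dynamics should typically separate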
Finding rare objects and building pure samples: Probabilistic quasar classification from low resolution Gaia spectra
Bailer-Jones, C. A. L., Smith, K. W., Tiede, C., Sordo, R., Vallenari, A.
We develop and demonstrate a probabilistic method for classifying rare objects in surveys with the particular goal of building very pure samples. It works by modifying the output probabilities from a classifier so as to accommodate our expectation (priors) concerning the relative frequencies of different classes of objects. We demonstrate our method using the Discrete Source Classifier (DSC), a supervised classifier currently based on Support Vector Machines, which we are developing in preparation for the Gaia data analysis. DSC classifies objects using their very low resolution optical spectra. We look in detail at the problem of quasar classification, because identification of a pure quasar sample is necessary to define the Gaia astrometric reference frame. By varying a posterior probability threshold in DSC we can trade off sample completeness and contamination. We show, using our simulated data, that it is possible to achieve a pure sample of quasars (upper limit on contamination of 1 in 40,000) with a completeness of 65% at magnitude G=18.5, and 50% at G=20.0, even when quasars have a frequency of only 1 in every 2000 objects. The star sample completeness is simultaneously 99% with a contamination of 0.7%. Including parallax and proper motion in the classifier barely changes the results. We further show that not accounting for class priors in the target population leads to serious misclassifications and poor predictions for sample completeness and contamination. (Truncated)
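The central adjustment described here, re-weighting a classifier's output probabilities by the expected class frequencies in the target population, can be sketched in a few lines; the uniform training prior assumed below and the function names are ours, not those of the DSC pipeline.

    import numpy as np

    def adjust_posteriors(p_model, prior_train, prior_target):
        """Re-weight classifier posteriors P(class | x) for new class priors.

        p_model: (n_objects, n_classes) probabilities from the trained classifier,
                 which implicitly assumed class frequencies `prior_train`.
        prior_target: expected class frequencies in the survey (e.g. quasars rare).
        """
        w = np.asarray(prior_target) / np.asarray(prior_train)
        p = p_model * w                          # Bayes re-weighting, up to normalisation
        return p / p.sum(axis=1, keepdims=True)

    # Toy example: classes = (quasar, star); classifier trained on a balanced set,
    # but quasars assumed to occur only once per 2000 objects in the survey.
    p_model = np.array([[0.90, 0.10],            # looks like a confident quasar...
                        [0.55, 0.45]])
    p_adj = adjust_posteriors(p_model, prior_train=[0.5, 0.5],
                              prior_target=[1/2000, 1999/2000])
    print(p_adj.round(4))                        # quasar probabilities drop sharply

    # A pure sample is then built by thresholding the adjusted posterior:
    threshold = 0.99
    is_quasar = p_adj[:, 0] > threshold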
Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
For supervised and unsupervised learning, positive definite kernels make it possible to use large and potentially infinite-dimensional feature spaces with a computational cost that depends only on the number of observations. This is usually done through the penalization of predictor functions by Euclidean or Hilbertian norms. In this paper, we explore penalization by sparsity-inducing norms such as the l1-norm or the block l1-norm. We assume that the kernel decomposes into a large sum of individual basis kernels which can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a hierarchical multiple kernel learning framework, in polynomial time in the number of selected kernels. This framework is naturally applied to nonlinear variable selection; our extensive simulations on synthetic datasets and datasets from the UCI repository show that efficiently exploring the large feature space through sparsity-inducing norms leads to state-of-the-art predictive performance.
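As a loose illustration of selection with a block l1 (group-lasso) penalty, much simpler than the kernel-based, DAG-structured hierarchy of the paper: build one explicit feature block per input variable as a crude finite-dimensional stand-in for a basis kernel, and run proximal gradient descent with block soft-thresholding. All data, features, and parameter values below are our own assumptions.

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 200, 8
    X = rng.normal(size=(n, p))
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=n)   # 2 relevant variables

    # Explicit feature block per variable: [x, x^2, sin x].
    blocks = [np.column_stack([X[:, v], X[:, v] ** 2, np.sin(X[:, v])]) for v in range(p)]
    Phi = np.column_stack(blocks)
    groups = [slice(3 * v, 3 * v + 3) for v in range(p)]

    # Proximal gradient (ISTA) for least squares + block l1 penalty.
    w = np.zeros(Phi.shape[1])
    L = np.linalg.eigvalsh(Phi.T @ Phi).max()      # Lipschitz constant of the gradient
    step, mu = 1.0 / L, 25.0
    for _ in range(500):
        w = w - step * Phi.T @ (Phi @ w - y)       # gradient step on the squared loss
        for g in groups:                           # block soft-thresholding
            norm = np.linalg.norm(w[g])
            w[g] = 0.0 if norm == 0 else max(0.0, 1.0 - step * mu / norm) * w[g]

    selected = [v for v, g in enumerate(groups) if np.linalg.norm(w[g]) > 1e-8]
    print(selected)    # blocks for irrelevant variables are typically driven to zero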
Predictive Hypothesis Identification
While statistics focusses on hypothesis testing and on estimating (properties of) the true sampling distribution, in machine learning the performance of learning algorithms on future data is the primary issue. In this paper we bridge the gap with a general principle (PHI) that identifies hypotheses with best predictive performance. This includes predictive point and interval estimation, simple and composite hypothesis testing, (mixture) model selection, and others as special cases. For concrete instantiations we will recover well-known methods, variations thereof, and new ones. PHI nicely justifies, reconciles, and blends (a reparametrization invariant variation of) MAP, ML, MDL, and moment estimation. One particular feature of PHI is that it can genuinely deal with nested hypotheses.
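A toy illustration of the general idea of ranking hypotheses by predictive performance rather than by fit to the observed data; this is our own simplification, not the paper's PHI criterion. Each candidate point hypothesis is scored by its expected log-likelihood on future data drawn from the posterior predictive distribution.

    import numpy as np

    # Observed data: coin flips.  Candidate point hypotheses about the head probability.
    rng = np.random.default_rng(3)
    data = rng.binomial(1, 0.72, size=100)
    hypotheses = {"theta=0.3": 0.3, "theta=0.5": 0.5, "theta=0.7": 0.7}

    # Posterior over theta under a uniform Beta(1, 1) prior.
    a, b = 1 + data.sum(), 1 + len(data) - data.sum()

    def expected_predictive_loglik(theta, a, b, n_mc=20000):
        """Monte Carlo estimate of E[log p(future flip | theta)], with the future
        flip drawn from the posterior predictive distribution."""
        theta_post = rng.beta(a, b, size=n_mc)
        x_future = rng.binomial(1, theta_post)
        return np.mean(x_future * np.log(theta) + (1 - x_future) * np.log(1 - theta))

    scores = {name: expected_predictive_loglik(th, a, b) for name, th in hypotheses.items()}
    best = max(scores, key=scores.get)
    print(scores, "->", best)   # theta=0.7, closest to the generating value, should score highest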
Necessary and Sufficient Conditions for Success of the Nuclear Norm Heuristic for Rank Minimization
Recht, Benjamin, Xu, Weiyu, Hassibi, Babak
Minimizing the rank of a matrix subject to constraints is a challenging problem that arises in many applications in control theory, machine learning, and discrete geometry. This class of optimization problems, known as rank minimization, is NP-hard, and for most practical problems there are no efficient algorithms that yield exact solutions. A popular heuristic replaces the rank function with the nuclear norm (the sum of the singular values) of the decision variable. In this paper, we provide a necessary and sufficient condition that quantifies when this heuristic successfully finds the minimum-rank solution of a linear constraint set. We additionally provide a probability distribution over instances of the affine rank minimization problem such that instances sampled from this distribution satisfy our conditions for success with overwhelming probability, provided the number of constraints is appropriately large. Finally, we give empirical evidence that these probabilistic bounds provide accurate predictions of the heuristic's performance in non-asymptotic scenarios.
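A small sketch of the heuristic itself, using the cvxpy modelling library as an assumed dependency and random Gaussian measurements as a stand-in for the paper's constraint ensembles: recover a low-rank matrix from linear measurements by minimizing the nuclear norm subject to the affine constraints.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(4)
    n, r, m = 10, 2, 70                     # matrix size, true rank, number of measurements

    # Random rank-r ground truth and random Gaussian linear measurements <A_i, X> = b_i.
    X_true = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))
    A = [rng.normal(size=(n, n)) for _ in range(m)]
    b = np.array([np.sum(Ai * X_true) for Ai in A])

    # Nuclear norm heuristic: minimize the sum of singular values under the constraints.
    X = cp.Variable((n, n))
    constraints = [cp.sum(cp.multiply(Ai, X)) == bi for Ai, bi in zip(A, b)]
    prob = cp.Problem(cp.Minimize(cp.normNuc(X)), constraints)
    prob.solve()

    err = np.linalg.norm(X.value - X_true) / np.linalg.norm(X_true)
    print(f"relative recovery error: {err:.2e}")   # typically small when m comfortably exceeds r(2n-r)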
From Data to the p-Adic or Ultrametric Model
We model anomaly and change in data by embedding the data in an ultrametric space. Taking our initial data as cross-tabulation counts (or other input data formats), Correspondence Analysis allows us to endow the information space with a Euclidean metric. We then model anomaly or change by an induced ultrametric. The induced ultrametric that we are particularly interested in takes a sequential, e.g. temporal, ordering of the data into account. We apply this work to the flow of narrative expressed in the film script of the movie Casablanca, and to the evolution of the Colombian social conflict and violence between 1988 and 2004.
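As a minimal illustration of inducing an ultrametric on data, using ordinary hierarchical clustering in place of the Correspondence Analysis pipeline and the sequence-respecting construction described above: cluster the row profiles and read off the cophenetic distances, which satisfy the ultrametric inequality. The toy count data below are our own.

    import numpy as np
    from scipy.cluster.hierarchy import linkage, cophenet
    from scipy.spatial.distance import pdist, squareform

    # Toy "scene" profiles (rows) described by attribute counts (columns).
    rng = np.random.default_rng(5)
    counts = rng.poisson(3.0, size=(8, 5)).astype(float)
    profiles = counts / counts.sum(axis=1, keepdims=True)   # row profiles, as in CA

    d = pdist(profiles)                          # Euclidean distances between profiles
    Z = linkage(d, method="complete")            # agglomerative hierarchy
    ultra = squareform(cophenet(Z))              # cophenetic distances form an ultrametric

    # Check the strong (ultrametric) triangle inequality d(i,k) <= max(d(i,j), d(j,k)).
    n = ultra.shape[0]
    ok = all(ultra[i, k] <= max(ultra[i, j], ultra[j, k]) + 1e-12
             for i in range(n) for j in range(n) for k in range(n))
    print(ok)   # True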
Uncertainty quantification in complex systems using approximate solvers
Koutsourelakis, Phaedon-Stelios
This paper proposes a novel uncertainty quantification framework for computationally demanding systems characterized by a large vector of non-Gaussian uncertainties. It combines state-of-the-art techniques in advanced Monte Carlo sampling with Bayesian formulations. The key departure from existing work is the rigorous use of inexpensive, approximate computational models. Such models can readily be derived by coarsening the discretization size in the solution of the governing PDEs, increasing the time step when integration of ODEs is performed, using fewer iterations if a non-linear solver is employed, or making use of lower-order models. It is shown that even in cases where the inexact models provide very poor approximations of the exact response, statistics of the latter can be quantified accurately with significant reductions in the computational effort. Multiple approximate models can be used, and rigorous confidence bounds on the resulting estimates are provided at all stages.
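A bare-bones sketch of one way to exploit a cheap approximate solver inside a Monte Carlo estimate (a simple two-level control-variate construction, not the paper's Bayesian framework): correct a large-sample estimate from the coarse model with a small-sample estimate of the fine-coarse discrepancy. Both "solvers" below are toy stand-ins.

    import numpy as np

    rng = np.random.default_rng(6)

    def fine_model(u):
        """Expensive 'exact' solver (stand-in): response of the system for input u."""
        return np.sin(u) + 0.05 * u ** 2

    def coarse_model(u):
        """Cheap approximate solver (stand-in), e.g. a coarser discretisation."""
        return u - u ** 3 / 6                    # crude truncation of sin(u)

    # Uncertain input and quantity of interest E[f(U)].
    def sample_u(n):
        return rng.normal(0.0, 1.0, size=n)

    n_coarse, n_fine = 100_000, 200              # many cheap runs, few expensive runs
    u_c = sample_u(n_coarse)
    u_f = sample_u(n_fine)

    # Two-level estimator: E[f] = E[g] + E[f - g], each term estimated separately.
    estimate = coarse_model(u_c).mean() + (fine_model(u_f) - coarse_model(u_f)).mean()

    # Naive estimator using only the affordable number of fine runs, for comparison.
    naive = fine_model(u_f).mean()
    print(f"two-level: {estimate:.4f}   naive ({n_fine} fine runs): {naive:.4f}")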
An Approximation Ratio for Biclustering
Puolamäki, Kai, Hanhijärvi, Sami, Garriga, Gemma C.
The problem of biclustering consists of the simultaneous clustering of the rows and columns of a matrix such that each of the submatrices induced by a pair of row and column clusters is as uniform as possible. In this paper we approximate the optimal biclustering by applying one-way clustering algorithms independently to the rows and to the columns of the input matrix. We show that such a solution yields a worst-case approximation ratio of 1+sqrt(2) under the L1-norm for 0-1 valued matrices, and of 2 under the L2-norm for real-valued matrices.
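The approximation scheme described here is easy to state in code. Below is a minimal sketch using scikit-learn's KMeans for the two one-way clusterings; the choice of k-means (rather than any specific one-way algorithm from the paper) and the planted toy matrix are our own assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(7)

    # Toy 0-1 matrix with a planted 3 x 2 block structure plus noise.
    row_blocks = np.repeat([0, 1, 2], 20)
    col_blocks = np.repeat([0, 1], 15)
    means = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
    A = (rng.random((60, 30)) < means[row_blocks][:, col_blocks]).astype(float)

    # One-way clustering of the rows and, independently, of the columns.
    row_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(A)
    col_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(A.T)

    # Each (row cluster, column cluster) pair induces a submatrix; summarise its uniformity.
    for r in range(3):
        for c in range(2):
            block = A[np.ix_(row_labels == r, col_labels == c)]
            print(f"block ({r},{c}): mean={block.mean():.2f}  size={block.shape}")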