Comment on "Fastest learning in small-world neural networks"

arXiv.org Machine Learning

This comment re-examines the work of Simard et al. [D. Simard, L. Nadeau, H. Kroger, Phys. Lett. A 336 (2005) 8-15]. We find that Simard et al. mistakenly calculated the local connectivity lengths Dlocal of networks. The correct results for Dlocal are presented, and the supervised learning performance of feedforward neural networks (FNNs) with different numbers of rewirings is re-investigated in this comment. This comment refutes Simard et al.'s work with two conclusions: 1) rewiring connections of FNNs cannot generate networks with small-world connectivity; 2) for different training sets, there do not exist networks with a certain number of rewirings that yield smaller learning errors than networks with other numbers of rewirings.
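
For readers unfamiliar with the small-world terminology at issue, here is a minimal sketch of the general rewiring idea using the standard Watts-Strogatz construction from networkx. It illustrates how path length and clustering respond to the rewiring probability; it does not reproduce the paper's layered-FNN rewiring or its Dlocal measure, and all parameter values are illustrative.

```python
# Illustrative only: Watts-Strogatz rewiring, not the paper's FNN rewiring.
import networkx as nx

for p in (0.0, 0.1, 1.0):  # rewiring probability: regular -> small-world -> random
    g = nx.connected_watts_strogatz_graph(n=100, k=4, p=p, tries=100, seed=0)
    L = nx.average_shortest_path_length(g)  # characteristic path length
    C = nx.average_clustering(g)            # clustering coefficient
    print(f"p={p:.1f}  path length={L:.2f}  clustering={C:.3f}")
```

Small-world connectivity is conventionally diagnosed by a path length close to the random graph's together with clustering close to the lattice's; the comment's first conclusion is that rewired FNNs fail this test.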


A Quasi-Newton Approach to Nonsmooth Convex Optimization Problems in Machine Learning

arXiv.org Machine Learning

We extend the well-known BFGS quasi-Newton method and its memory-limited variant LBFGS to the optimization of nonsmooth convex objectives. This is done in a rigorous fashion by generalizing three components of BFGS to subdifferentials: the local quadratic model, the identification of a descent direction, and the Wolfe line search conditions. We prove that under some technical conditions, the resulting subBFGS algorithm is globally convergent in objective function value. We apply its memory-limited variant (subLBFGS) to L_2-regularized risk minimization with the binary hinge loss. To extend our algorithm to the multiclass and multilabel settings, we develop a new, efficient, exact line search algorithm. We prove its worst-case time complexity bounds, and show that our line search can also be used to extend a recently developed bundle method to the multiclass and multilabel settings. We also apply the direction-finding component of our algorithm to L_1-regularized risk minimization with logistic loss. In all these contexts our methods perform comparably to or better than specialized state-of-the-art solvers on a number of publicly available datasets. An open source implementation of our algorithms is freely available.
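
As background for the nonsmoothness the paper handles, here is a minimal sketch (not the authors' subBFGS) of a valid subgradient for the L_2-regularized binary hinge loss J(w) = (lambda/2)||w||^2 + (1/n) sum_i max(0, 1 - y_i <w, x_i>), driven by plain subgradient descent, the kind of baseline a quasi-Newton method improves on. The data and step-size schedule are assumptions for illustration.

```python
import numpy as np

def hinge_subgradient(w, X, y, lam):
    # Where a margin equals exactly 1 the loss is nonsmooth; taking the
    # one-sided gradient with the strict inequality below is a valid choice
    # from the subdifferential.
    margins = y * (X @ w)
    active = margins < 1.0                      # examples with nonzero slope
    return lam * w - (X[active].T @ y[active]) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X @ rng.normal(size=5))
w, lam = np.zeros(5), 0.1
for t in range(1, 501):                         # 1/(lam*t) step: standard for
    w -= hinge_subgradient(w, X, y, lam) / (lam * t)  # strongly convex objectives
```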


Query Learning with Exponential Query Costs

arXiv.org Machine Learning

In query learning, the goal is to identify an unknown object while minimizing the number of "yes" or "no" questions (queries) posed about that object. A well-studied algorithm for query learning is known as generalized binary search (GBS). We show that GBS is a greedy algorithm to optimize the expected number of queries needed to identify the unknown object. We also generalize GBS in two ways. First, we consider the case where the cost of querying grows exponentially in the number of queries and the goal is to minimize the expected exponential cost. Then, we consider the case where the objects are partitioned into groups, and the objective is to identify only the group to which the object belongs. We derive algorithms to address these issues in a common, information-theoretic framework. In particular, we present an exact formula for the objective function in each case involving Shannon or Renyi entropy, and develop a greedy algorithm for minimizing it. Our algorithms are demonstrated on two applications of query learning, active learning and emergency response.
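
To make the greedy strategy concrete, here is a minimal sketch of generalized binary search: at each step, pick the query whose yes/no answer splits the remaining probability mass most evenly, a greedy proxy for minimizing the expected number of queries. The prior and the binary answer matrix below are illustrative assumptions, and the exponential-cost and group-identification extensions from the paper are not included.

```python
import numpy as np

def gbs(answers, prior, true_obj):
    """answers[q, o] = 1 if object o answers 'yes' to query q, else 0."""
    alive = np.ones(answers.shape[1], dtype=bool)   # version space
    asked = np.zeros(answers.shape[0], dtype=bool)
    n = 0
    while alive.sum() > 1 and not asked.all():
        mass = prior * alive
        yes = answers @ mass                    # 'yes' mass per query
        balance = np.abs(yes - mass.sum() / 2)  # 0 = perfectly even split
        balance[asked] = np.inf                 # never repeat a query
        q = int(np.argmin(balance))
        asked[q] = True
        # Discard objects inconsistent with the true object's answer.
        alive &= (answers[q] == answers[q, true_obj])
        n += 1
    return np.flatnonzero(alive), n

rng = np.random.default_rng(0)
A = rng.integers(0, 2, size=(12, 8))            # 12 queries, 8 objects
survivors, used = gbs(A, np.full(8, 1 / 8), true_obj=3)
print(survivors, used)
```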


Plugin procedure in segmentation and application to hyperspectral image segmentation

arXiv.org Machine Learning

In this article we give our contribution to the problem of segmentation with plug-in procedures. We give general sufficient conditions under which plug-in procedures are efficient, and we give an algorithm that satisfies these conditions. We apply this algorithm to hyperspectral image segmentation. Hyperspectral images have both spatial and spectral coherence, with thousands of spectral bands at each pixel. The proposed procedure combines a dimension reduction technique with a spatial regularisation technique. This regularisation is based on the mixlet model of Kolaczyk et al.
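
A minimal sketch of a generic plug-in rule in this setting, under assumptions of our own choosing (PCA for the dimension reduction, per-class kernel density estimates, uniform class priors); the article's mixlet-based spatial regularisation is not reproduced here.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KernelDensity

def plugin_segment(pixels, labeled_pixels, labels, n_components=5):
    """pixels: (n, d) spectra; labeled_pixels/labels: a small training set."""
    pca = PCA(n_components=n_components).fit(labeled_pixels)
    z_train, z_all = pca.transform(labeled_pixels), pca.transform(pixels)
    classes = np.unique(labels)
    # Plug-in principle: estimate each class-conditional density, then plug
    # the estimates into the Bayes rule (argmax posterior, uniform priors).
    log_dens = np.stack([
        KernelDensity(bandwidth=0.5).fit(z_train[labels == c]).score_samples(z_all)
        for c in classes
    ])
    return classes[np.argmax(log_dens, axis=0)]
```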


Operator norm convergence of spectral clustering on level sets

arXiv.org Machine Learning

Following Hartigan, a cluster is defined as a connected component of the t-level set of the underlying density, i.e., the set of points for which the density is greater than t. A clustering algorithm which combines a density estimate with spectral clustering techniques is proposed. Our algorithm is composed of two steps. First, a nonparametric density estimate is used to extract the data points for which the estimated density takes a value greater than t. Next, the extracted points are clustered based on the eigenvectors of a graph Laplacian matrix. Under mild assumptions, we prove the almost sure convergence in operator norm of the empirical graph Laplacian operator associated with the algorithm. Furthermore, we give the typical behavior of the representation of the dataset in the feature space, which establishes the strong consistency of our proposed algorithm.
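
A minimal sketch of the two-step algorithm described above, with a Gaussian KDE standing in for the nonparametric density estimate and scikit-learn's SpectralClustering for the graph-Laplacian step; the level t, kernel, and affinity are illustrative choices, not those analyzed in the paper.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.cluster import SpectralClustering

def level_set_clusters(X, t, n_clusters=2):
    # Step 1: keep only points whose estimated density exceeds the level t.
    density = gaussian_kde(X.T)(X.T)
    keep = density > t
    # Step 2: cluster the kept points via graph-Laplacian eigenvectors.
    labels = np.full(len(X), -1)        # -1 marks discarded low-density points
    labels[keep] = SpectralClustering(n_clusters=n_clusters,
                                      affinity="rbf").fit_predict(X[keep])
    return labels
```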


Manifold-Based Signal Recovery and Parameter Estimation from Compressive Measurements

arXiv.org Machine Learning

A field known as Compressive Sensing (CS) has recently emerged to help address the growing challenges of capturing and processing high-dimensional signals and data sets. CS exploits the surprising fact that the information contained in a sparse signal can be preserved in a small number of compressive (or random) linear measurements of that signal. Strong theoretical guarantees have been established on the accuracy to which sparse or near-sparse signals can be recovered from noisy compressive measurements. In this paper, we address similar questions in the context of a different modeling framework. Instead of sparse models, we focus on the broad class of manifold models, which can arise in both parametric and non-parametric signal families. Building upon recent results concerning the stable embeddings of manifolds within the measurement space, we establish both deterministic and probabilistic instance-optimal bounds in $\ell_2$ for manifold-based signal recovery and parameter estimation from noisy compressive measurements. In line with analogous results for sparsity-based CS, we conclude that much stronger bounds are possible in the probabilistic setting. Our work supports the growing empirical evidence that manifold-based models can be used with high accuracy in compressive signal processing.
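
To fix ideas, here is a minimal sketch of the recovery setting: signals f(theta) lie on a one-parameter manifold, we observe y = Phi f(theta) + noise through a random Gaussian Phi, and we estimate theta by searching for the nearest point on the projected manifold. The pulse family, dimensions, and grid search are assumptions for illustration; the paper's instance-optimal bounds cover far more general manifolds and estimators.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 256, 20                               # ambient dim, # of measurements
t = np.arange(N) / N

def f(theta):                                # a shifted-pulse signal family
    return np.exp(-((t - theta) ** 2) / 0.002)

Phi = rng.normal(size=(M, N)) / np.sqrt(M)   # compressive measurement operator
theta_true = 0.37
y = Phi @ f(theta_true) + 0.01 * rng.normal(size=M)

# Parameter estimation: nearest point on the image of the manifold under Phi.
grid = np.linspace(0.0, 1.0, 1000)
errs = [np.linalg.norm(y - Phi @ f(th)) for th in grid]
theta_hat = grid[int(np.argmin(errs))]
print(theta_true, theta_hat)
```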


K-Dimensional Coding Schemes in Hilbert Spaces

arXiv.org Machine Learning

This paper presents a general coding method where data in a Hilbert space are represented by finite dimensional coding vectors. The method is based on empirical risk minimization within a certain class of linear operators, which map the set of coding vectors to the Hilbert space. Two results bounding the expected reconstruction error of the method are derived, which highlight the role played by the codebook and the class of linear operators. The results are specialized to some cases of practical importance, including K-means clustering, nonnegative matrix factorization and other sparse coding methods.
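
A minimal sketch of the coding viewpoint in the simplest special case the paper mentions, K-means: the coding vectors are restricted to the canonical basis e_1..e_K, the linear operator's columns are the centroids, and the empirical risk is the mean squared reconstruction error. Data, K, and the update scheme are illustrative assumptions.

```python
import numpy as np

def kmeans_code(X, K, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    T = X[rng.choice(len(X), K, replace=False)]  # operator: K centroid columns
    for _ in range(n_iter):
        # Encoding step: map each x to the code e_k of its nearest column.
        codes = np.argmin(((X[:, None] - T[None]) ** 2).sum(-1), axis=1)
        # Operator update: minimize empirical reconstruction error per column.
        T = np.stack([X[codes == k].mean(0) if (codes == k).any() else T[k]
                      for k in range(K)])
    recon_err = ((X - T[codes]) ** 2).sum(-1).mean()   # empirical risk
    return T, codes, recon_err
```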


Confidence Sets Based on Penalized Maximum Likelihood Estimators in Gaussian Regression

arXiv.org Machine Learning

Confidence intervals based on penalized maximum likelihood estimators such as the LASSO, adaptive LASSO, and hard-thresholding are analyzed. In the known-variance case, the finite-sample coverage properties of such intervals are determined and it is shown that symmetric intervals are the shortest. The length of the shortest intervals based on the hard-thresholding estimator is larger than the length of the shortest interval based on the adaptive LASSO, which is larger than the length of the shortest interval based on the LASSO, which in turn is larger than the standard interval based on the maximum likelihood estimator. In the case where the penalized estimators are tuned to possess the `sparsity property', the intervals based on these estimators are larger than the standard interval by an order of magnitude. Furthermore, a simple asymptotic confidence interval construction in the `sparse' case, that also applies to the smoothly clipped absolute deviation estimator, is discussed. The results for the known-variance case are shown to carry over to the unknown-variance case in an appropriate asymptotic sense.
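
For orientation, here is a minimal sketch of the estimators being compared, written in the simplest one-dimensional Gaussian location model y ~ N(theta, 1), where the LASSO reduces to soft thresholding, hard thresholding keeps or kills the observation, and the adaptive LASSO applies a data-driven shrinkage weight. This is an assumption-laden toy reduction; the paper's finite-sample interval lengths are not reproduced here.

```python
import numpy as np

def lasso(y, lam):                 # soft thresholding
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

def hard_threshold(y, lam):        # keep-or-kill
    return np.where(np.abs(y) > lam, y, 0.0)

def adaptive_lasso(y, lam):        # soft thresholding with data-driven weight
    return y * np.maximum(1.0 - lam ** 2 / np.maximum(y ** 2, 1e-12), 0.0)

# All three shrink toward zero, so a symmetric interval [estimate - a,
# estimate + a] needs a larger half-width a than the maximum likelihood
# interval to achieve the same coverage.
```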


Classifying the typefaces of the Gutenberg 42-line bible

arXiv.org Machine Learning

We have measured the dissimilarities among several printed characters of a single page in the Gutenberg 42-line bible and we prove statistically the existence of several different matrices from which the metal types were constructed. This is in contrast with the prevailing theory, which states that only one matrix per character was used in the printing process of Gutenberg's greatest work. The main mathematical tool for this purpose is cluster analysis, combined with a statistical test for outliers. We carry out the research with two letters, i and a. In the first case, an exact clustering method is employed; in the second, with more specimens to be classified, we resort to an approximate agglomerative clustering method. The results show that the letters form clusters according to their shape, with significant shape differences among clusters, and allow us to conclude, with a very small probability of error, that indeed the metal types used to print them were cast from several different matrices.
Mathematics Subject Classification: 62H30
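
A minimal sketch of the agglomerative step on a precomputed dissimilarity matrix, as one might run it for the letter 'a' specimens; the dissimilarity values and the number of candidate matrices below are placeholders, not the measured shape distances.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
n = 10                                    # number of printed specimens
D = rng.uniform(0.1, 1.0, size=(n, n))    # placeholder shape dissimilarities
D = (D + D.T) / 2                         # symmetrize
np.fill_diagonal(D, 0.0)

Z = linkage(squareform(D), method="average")        # agglomerative clustering
clusters = fcluster(Z, t=3, criterion="maxclust")   # cut into 3 candidate matrices
print(clusters)
```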


Hilbert space embeddings and metrics on probability measures

arXiv.org Machine Learning

A Hilbert space embedding for probability measures has recently been proposed, with applications including dimensionality reduction, homogeneity testing, and independence testing. This embedding represents any probability measure as a mean element in a reproducing kernel Hilbert space (RKHS). A pseudometric on the space of probability measures can be defined as the distance between distribution embeddings: we denote this as $\gamma_k$, indexed by the kernel function $k$ that defines the inner product in the RKHS. We present three theoretical properties of $\gamma_k$. First, we consider the question of determining the conditions on the kernel $k$ for which $\gamma_k$ is a metric: such $k$ are denoted {\em characteristic kernels}. Unlike pseudometrics, a metric is zero only when two distributions coincide, thus ensuring the RKHS embedding maps all distributions uniquely (i.e., the embedding is injective). While previously published conditions may apply only in restricted circumstances (e.g. on compact domains), and are difficult to check, our conditions are straightforward and intuitive: bounded continuous strictly positive definite kernels are characteristic. Alternatively, if a bounded continuous kernel is translation-invariant on $\mathbb{R}^d$, then it is characteristic if and only if the support of its Fourier transform is the entire $\mathbb{R}^d$. Second, we show that there exist distinct distributions that are arbitrarily close in $\gamma_k$. Third, to understand the nature of the topology induced by $\gamma_k$, we relate $\gamma_k$ to other popular metrics on probability measures, and present conditions on the kernel $k$ under which $\gamma_k$ metrizes the weak topology.
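
A minimal sketch of the empirical version of $\gamma_k$: the distance between the mean embeddings of two samples in the RKHS of a Gaussian kernel, which is characteristic by the criteria discussed above (its Fourier transform has full support on $\mathbb{R}^d$). The bandwidth and the plug-in (V-statistic) estimate are illustrative choices.

```python
import numpy as np

def gamma_k(X, Y, sigma=1.0):
    """Empirical RKHS distance between mean embeddings, Gaussian kernel."""
    def gram(A, B):
        d2 = ((A[:, None] - B[None]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    m, n = len(X), len(Y)
    # ||mu_P - mu_Q||^2 = E k(x,x') + E k(y,y') - 2 E k(x,y)
    val = (gram(X, X).sum() / m**2 + gram(Y, Y).sum() / n**2
           - 2 * gram(X, Y).sum() / (m * n))
    return np.sqrt(max(val, 0.0))

rng = np.random.default_rng(0)
print(gamma_k(rng.normal(0, 1, (500, 2)), rng.normal(0.5, 1, (500, 2))))
```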