AITopics

1501.07518

Country: North America > United States (0.68)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
(2 more...)

arXiv.org Machine Learning, Jan-28-2015

Understanding Kernel Ridge Regression: Common behaviors from simple functions to density functionals

Vu, Kevin, Snyder, John, Li, Li, Rupp, Matthias, Chen, Brandon F., Khelif, Tarek, Müller, Klaus-Robert, Burke, Kieron

Machine learning (ML) is a powerful data-driven method for learning patterns in high-dimensional spaces via induction, and has had widespread success in many fields, including more recent applications in quantum chemistry and materials science [1-9]. Here we are interested in applications of ML to the construction of density functionals [10-14], which have focused so far on approximating the kinetic energy (KE) of non-interacting electrons. An accurate, general approximation to this could make orbital-free DFT a practical reality. However, ML methods have been developed within the areas of statistics and computer science, and have been applied to a huge variety of data, including neuroscience, image and text processing, and robotics [15]. Thus, they are quite general and have not been tailored to account for specific details of the quantum problem.

artificial intelligence, machine learning, natural language, (20 more...)

1501.03854

Country:

Europe (0.68)
North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

Punjani, Ali, Brubaker, Marcus A.

Microscopic Advances with Large-Scale Learning: Stochastic Optimization for Cryo-EM

arXiv.org Machine Learning, Jan-27-2015

Determining the 3D structures of biological molecules is a key problem for both biology and medicine. Electron Cryomicroscopy (Cryo-EM) is a promising technique for structure estimation which relies heavily on computational methods to reconstruct 3D structures from 2D images. This paper introduces the challenging Cryo-EM density estimation problem as a novel application for stochastic optimization techniques. Structure discovery is formulated as MAP estimation in a probabilistic latent-variable model, resulting in an optimization problem to which an array of seven stochastic optimization methods is applied. The methods are tested on both real and synthetic data, with some methods recovering reasonable structures in less than one epoch from a random initialization. Complex quasi-Newton methods are found to converge more slowly than simple gradient-based methods, but all stochastic methods are found to converge to similar optima. This method represents a major improvement over existing methods as it is significantly faster and is able to converge from a random initialization.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1501.04656

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Kapoor, Ashish, Frady, E. Paxon, Jegelka, Stefanie, Kristan, William B., Horvitz, Eric

Inferring and Learning from Neuronal Correspondences

arXiv.org Artificial Intelligence, Jan-27-2015

We introduce and study methods for inferring and learning from correspondences among neurons. The approach enables alignment of data from distinct multiunit studies of nervous systems. We show that the methods for inferring correspondences combine data effectively from cross-animal studies to make joint inferences about behavioral decision making that are not possible with the data from a single animal. We focus on data collection, machine learning, and prediction in the representative and long-studied invertebrate nervous system of the European medicinal leech. Acknowledging the computational intractability of the general problem of identifying correspondences among neurons, we introduce efficient computational procedures for matching neurons across animals. The methods include techniques that adjust for missing cells or additional cells in the different data sets that may reflect biological or experimental variation.

artificial intelligence, machine learning, neuron, (15 more...)

1501.05973

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Wiener, Yair, El-Yaniv, Ran

Agnostic Pointwise-Competitive Selective Classification

Journal of Artificial Intelligence Research, Jan-26-2015

A pointwise-competitive classifier from class F is required to classify identically to the best classifier in hindsight from F. For noisy, agnostic settings we present a strategy for learning pointwise-competitive classifiers from a finite training sample, provided that the classifier can abstain from prediction in a certain region of its choice. For some interesting hypothesis classes and families of distributions, the measure of this rejected region is shown to diminish at a fast rate, with high probability. Exact implementation of the proposed learning strategy depends on an ERM oracle that can be hard to compute in the agnostic case. We thus consider a heuristic approximation procedure that is based on SVMs, and show empirically that this algorithm consistently outperforms a traditional rejection mechanism based on distance from the decision boundary.
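For orientation, the traditional baseline named at the end of the abstract (abstaining on points that lie close to the decision boundary) might be sketched as follows. This is not the authors' pointwise-competitive strategy or their SVM-based heuristic. The linear classifier, the `threshold` parameter, and all function names are assumptions made for illustration.

```python
import math


def signed_distance(weights, bias, point):
    """Signed Euclidean distance from `point` to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * x for w, x in zip(weights, point)) + bias) / norm


def predict_with_reject(weights, bias, point, threshold):
    """Return +1 or -1, or None (abstain) when the point is within
    `threshold` of the decision boundary."""
    d = signed_distance(weights, bias, point)
    if abs(d) < threshold:
        return None
    return 1 if d > 0 else -1


def coverage_and_risk(weights, bias, points, labels, threshold):
    """Coverage is the fraction of points classified; risk is the error
    rate measured on the classified points only."""
    predictions = [predict_with_reject(weights, bias, p, threshold) for p in points]
    accepted = [(p, y) for p, y in zip(predictions, labels) if p is not None]
    coverage = len(accepted) / len(points)
    risk = sum(1 for p, y in accepted if p != y) / len(accepted) if accepted else 0.0
    return coverage, risk
```

Raising `threshold` trades coverage for lower risk on the accepted points. The abstract's claim is that this trade-off can be improved by choosing the rejected region differently.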

classification, classifier, probability, (13 more...)

doi: 10.1613/jair.4439

AI Access Foundation

10927

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)
(5 more...)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Optimal computational and statistical rates of convergence for sparse nonconvex learning problems

Wang, Zhaoran, Liu, Han, Zhang, Tong

We provide theoretical analysis of the statistical and computational properties of penalized $M$-estimators that can be formulated as the solution to a possibly nonconvex optimization problem. Many important estimators fall in this category, including least squares regression with nonconvex regularization, generalized linear models with nonconvex regularization and sparse elliptical random design regression. For these problems, it is intractable to calculate the global solution due to the nonconvex formulation. In this paper, we propose an approximate regularization path-following method for solving a variety of learning problems with nonconvex objective functions. Under a unified analytic framework, we simultaneously provide explicit statistical and computational rates of convergence for any local solution attained by the algorithm. Computationally, our algorithm attains a global geometric rate of convergence for calculating the full regularization path, which is optimal among all first-order algorithms. Unlike most existing methods that only attain geometric rates of convergence for one single regularization parameter, our algorithm calculates the full regularization path with the same iteration complexity. In particular, we provide a refined iteration complexity bound to sharply characterize the performance of each stage along the regularization path. Statistically, we provide sharp sample complexity analysis for all the approximate local solutions along the regularization path. In particular, our analysis improves upon existing results by providing a more refined sample complexity bound as well as an exact support recovery result for the final estimator. These results show that the final estimator attains an oracle statistical property due to the use of a nonconvex penalty.

assumption 4, convergence, probability, (16 more...)

doi: 10.1214/14-AOS1238

1306.4960

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (0.65)

Industry: Education > Focused Education > Special Education (0.60)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Maddison, Chris J., Tarlow, Daniel, Minka, Tom

A* Sampling

The problem of drawing samples from a discrete distribution can be converted into a discrete optimization problem. In this work, we show how sampling from a continuous distribution can be converted into an optimization problem over continuous space. Central to the method is a stochastic process recently described in mathematical statistics that we call the Gumbel process. We present a new construction of the Gumbel process and A* sampling, a practical generic sampling algorithm that searches for the maximum of a Gumbel process using A* search. We analyze the correctness and convergence time of A* sampling and demonstrate empirically that it makes more efficient use of bound and likelihood evaluations than the most closely related adaptive rejection sampling-based algorithms.
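The discrete case mentioned in the abstract's first sentence is commonly known as the Gumbel-max trick: add independent standard Gumbel noise to each log-weight and take the argmax. The sketch below illustrates only that discrete starting point, not A* sampling or the Gumbel process construction. The function names are hypothetical.

```python
import math
import random
from collections import Counter


def gumbel_max_sample(log_weights, rng):
    """Draw one index i with probability proportional to exp(log_weights[i])
    by maximizing the Gumbel-perturbed log-weights."""
    best_index, best_value = -1, -math.inf
    for i, lw in enumerate(log_weights):
        u = rng.random()
        while u == 0.0:  # random() can return 0.0; avoid log(0)
            u = rng.random()
        gumbel = -math.log(-math.log(u))  # standard Gumbel noise
        if lw + gumbel > best_value:
            best_index, best_value = i, lw + gumbel
    return best_index


def empirical_frequencies(log_weights, n_draws, seed=0):
    """Repeat the sampler and return the observed frequency of each index."""
    rng = random.Random(seed)
    counts = Counter(gumbel_max_sample(log_weights, rng) for _ in range(n_draws))
    return [counts[i] / n_draws for i in range(len(log_weights))]
```

With unnormalized weights 1, 2 and 7, for example, the observed frequencies should approach 0.1, 0.2 and 0.7 as the number of draws grows. The weights never need to be normalized, which is what makes the optimization view attractive.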

artificial intelligence, exp, machine learning, (20 more...)

1411.0030

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Rakhlin, Alexander, Sridharan, Karthik

Online Nonparametric Regression with General Loss Functions

This paper establishes minimax rates for online regression with arbitrary classes of functions and general losses. We show that below a certain threshold for the complexity of the function class, the minimax rates depend on both the curvature of the loss function and the sequential complexities of the class. Above this threshold, the curvature of the loss does not affect the rates. Furthermore, for the case of square loss, our results point to an interesting phenomenon: whenever sequential and i.i.d. empirical entropies match, the rates for statistical and online learning are the same. In addition to the study of minimax regret, we derive a generic forecaster that enjoys the established optimal rates. We also provide a recipe for designing online prediction algorithms that can be computationally efficient for certain problems. We illustrate the techniques by deriving existing and new forecasters for the case of finite experts and for online linear regression.

artificial intelligence, complexity, machine learning, (19 more...)

1501.06598

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

High Dimensional Expectation-Maximization Algorithm: Statistical Optimization and Asymptotic Normality

Wang, Zhaoran, Gu, Quanquan, Ning, Yang, Liu, Han

We provide a general theory of the expectation-maximization (EM) algorithm for inferring high dimensional latent variable models. In particular, we make two contributions: (i) For parameter estimation, we propose a novel high dimensional EM algorithm which naturally incorporates sparsity structure into parameter estimation. With an appropriate initialization, this algorithm converges at a geometric rate and attains an estimator with the (near-)optimal statistical rate of convergence. (ii) Based on the obtained estimator, we propose new inferential procedures for testing hypotheses and constructing confidence intervals for low dimensional components of high dimensional parameters. For a broad family of statistical models, our framework establishes the first computationally feasible approach for optimal estimation and asymptotic inference in high dimensions. Our theory is supported by thorough numerical results.

artificial intelligence, inequality, machine learning, (16 more...)

1412.8729

Country: North America (0.28)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Szabo, Zoltan, Gretton, Arthur, Poczos, Barnabas, Sriperumbudur, Bharath

Two-stage Sampled Learning Theory on Distributions

We focus on the distribution regression problem: regressing to a real-valued response from a probability distribution. Although there exist a large number of similarity measures between distributions, very little is known about their generalization performance in specific learning tasks. Learning problems formulated on distributions have an inherent two-stage sampled difficulty: in practice only samples from sampled distributions are observable, and one has to build an estimate on similarities computed between sets of points. To the best of our knowledge, the only existing method with consistency guarantees for distribution regression requires kernel density estimation as an intermediate step (which suffers from slow convergence issues in high dimensions), and the domain of the distributions to be compact Euclidean. In this paper, we provide theoretical guarantees for a remarkably simple algorithmic alternative to solve the distribution regression problem: embed the distributions to a reproducing kernel Hilbert space, and learn a ridge regressor from the embeddings to the outputs. Our main contribution is to prove the consistency of this technique in the two-stage sampled setting under mild conditions (on separable, topological domains endowed with kernels). For a given total number of observations, we derive convergence rates as an explicit function of the problem difficulty. As a special case, we answer a 15-year-old open question: we establish the consistency of the classical set kernel [Haussler, 1999; Gärtner et.

artificial intelligence, kernel, machine learning, (18 more...)
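The two-step recipe described in the abstract above (embed each distribution in a reproducing kernel Hilbert space, then fit a ridge regressor on the embeddings) might be sketched as follows for one-dimensional samples. This is a minimal illustration and not the authors' implementation. It assumes a Gaussian base kernel and a linear kernel on the empirical mean embeddings, which gives the set kernel as the average pairwise kernel value. The function names, `bandwidth`, and `ridge` are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def set_kernel(bag_a, bag_b, bandwidth):
    """Inner product of two empirical mean embeddings, i.e. the average
    pairwise kernel value between the two bags of samples."""
    total = sum(gaussian_kernel(a, b, bandwidth) for a in bag_a for b in bag_b)
    return total / (len(bag_a) * len(bag_b))


def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit(bags, responses, bandwidth, ridge):
    """Kernel ridge regression on bags: solve (G + n * ridge * I) alpha = y."""
    n = len(bags)
    gram = [[set_kernel(bags[i], bags[j], bandwidth) for j in range(n)]
            for i in range(n)]
    for i in range(n):
        gram[i][i] += n * ridge
    return solve_linear_system(gram, responses)


def predict(new_bag, bags, coefficients, bandwidth):
    """Predict the response for a new bag of samples."""
    return sum(c * set_kernel(new_bag, bag, bandwidth)
               for c, bag in zip(coefficients, bags))
```

Each training example is a bag of samples from one distribution, paired with a real-valued response. `fit` returns one coefficient per training bag, and `predict` needs only samples from the new distribution, which mirrors the two-stage sampled setting in which the distributions themselves are never observed.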

1402.1754

Country:

North America > United States > California (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)