AITopics | Accuracy

Collaborating Authors

Accuracy

News Overviews Instructional Materials AI-Alerts Classics

Discriminative models for robust image classification

arXiv.org Machine LearningMar-8-2016

A variety of real-world tasks involve the classification of images into pre-determined categories. Designing image classification algorithms that exhibit robustness to acquisition noise and image distortions, particularly when the available training data are insufficient to learn accurate models, is a significant challenge. This dissertation explores the development of discriminative models for robust image classification that exploit underlying signal structure, via probabilistic graphical models and sparse signal representations. Probabilistic graphical models are widely used in many applications to approximate high-dimensional data in a reduced complexity set-up. Learning graphical structures to approximate probability distributions is an area of active research. Recent work has focused on learning graphs in a discriminative manner with the goal of minimizing classification error. In the first part of the dissertation, we develop a discriminative learning framework that exploits the complementary yet correlated information offered by multiple representations (or projections) of a given signal/image. Specifically, we propose a discriminative tree-based scheme for feature fusion by explicitly learning the conditional correlations among such multiple projections in an iterative manner. Experiments reveal the robustness of the resulting graphical model classifier to training insufficiency.

artificial intelligence, classification, machine learning, (18 more...)

arXiv.org Machine Learning

1603.02736

Country:

Europe (1.00)
North America > United States > California (0.27)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Education (1.00)
Energy (0.67)
(5 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
(6 more...)

Add feedback

A Kernel Test for Three-Variable Interactions with Random Processes

Rubenstein, Paul K., Chwialkowski, Kacper P., Gretton, Arthur

arXiv.org Machine LearningMar-8-2016

We apply a wild bootstrap method to the Lancaster three-variable interaction measure in order to detect factorisation of the joint distribution on three variables forming a stationary random process, for which the existing permutation bootstrap method fails. As in the i.i.d. case, the Lancaster test is found to outperform existing tests in cases for which two independent variables individually have a weak influence on a third, but that when considered jointly the influence is strong. The main contributions of this paper are twofold: first, we prove that the Lancaster statistic satisfies the conditions required to estimate the quantiles of the null distribution using the wild bootstrap; second, the manner in which this is proved is novel, simpler than existing methods, and can further be applied to other statistics.

artificial intelligence, machine learning, statistic, (18 more...)

arXiv.org Machine Learning

1603.00929

Country: Europe (0.28)

Genre: Research Report (1.00)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback

Feature Selection via Binary Simultaneous Perturbation Stochastic Approximation

Aksakalli, Vural, Malekipirbazari, Milad

arXiv.org Machine LearningMar-5-2016

Feature selection (FS) has become an indispensable task in dealing with today's highly complex pattern recognition problems with massive number of features. In this study, we propose a new wrapper approach for FS based on binary simultaneous perturbation stochastic approximation (BSPSA). This pseudo-gradient descent stochastic algorithm starts with an initial feature vector and moves toward the optimal feature vector via successive iterations. In each iteration, the current feature vector's individual components are perturbed simultaneously by random offsets from a qualified probability distribution. We present computational experiments on datasets with numbers of features ranging from a few dozens to thousands using three widely-used classifiers as wrappers: nearest neighbor, decision tree, and linear support vector machine. We compare our methodology against the full set of features as well as a binary genetic algorithm and sequential FS methods using cross-validated classification error rate and AUC as the performance criteria. Our results indicate that features selected by BSPSA compare favorably to alternative methods in general and BSPSA can yield superior feature sets for datasets with tens of thousands of features by examining an extremely small fraction of the solution space. We are not aware of any other wrapper FS methods that are computationally feasible with good convergence properties for such large datasets.

artificial intelligence, dataset, machine learning, (17 more...)

arXiv.org Machine Learning

1508.0763

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

Network Unfolding Map by Edge Dynamics Modeling

Verri, Filipe Alves Neto, Urio, Paulo Roberto, Zhao, Liang

arXiv.org Artificial IntelligenceMar-3-2016

The emergence of collective dynamics in neural networks is a mechanism of the animal and human brain for information processing. In this paper, we develop a computational technique of distributed processing elements, which are called particles. We observe the collective dynamics of particles in a complex network for transductive inference on semi-supervised learning problems. Three actions govern the particles' dynamics: walking, absorption, and generation. Labeled vertices generate new particles that compete against rival particles for edge domination. Active particles randomly walk in the network until they are absorbed by either a rival vertex or an edge currently dominated by rival particles. The result from the model simulation consists of sets of edges sorted by the label dominance. Each set tends to form a connected subnetwork to represent a data class. Although the intrinsic dynamics of the model is a stochastic one, we prove there exists a deterministic version with largely reduced computational complexity; specifically, with subquadratic growth. Furthermore, the edge domination process corresponds to an unfolding map. Intuitively, edges "stretch" and "shrink" according to edge dynamics. Consequently, such effect summarizes the relevant relationships between vertices and uncovered data classes. The proposed model captures important details of connectivity patterns over the edge dynamics evolution, which contrasts with previous approaches focused on vertex dynamics. Computer simulations reveal that our model can identify nonlinear features in both real and artificial data, including boundaries between distinct classes and the overlapping structure of data.

artificial intelligence, machine learning, particle, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TNNLS.2016.2626341

1603.01182

Country: North America > United States (0.68)

Genre: Research Report (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Optimal approximate matrix product in terms of stable rank

Cohen, Michael B., Nelson, Jelani, Woodruff, David P.

arXiv.org Machine LearningMar-2-2016

We prove, using the subspace embedding guarantee in a black box way, that one can achieve the spectral norm guarantee for approximate matrix multiplication with a dimensionality-reducing map having $m = O(\tilde{r}/\varepsilon^2)$ rows. Here $\tilde{r}$ is the maximum stable rank, i.e. squared ratio of Frobenius and operator norms, of the two matrices being multiplied. This is a quantitative improvement over previous work of [MZ11, KVZ14], and is also optimal for any oblivious dimensionality-reducing map. Furthermore, due to the black box reliance on the subspace embedding property in our proofs, our theorem can be applied to a much more general class of sketching matrices than what was known before, in addition to achieving better bounds. For example, one can apply our theorem to efficient subspace embeddings such as the Subsampled Randomized Hadamard Transform or sparse subspace embeddings, or even with subspace embedding constructions that may be developed in the future. Our main theorem, via connections with spectral error matrix multiplication shown in prior work, implies quantitative improvements for approximate least squares regression and low rank approximation. Our main result has also already been applied to improve dimensionality reduction guarantees for $k$-means clustering [CEMMP14], and implies new results for nonparametric regression [YPW15]. We also separately point out that the proof of the "BSS" deterministic row-sampling result of [BSS12] can be modified to show that for any matrices $A, B$ of stable rank at most $\tilde{r}$, one can achieve the spectral norm guarantee for approximate matrix multiplication of $A^T B$ by deterministically sampling $O(\tilde{r}/\varepsilon^2)$ rows that can be found in polynomial time. The original result of [BSS12] was for rank instead of stable rank. Our observation leads to a stronger version of a main theorem of [KMST10].

artificial intelligence, machine learning, subspace, (18 more...)

arXiv.org Machine Learning

1507.02268

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Projected Estimators for Robust Semi-supervised Classification

Krijthe, Jesse H., Loog, Marco

arXiv.org Machine LearningFeb-25-2016

For semi-supervised techniques to be applied safely in practice we at least want methods to outperform their supervised counterparts. We study this question for classification using the well-known quadratic surrogate loss function. Using a projection of the supervised estimate onto a set of constraints imposed by the unlabeled data, we find we can safely improve over the supervised solution in terms of this quadratic loss. Unlike other approaches to semi-supervised learning, the procedure does not rely on assumptions that are not intrinsic to the classifier at hand. It is theoretically demonstrated that, measured on the labeled and unlabeled training data, this semi-supervised procedure never gives a lower quadratic loss than the supervised alternative. To our knowledge this is the first approach that offers such strong, albeit conservative, guarantees for improvement over the supervised solution. The characteristics of our approach are explicated using benchmark datasets to further understand the similarities and differences between the quadratic loss criterion used in the theoretical results and the classification accuracy often considered in practice.

dataset, procedure, unlabeled data, (12 more...)

arXiv.org Machine Learning

1602.07865

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Kernel Mean Shrinkage Estimators

Muandet, Krikamol, Sriperumbudur, Bharath, Fukumizu, Kenji, Gretton, Arthur, Schölkopf, Bernhard

arXiv.org Machine LearningFeb-25-2016

A mean function in a reproducing kernel Hilbert space (RKHS), or a kernel mean, is central to kernel methods in that it is used by many classical algorithms such as kernel principal component analysis, and it also forms the core inference step of modern kernel methods that rely on embedding probability distributions in RKHSs. Given a finite sample, an empirical average has been used commonly as a standard estimator of the true kernel mean. Despite a widespread use of this estimator, we show that it can be improved thanks to the well-known Stein phenomenon. We propose a new family of estimators called kernel mean shrinkage estimators (KMSEs), which benefit from both theoretical justifications and good empirical performance. The results demonstrate that the proposed estimators outperform the standard one, especially in a "large d, small n" paradigm.

artificial intelligence, estimator, machine learning, (18 more...)

arXiv.org Machine Learning

1405.5505

Country:

North America > United States (1.00)
Europe > United Kingdom > England (0.28)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression

Dieuleveut, Aymeric, Flammarion, Nicolas, Bach, Francis

arXiv.org Machine LearningFeb-24-2016

We consider the optimization of a quadratic objective function whose gradients are only accessible through a stochastic oracle that returns the gradient at any given point plus a zero-mean finite variance random error. We present the first algorithm that achieves jointly the optimal prediction error rates for least-squares regression, both in terms of forgetting of initial conditions in O(1/n 2), and in terms of dependence on the noise and dimension d of the problem, as O(d/n). Our new algorithm is based on averaged accelerated regularized gradient descent, and may also be analyzed through finer assumptions on initial conditions and the Hessian matrix, leading to dimension-free quantities that may still be small while the " optimal " terms above are large. In order to characterize the tightness of these new bounds, we consider an application to non-parametric regression and use the known lower bounds on the statistical performance (without computational limits), which happen to match our bounds obtained from a single pass on the data and thus show optimality of our algorithm in a wide variety of particular trade-offs between bias and variance.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Machine Learning

1602.05419

Country: Europe (0.27)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Efficient functional ANOVA through wavelet-domain Markov groves

Ma, Li, Soriano, Jacopo

arXiv.org Machine LearningFeb-20-2016

We introduce a wavelet-domain functional analysis of variance (fANOVA) method based on a Bayesian hierarchical model. The factor effects are modeled through a spike-and-slab mixture at each location-scale combination along with a normal-inverse-Gamma (NIG) conjugate setup for the coefficients and errors. A graphical model called the Markov grove (MG) is designed to jointly model the spike-and-slab statuses at all location-scale combinations, which incorporates the clustering of each factor effect in the wavelet-domain thereby allowing borrowing of strength across location and scale. The posterior of this NIG-MG model is analytically available through a pyramid algorithm of the same computational complexity as Mallat's pyramid algorithm for discrete wavelet transform, i.e., linear in both the number of observations and the number of locations. Posterior probabilities of factor contributions can also be computed through pyramid recursion, and exact samples from the posterior can be drawn without MCMC. We investigate the performance of our method through extensive simulation and show that it outperforms existing wavelet-domain fANOVA methods in a variety of common settings. We apply the method to analyzing the orthosis data.

artificial intelligence, bayesian inference, machine learning, (15 more...)

arXiv.org Machine Learning

1602.0399

Country: Europe (0.45)

Genre: Research Report (0.63)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
(3 more...)

Add feedback

Machine Learning Model of the Swift/BAT Trigger Algorithm for Long GRB Population Studies

Graff, Philip B, Lien, Amy Y, Baker, John G, Sakamoto, Takanori

arXiv.org Machine LearningFeb-8-2016

To draw inferences about gamma-ray burst (GRB) source populations based on Swift observations, it is essential to understand the detection efficiency of the Swift burst alert telescope (BAT). This study considers the problem of modeling the Swift/BAT triggering algorithm for long GRBs, a computationally expensive procedure, and models it using machine learning algorithms. A large sample of simulated GRBs from Lien 2014 is used to train various models: random forests, boosted decision trees (with AdaBoost), support vector machines, and artificial neural networks. The best models have accuracies of $\gtrsim97\%$ ($\lesssim 3\%$ error), which is a significant improvement on a cut in GRB flux which has an accuracy of $89.6\%$ ($10.4\%$ error). These models are then used to measure the detection efficiency of Swift as a function of redshift $z$, which is used to perform Bayesian parameter estimation on the GRB rate distribution. We find a local GRB rate density of $n_0 \sim 0.48^{+0.41}_{-0.23} \ {\rm Gpc}^{-3} {\rm yr}^{-1}$ with power-law indices of $n_1 \sim 1.7^{+0.6}_{-0.5}$ and $n_2 \sim -5.9^{+5.7}_{-0.1}$ for GRBs above and below a break point of $z_1 \sim 6.8^{+2.8}_{-3.2}$. This methodology is able to improve upon earlier studies by more accurately modeling Swift detection and using this for fully Bayesian model fitting. The code used in this is analysis is publicly available online (https://github.com/PBGraff/SwiftGRB_PEanalysis).

accuracy, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.3847/0004-637X/818/1/55

1509.01228

Country: North America > United States > Maryland > Prince George's County (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)
(2 more...)

Add feedback