AITopics | Accuracy

Asymptotic normality and optimalities in estimation of large Gaussian graphical models

Ren, Zhao; Sun, Tingni; Zhang, Cun-Hui; Zhou, Harrison H.

arXiv.org Machine Learning, Jun-3-2015

The Gaussian graphical model, a popular paradigm for studying relationships among variables in a wide range of applications, has attracted great attention in recent years. This paper considers a fundamental question: When is it possible to estimate low-dimensional parameters at parametric square-root rate in a large Gaussian graphical model? A novel regression approach is proposed to obtain asymptotically efficient estimation of each entry of a precision matrix under a sparseness condition relative to the sample size. When the precision matrix is not sufficiently sparse, or equivalently the sample size is not sufficiently large, a lower bound is established to show that it is no longer possible to achieve the parametric rate in the estimation of each entry. This lower bound result, which provides an answer to the delicate sample size question, is established with a novel construction of a subset of sparse precision matrices in an application of Le Cam's lemma. Moreover, the proposed estimator is proven to have optimal convergence rate when the parametric rate cannot be achieved, under a minimal sample requirement. The proposed estimator is applied to test the presence of an edge in the Gaussian graphical model or to recover the support of the entire model, to obtain adaptive rate-optimal estimation of the entire precision matrix as measured by the matrix $\ell_q$ operator norm, and to make inference on latent variables in the graphical model. All of this is achieved under a sparsity condition on the precision matrix and a side condition on the range of its spectrum. This significantly relaxes the commonly imposed uniform signal strength condition on the precision matrix, irrepresentability condition on the Hessian tensor operator of the covariance matrix or the $\ell_1$ constraint on the precision matrix. Numerical results confirm our theoretical findings. The ROC curve of the proposed algorithm, Asymptotic Normal Thresholding (ANT), for support recovery significantly outperforms that of the popular GLasso algorithm.

artificial intelligence, estimator, machine learning, (15 more...)

doi: 10.1214/14-AOS1286

1309.6024

Country: North America > United States > Maryland (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Understanding Random Forests: From Theory to Practice

Louppe, Gilles

arXiv.org Machine Learning, Jun-3-2015

Data analysis and machine learning have become an integral part of the modern scientific methodology, offering automated procedures for the prediction of a phenomenon based on past observations, unraveling underlying patterns in data and providing insights about the problem. Yet one should be careful not to use machine learning as a black-box tool, and should instead treat it as a methodology, with a rational thought process that is entirely dependent on the problem under study. In particular, the use of algorithms should ideally require a reasonable understanding of their mechanisms, properties and limitations, in order to better apprehend and interpret their results. Accordingly, the goal of this thesis is to provide an in-depth analysis of random forests, consistently calling into question each and every part of the algorithm, in order to shed new light on its learning capabilities, inner workings and interpretability. The first part of this work studies the induction of decision trees and the construction of ensembles of randomized trees, motivating their design and purpose whenever possible. Our contributions follow with an original complexity analysis of random forests, showing their good computational performance and scalability, along with an in-depth discussion of their implementation details, as contributed within Scikit-Learn. In the second part of this work, we analyse and discuss the interpretability of random forests through the lens of variable importance measures. The core of our contributions lies in the theoretical characterization of the Mean Decrease of Impurity variable importance measure, from which we prove and derive some of its properties in the case of multiway totally randomized trees and in asymptotic conditions. In consequence of this work, our analysis demonstrates that variable importances [...].

artificial intelligence, machine learning, survey article, (21 more...)

1407.7502

Country:

North America > United States > California (0.45)
North America > United States > Michigan (0.27)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Summary/Review (0.92)
Research Report > Experimental Study (0.92)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education (0.92)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.45)

Bootstrap Bias Corrections for Ensemble Methods

Hooker, Giles; Mentch, Lucas

arXiv.org Machine Learning, Jun-1-2015

This paper examines the use of a residual bootstrap for bias correction in machine learning regression methods. Accounting for bias is an important obstacle in recent efforts to develop statistical inference for machine learning methods. We demonstrate empirically that the proposed bootstrap bias correction can lead to substantial improvements in both bias and predictive accuracy. In the context of ensembles of trees, we show that this correction can be approximated at only double the cost of training the original ensemble without introducing additional variance. Our method is shown to improve test-set accuracy over random forests by up to 70% on example problems from the UCI repository.
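To make the idea of a residual bootstrap bias correction concrete, here is a minimal sketch in plain Python. It is not the authors' procedure for tree ensembles: the learner `fit_shrunken_mean` is a hypothetical, deliberately biased toy estimator, and the helper name, the residual centering, and the number of replicates are all assumptions made for illustration.

```python
import random


def fit_shrunken_mean(ys, shrink=0.5):
    """Toy learner (hypothetical): the sample mean shrunk toward zero, hence biased."""
    return shrink * sum(ys) / len(ys)


def residual_bootstrap_bias_correction(ys, fit, n_boot=200, seed=0):
    """Return (raw estimate, bias-corrected estimate) using a residual bootstrap."""
    rng = random.Random(seed)
    estimate = fit(ys)

    # Residuals of the fitted model, centered so that they average to zero.
    residuals = [y - estimate for y in ys]
    mean_residual = sum(residuals) / len(residuals)
    centered = [r - mean_residual for r in residuals]

    # Refit on synthetic responses built from the fit plus resampled residuals.
    boot_estimates = []
    for _ in range(n_boot):
        y_star = [estimate + rng.choice(centered) for _ in ys]
        boot_estimates.append(fit(y_star))

    # Bootstrap estimate of the bias, subtracted from the original fit.
    bias = sum(boot_estimates) / n_boot - estimate
    return estimate, estimate - bias


ys = [float(v) for v in range(1, 11)]  # sample mean is 5.5
raw, corrected = residual_bootstrap_bias_correction(ys, fit_shrunken_mean)
print(raw, corrected)
```

In this toy setting the raw estimate is half the sample mean, and a single round of correction moves it back toward the sample mean without removing the bias entirely.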

artificial intelligence, correction, machine learning, (15 more...)

1506.00553

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Learning Relational Event Models from Video

Dubba, Krishna S. R.; Cohn, Anthony G.; Hogg, David C.; Bhatt, Mehul; Dylla, Frank

Journal of Artificial Intelligence Research, May-27-2015

Event models obtained automatically from video can be used in applications ranging from abnormal event detection to content-based video retrieval. When multiple agents are involved in the events, characterizing events naturally suggests encoding interactions as relations. Learning event models from this kind of relational spatio-temporal data using relational learning techniques such as Inductive Logic Programming (ILP) holds promise, but such techniques have not been successfully applied to the very large datasets that result from video data. In this paper, we present a novel framework REMIND (Relational Event Model INDuction) for supervised relational learning of event models from large video datasets using ILP. Efficiency is achieved through the learning-from-interpretations setting and a typing system that exploits the type hierarchy of objects in a domain. The use of types also helps prevent overgeneralization. Furthermore, we present a type-refining operator and prove that it is optimal. The learned models can be used for recognizing events from previously unseen videos. We also present an extension to the framework by integrating an abduction step that improves the learning performance when there is noise in the input data. The experimental results on several hours of video data from two challenging real-world domains (an airport domain and a physical action verbs domain) suggest that the techniques are suitable for real-world scenarios.

hypothesis, relation, video, (14 more...)

doi: 10.1613/jair.4395

AI Access Foundation

10940

Country:

Europe > Germany > Bremen > Bremen (0.14)
North America > United States > Maryland (0.04)
Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)
(2 more...)

Genre: Research Report (0.45)

Industry:

Transportation > Air (0.94)
Transportation > Passenger (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Abductive Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
(2 more...)

Optimizing Non-decomposable Performance Measures: A Tale of Two Classes

Narasimhan, Harikrishna; Kar, Purushottam; Jain, Prateek

arXiv.org Machine Learning, May-26-2015

Modern classification problems frequently present mild to severe label imbalance as well as specific requirements on classification characteristics, and require optimizing performance measures that are non-decomposable over the dataset, such as F-measure. Such measures have spurred much interest and pose specific challenges to learning algorithms since their non-additive nature precludes a direct application of well-studied large-scale optimization methods such as stochastic gradient descent. In this paper we reveal that for two large families of performance measures that can be expressed as functions of true positive/negative rates, it is indeed possible to implement point stochastic updates. The families we consider are concave and pseudo-linear functions of the true positive rate (TPR) and true negative rate (TNR), which cover several popularly used performance measures such as F-measure, G-mean and H-mean. Our core contribution is an adaptive linearization scheme for these families, with which we develop optimization techniques that enable truly point-based stochastic updates. For concave performance measures we propose SPADE, a stochastic primal dual solver; for pseudo-linear measures we propose STAMP, a stochastic alternate maximization procedure. Both methods have crisp convergence guarantees, demonstrate significant speedups over existing methods, often by an order of magnitude or more, and give similar or more accurate predictions on test data.
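As a small illustration of what "non-decomposable" means here, the sketch below computes TPR, TNR, G-mean, H-mean and F-measure from a toy set of binary labels, then compares the F-measure of the whole set with the average of the F-measures of its two halves. The data and helper names are hypothetical; the formulas are the standard textbook definitions, not code from the paper.

```python
from math import sqrt


def rates(y_true, y_pred):
    """Return (TPR, TNR) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return tp / positives, tn / negatives


def g_mean(tpr, tnr):
    """Geometric mean of TPR and TNR."""
    return sqrt(tpr * tnr)


def h_mean(tpr, tnr):
    """Harmonic mean of TPR and TNR (0.0 if both rates are zero)."""
    return 2 * tpr * tnr / (tpr + tnr) if tpr + tnr else 0.0


def f_measure(y_true, y_pred):
    """F1 score computed from confusion counts (0.0 if undefined)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy labels and predictions.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1, 0, 0]

tpr, tnr = rates(y_true, y_pred)
f_full = f_measure(y_true, y_pred)
f_halves = (f_measure(y_true[:4], y_pred[:4]) + f_measure(y_true[4:], y_pred[4:])) / 2

print(tpr, tnr, g_mean(tpr, tnr), h_mean(tpr, tnr))
print(f_full, f_halves)  # the two values differ
```

Because the score of the whole set is not the average of the scores of its parts, a per-example loss cannot simply be summed, which is why plain stochastic gradient descent does not apply directly.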

artificial intelligence, machine learning, performance measure, (16 more...)

1505.06812

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Detecting bird sound in unknown acoustic background using crowdsourced training data

Papadopoulos, Timos; Roberts, Stephen; Willis, Kathy

arXiv.org Machine Learning, May-24-2015

Biodiversity monitoring using audio recordings is achievable at a truly global scale via large-scale deployment of inexpensive, unattended recording stations or by large-scale crowdsourcing using recording and species recognition on mobile devices. The ability, however, to reliably identify vocalising animal species is limited by the fact that acoustic signatures of interest in such recordings are typically embedded in a diverse and complex acoustic background. To avoid the problems associated with modelling such backgrounds, we build generative models of bird sounds and use the concept of novelty detection to screen recordings to detect sections of data which are likely bird vocalisations. We present detection results against various acoustic environments and different signal-to-noise ratios. We discuss the issues related to selecting the cost function and setting detection thresholds in such algorithms. Our methods are designed to be scalable and automatically applicable to arbitrary selections of species depending on the specific geographic region and time period of deployment.

background, data mining, machine learning, (18 more...)

1505.06443

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Communications > Social Media > Crowdsourcing (0.70)

Statistical Estimation and Clustering of Group-invariant Orientation Parameters

Chen, Yu-Hui; Wei, Dennis; Newstadt, Gregory; DeGraef, Marc; Simmons, Jeffrey; Hero, Alfred

arXiv.org Machine Learning, May-22-2015

We treat the problem of estimation of orientation parameters whose values are invariant to transformations from a spherical symmetry group. Previous work has shown that any such group-invariant distribution must satisfy a restricted finite mixture representation, which allows the orientation parameter to be estimated using an Expectation Maximization (EM) maximum likelihood (ML) estimation algorithm. In this paper, we introduce two parametric models for this spherical symmetry group estimation problem: 1) the hyperbolic von Mises-Fisher (VMF) mixture distribution and 2) the Watson mixture distribution. We also introduce a new EM-ML algorithm for clustering samples that come from mixtures of group-invariant distributions with different parameters. We apply the models to the problem of mean crystal orientation estimation under the spherically symmetric group associated with the crystal form, e.g., cubic, octahedral, or hexahedral. Simulations and experiments establish the advantages of the extended EM-VMF and EM-Watson estimators for data acquired by Electron Backscatter Diffraction (EBSD) microscopy of a polycrystalline Nickel alloy sample.

artificial intelligence, machine learning, orientation, (19 more...)

1503.04474

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.40)

Industry: Materials (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Vector-Space Markov Random Fields via Exponential Families

Tansey, Wesley; Padilla, Oscar Hernan Madrid; Suggala, Arun Sai; Ravikumar, Pradeep

arXiv.org Machine Learning, May-19-2015

We present Vector-Space Markov Random Fields (VS-MRFs), a novel class of undirected graphical models where each variable can belong to an arbitrary vector space. VS-MRFs generalize a recent line of work on scalar-valued, uni-parameter exponential family and mixed graphical models, thereby greatly broadening the class of exponential families available (e.g., allowing multinomial and Dirichlet distributions). Specifically, VS-MRFs are the joint graphical model distributions where the node-conditional distributions belong to generic exponential families with general vector space domains. We also present a sparsistent $M$-estimator for learning our class of MRFs that recovers the correct set of edges with high probability. We validate our approach via a set of synthetic data experiments as well as a real-world case study of over four million foods from the popular diet tracking app MyFitnessPal. Our results demonstrate that our algorithm performs well empirically and that VS-MRFs are capable of capturing and highlighting interesting structure in complex, real-world data. All code for our algorithm is open source and publicly available.

artificial intelligence, machine learning, node-conditional distribution, (15 more...)

1505.05117

Country:

Europe (0.46)
North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)

Extraction of Pharmacokinetic Evidence of Drug-drug Interactions from the Literature

Kolchinsky, Artemy; Lourenço, Anália; Wu, Heng-Yi; Li, Lang; Rocha, Luis M.

arXiv.org Machine Learning, May-18-2015

Drug-drug interaction (DDI) is a major cause of morbidity and mortality and a subject of intense scientific interest. Biomedical literature mining can aid DDI research by extracting evidence for large numbers of potential interactions from published literature and clinical databases. Though DDI is investigated in domains ranging in scale from intracellular biochemistry to human populations, literature mining has not been used to extract specific types of experimental evidence, which are reported differently for distinct experimental goals. We focus on pharmacokinetic evidence for DDI, essential for identifying causal mechanisms of putative interactions and as input for further pharmacological and pharmaco-epidemiology investigations. We used manually curated corpora of PubMed abstracts and annotated sentences to evaluate the efficacy of literature mining on two tasks: first, identifying PubMed abstracts containing pharmacokinetic evidence of DDIs; second, extracting sentences containing such evidence from abstracts. We implemented a text mining pipeline and evaluated it using several linear classifiers and a variety of feature transforms. The most important textual features in the abstract and sentence classification tasks were analyzed. We also investigated the performance benefits of using features derived from PubMed metadata fields, various publicly available named entity recognizers, and pharmacokinetic dictionaries. Several classifiers performed very well in distinguishing relevant and irrelevant abstracts (reaching F1 ≈ 0.93, MCC ≈ 0.74, iAUC ≈ 0.99) and sentences (F1 ≈ 0.76, MCC ≈ 0.65, iAUC ≈ 0.83). We found that word bigram features were important for achieving optimal classifier performance and that features derived from Medical Subject Headings (MeSH) terms significantly improved abstract classification. ...

classifier, machine learning, natural language, (19 more...)

doi: 10.1371/journal.pone.0122199

1412.0744

Country: North America > United States > Indiana (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

MCODE: Multivariate Conditional Outlier Detection

Hong, Charmgil; Hauskrecht, Milos

arXiv.org Machine Learning, May-15-2015

Outlier detection aims to identify unusual data instances that deviate from expected patterns. Outlier detection is particularly challenging when outliers are context dependent and when they are defined by unusual combinations of multiple outcome variable values. In this paper, we develop and study a new conditional outlier detection approach for multivariate outcome spaces that works by (1) transforming the conditional detection to the outlier detection problem in a new (unconditional) space and (2) defining outlier scores by analyzing the data in the new space. Our approach relies on the classifier chain decomposition of the multi-dimensional classification problem that lets us transform the output space into a probability vector, one probability for each dimension of the output space. Outlier scores applied to these transformed vectors are then used to detect the outliers. Experiments on multiple multi-dimensional classification problems with different outlier injection rates show that our methodology is robust and able to successfully identify outliers when outliers are either sparse (manifested in one or very few dimensions) or dense (affecting multiple dimensions).
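For orientation, the classifier chain decomposition mentioned in the abstract can be written with the chain rule of probability. The notation below ($x$ for the input, $y_1, \dots, y_d$ for the output dimensions, $\mathbf{p}$ for the transformed vector) is assumed here for illustration and may differ from the paper's own.

```latex
% Chain-rule factorization of the joint conditional distribution
P(y_1, \dots, y_d \mid x) \;=\; \prod_{i=1}^{d} P\bigl(y_i \mid x,\, y_1, \dots, y_{i-1}\bigr)

% One probability per output dimension, evaluated at the observed values
\mathbf{p}(x, y) \;=\; \Bigl( P(y_1 \mid x),\; P(y_2 \mid x, y_1),\; \dots,\; P\bigl(y_d \mid x, y_1, \dots, y_{d-1}\bigr) \Bigr)
```

Under this reading, an instance whose vector contains one or more unusually small entries is a candidate outlier; how the entries are combined into a single score is left to the paper.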

artificial intelligence, data mining, machine learning, (18 more...)

1505.04097

Country: North America > United States > California (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
