AITopics | Accuracy

Why comparing survival curves between two prognostic subgroups may be misleading

arXiv.org Machine Learning, Nov-4-2016

We consider the validation of prognostic diagnostic tests that predict two prognostic subgroups (high-risk vs low-risk) for a given disease or treatment. When survival curves are compared between two prognostic subgroups, misclassification is possible: a patient predicted as high-risk might in fact be low-risk, and vice versa. This is a fundamental difference from comparing survival curves between two populations (e.g. control vs treatment in an RCT), where members of the populations cannot be misclassified. We show that there is a relationship between the prognostic subgroups' survival estimates at a time point and the positive and negative predictive values in the classification setting. Consequently, the prevalence needs to be taken into account when validating the survival of prognostic subgroups at a time point. Our findings question current methods of comparing survival curves between prognostic subgroups in the validation set because they do not take into account the survival rates of the population.
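The dependence on prevalence can be illustrated with Bayes' rule. The sketch below is not taken from the paper: it assumes that "event before the time point" is the positive class and that censoring is ignored, so the event rate in the predicted high-risk group is the positive predictive value (PPV). The function name `predictive_values` and the sensitivity and specificity figures are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    Assumption for illustration: 'positive' means the event occurs before
    the time point, and there is no censoring.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same test quality, different population event rates (hypothetical values).
for prevalence in (0.05, 0.20, 0.50):
    ppv, npv = predictive_values(0.8, 0.8, prevalence)
    # Under the assumptions above, survival at the time point is 1 - PPV in
    # the predicted high-risk group and NPV in the predicted low-risk group.
    print(f"prevalence={prevalence:.2f}  high-risk survival={1 - ppv:.3f}  "
          f"low-risk survival={npv:.3f}")
```

With the test held fixed, the gap between the two subgroup survival estimates changes as the population event rate changes, which is the reason the prevalence matters when such estimates are compared.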

artificial intelligence, machine learning, prognostic subgroup, (15 more...)

arXiv.org Machine Learning

1611.0148

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Class-prior Estimation for Learning from Positive and Unlabeled Data

Plessis, Marthinus C. du, Niu, Gang, Sugiyama, Masashi

arXiv.org Machine Learning, Nov-4-2016

We consider the problem of estimating the class prior in an unlabeled dataset. Under the assumption that an additional labeled dataset is available, the class prior can be estimated by fitting a mixture of class-wise data distributions to the unlabeled data distribution. However, in practice, such an additional labeled dataset is often not available. In this paper, we show that, with additional samples coming only from the positive class, the class prior of the unlabeled dataset can be estimated correctly. Our key idea is to use properly penalized divergences for model fitting to cancel the error caused by the absence of negative samples. We further show that the use of the penalized $L_1$-distance gives a computationally efficient algorithm with an analytic solution. The consistency, stability, and estimation error are theoretically analyzed. Finally, we experimentally demonstrate the usefulness of the proposed method.
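The baseline setting mentioned in the abstract, where labeled data from both classes is available, can be sketched as a mixture fit. This is not the penalized-divergence method of the paper: it is a plain least-squares fit over histogram bins, and the function name `estimate_class_prior` and the toy histograms are assumptions made for illustration.

```python
import numpy as np


def estimate_class_prior(p_unlabeled, p_pos, p_neg):
    """Least-squares fit of p_unlabeled ~ pi * p_pos + (1 - pi) * p_neg.

    All three arguments are histograms (bin probabilities) over the same bins.
    """
    p_unlabeled = np.asarray(p_unlabeled, dtype=float)
    p_pos = np.asarray(p_pos, dtype=float)
    p_neg = np.asarray(p_neg, dtype=float)
    direction = p_pos - p_neg
    # Closed-form minimizer of ||p_unlabeled - p_neg - pi * direction||^2.
    pi = np.dot(p_unlabeled - p_neg, direction) / np.dot(direction, direction)
    # A class prior is a probability, so clip to [0, 1].
    return float(np.clip(pi, 0.0, 1.0))


# Toy class-wise histograms and an unlabeled mixture with true prior 0.3.
p_pos = np.array([0.1, 0.2, 0.7])
p_neg = np.array([0.6, 0.3, 0.1])
p_unlabeled = 0.3 * p_pos + 0.7 * p_neg
print(estimate_class_prior(p_unlabeled, p_pos, p_neg))
```

The fit needs `p_neg`, which is exactly what is missing in the positive-and-unlabeled setting that the paper addresses.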

artificial intelligence, machine learning, penl 1, (16 more...)

arXiv.org Machine Learning

doi: 10.1007/s10994-016-5604-6

1611.01586

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Classification with Ultrahigh-Dimensional Features

Li, Yanming, Hong, Hyokyoung, Kang, Jian, He, Kevin, Zhu, Ji, Li, Yi

arXiv.org Machine Learning, Nov-4-2016

Although much progress has been made in classification with high-dimensional features \citep{Fan_Fan:2008, JGuo:2010, CaiSun:2014, PRXu:2014}, classification with ultrahigh-dimensional features, wherein the number of features far exceeds the sample size, defies most existing work. This paper introduces a novel and computationally feasible multivariate screening and classification method for ultrahigh-dimensional data. Leveraging inter-feature correlations, the proposed method enables detection of marginally weak and sparse signals and recovery of the true informative feature set, and achieves asymptotically optimal misclassification rates. We also show that the proposed procedure provides more powerful discovery boundaries compared to those in \citet{CaiSun:2014} and \citet{JJin:2009}. The performance of the proposed procedure is evaluated using simulation studies and demonstrated via classification of patients with different post-transplantation renal functional types.

artificial intelligence, classification, machine learning, (15 more...)

arXiv.org Machine Learning

1611.01541

Country: North America > United States > Michigan (0.14)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Illustrated Guide to ROC and AUC

#artificialintelligence, Nov-3-2016, 01:30:41 GMT

Think of a regression model mapping a number of features onto a real number (potentially a probability). The resulting real number can then be mapped to one of two classes, depending on whether it lies above or below a chosen threshold. Let's take for example a logistic regression and data on passenger survival in the Titanic accident to introduce the relevant concepts, which will lead naturally to the ROC (Receiver Operating Characteristic) and its AUC or AUROC (Area Under ROC Curve). Every record in the data set represents a passenger, providing information on her/his age, gender, class, number of siblings/spouses aboard (sibsp), number of parents/children aboard (parch) and, of course, whether s/he survived the accident. The logistic regression model is tested on batches of 10 cases with a model trained on the remaining N-10 cases; the test batches form a partition of the data. In short, leave-10-out CV has been applied to obtain a more accurate estimate of the out-of-sample error rates.
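The thresholding idea can be made concrete with a small sketch that sweeps the threshold over predicted scores and integrates the resulting curve. The labels and scores below are invented toy values, not the Titanic data, and the helper names `roc_points` and `auc` are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, highest first."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        # Predict positive whenever the score is at or above the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy example: 1 = survived, 0 = did not survive; scores are model outputs.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve)
print(auc(curve))
```

The AUC equals the fraction of (positive, negative) pairs in which the positive case receives the higher score; in the toy data that is 8 of 9 pairs.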

artificial intelligence, classifier, machine learning, (17 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Sensitivity Maps of the Hilbert-Schmidt Independence Criterion

Pérez-Suay, Adrián, Camps-Valls, Gustau

arXiv.org Machine Learning, Nov-2-2016

Kernel dependence measures yield accurate estimates of nonlinear relations between random variables, and they are also endowed with solid theoretical properties and convergence rates. Besides, the empirical estimates are easy to compute in closed form, involving only linear algebra operations. However, they are hampered by two important problems: the high computational cost involved, as two kernel matrices of the sample size have to be computed and stored, and the interpretability of the measure, which remains hidden behind the implicit feature map. We here address these two issues. We introduce the Sensitivity Maps (SMs) for the Hilbert-Schmidt independence criterion (HSIC). Sensitivity maps allow us to explicitly analyze and visualize the relative relevance of both examples and features on the dependence measure. We also present the randomized HSIC (RHSIC) and its corresponding sensitivity maps to cope with large-scale problems. We build upon the framework of random features and Bochner's theorem to approximate the involved kernels in the canonical HSIC. The power of the RHSIC measure scales favourably with the number of samples, and it approximates HSIC and the sensitivity maps efficiently. Convergence bounds of both the measure and the sensitivity map are also provided. Our proposal is illustrated in synthetic examples, and challenging real problems of dependence estimation, feature selection, and causal inference from empirical data.
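The closed-form empirical estimate, and the two n-by-n kernel matrices behind its cost, can be sketched as follows. This shows the standard biased HSIC estimator, tr(KHLH)/(n-1)^2, with Gaussian kernels; it does not implement the sensitivity maps or the randomized RHSIC of the paper. The kernel width `sigma` and the synthetic data are assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(x, sigma=1.0):
    """Gaussian (RBF) kernel matrix for samples stored row-wise in x."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    # Guard against tiny negative distances caused by rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(x)
    K = rbf_kernel(x, sigma)             # n x n kernel matrix on x
    L = rbf_kernel(y, sigma)             # n x n kernel matrix on y
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


rng = np.random.default_rng(0)
x = rng.normal(size=200)
dependent = hsic(x, x ** 2)                  # nonlinear dependence
independent = hsic(x, rng.normal(size=200))  # independent draws
print(dependent, independent)
```

Both `K` and `L` require O(n^2) memory, which is the storage cost that the abstract identifies as a problem for large samples.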

artificial intelligence, data mining, machine learning, (15 more...)

arXiv.org Machine Learning

1611.00555

Country:

Europe (0.68)
North America > United States (0.46)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Dynamic Collaborative Filtering with Compound Poisson Factorization

Jerfel, Ghassen, Basbug, Mehmet E., Engelhardt, Barbara E.

arXiv.org Machine Learning, Nov-1-2016

Model-based collaborative filtering analyzes user-item interactions to infer latent factors that represent user preferences and item characteristics in order to predict future interactions. Most collaborative filtering algorithms assume that these latent factors are static, although it has been shown that user preferences and item perceptions drift over time. In this paper, we propose a conjugate and numerically stable dynamic matrix factorization (DCPF) based on compound Poisson matrix factorization that models the smoothly drifting latent factors using Gamma-Markov chains. We propose a numerically stable Gamma chain construction, and then present a stochastic variational inference approach to estimate the parameters of our model. We apply our model to time-stamped ratings data sets: Netflix, Yelp, and Last.fm,

artificial intelligence, factorization, machine learning, (17 more...)

arXiv.org Machine Learning

1608.04839

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (1.00)
Media > Music (0.68)
Media > Film (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Adaptive Ensemble Learning with Confidence Bounds

Tekin, Cem, Yoon, Jinsung, van der Schaar, Mihaela

arXiv.org Machine Learning, Oct-30-2016

Extracting actionable intelligence from distributed, heterogeneous, correlated and high-dimensional data sources requires run-time processing and learning both locally and globally. In the last decade, a large number of meta-learning techniques have been proposed in which local learners make online predictions based on their locally-collected data instances, and feed these predictions to an ensemble learner, which fuses them and issues a global prediction. However, most of these works do not provide performance guarantees or, when they do, these guarantees are asymptotic. None of these existing works provide confidence estimates about the issued predictions or rate of learning guarantees for the ensemble learner. In this paper, we provide a systematic ensemble learning method called Hedged Bandits, which comes with both long run (asymptotic) and short run (rate of learning) performance guarantees. Moreover, our approach yields performance guarantees with respect to the optimal local prediction strategy, and is also able to adapt its predictions in a data-driven manner. We illustrate the performance of Hedged Bandits in the context of medical informatics and show that it outperforms numerous online and offline ensemble learning methods.

data mining, machine learning, prediction, (21 more...)

arXiv.org Machine Learning

1512.07446

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.67)

Compliance monitoring and artificial intelligence

#artificialintelligence, Oct-29-2016, 20:30:52 GMT

A recent Compliance Week story on how artificial intelligence could revolutionize compliance depicted how technology firms "are offering software platforms that promise to automate otherwise routine tasks and improve upon fraud detection audits, anti-money laundering protocols, and know-your-customer screening." With the advent of cyber-security attacks, developers of advanced artificial intelligence security monitoring solutions have also emerged. However, understanding when and how often monitoring solutions should be executed presents trade-offs to be considered. Legacy approaches to risk monitoring look for recognized threats by known signatures and pre-built event detection logic. Often these standby methods rest on technology confines and as a result are not aligned to business risk.

artificial intelligence, intelligence, machine learning, (14 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.33)

Stripe: Radar Technical Guide

#artificialintelligence, Oct-28-2016, 04:06:33 GMT

Stripe builds products that enable hundreds of thousands of e-commerce companies, SaaS businesses, on-demand marketplaces, nonprofits, and platforms to conduct business online. One inescapable facet of online commerce--and one that, unfortunately, frequently comes as an unpleasant surprise--is fraud. Unlike businesses that accept payments in person, internet businesses are liable for fraudulent purchases--this despite the fact that they are no more experts on fraud than their brick-and-mortar counterparts. As a result, many internet businesses have had to build up teams of fraud analysts and expend engineering effort on fraud detection systems. At Stripe, we want to help businesses focus on their product and customer experiences and not on fraud, so we've developed Stripe Radar, a suite of modern tools for fraud detection and prevention. The goal of this guide is to provide more detail on the machine learning that powers the core of Radar, explain how we think about the efficacy and performance of fraud detection systems, and describe how other tools in the Radar suite can help businesses optimize their outcomes.

artificial intelligence, fraud, machine learning, (11 more...)

#artificialintelligence

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Services > e-Commerce Services (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.44)

Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging

Jain, Prateek, Kakade, Sham M., Kidambi, Rahul, Netrapalli, Praneeth, Sidford, Aaron

arXiv.org Machine Learning, Oct-27-2016

This work characterizes the benefits of averaging techniques widely used in conjunction with stochastic gradient descent (SGD). In particular, this work sharply analyzes: (1) mini-batching, a method of averaging many samples of the gradient both to reduce the variance of a stochastic gradient estimate and to parallelize SGD, and (2) tail-averaging, a method of averaging the final few iterates of SGD in order to decrease the variance in SGD's final iterate. This work presents the first tight non-asymptotic generalization error bounds for these schemes for the stochastic approximation problem of least squares regression. Furthermore, this work establishes a precise problem-dependent extent to which mini-batching can be used to yield provable near-linear parallelization speedups over SGD with batch size one. These results are utilized in providing a highly parallelizable SGD algorithm that obtains the optimal statistical error rate with nearly the same number of serial updates as batch gradient descent, which improves significantly over existing SGD-style methods. Finally, this work sheds light on some fundamental differences in SGD's behavior when dealing with agnostic noise in the (non-realizable) least squares regression problem. In particular, the work shows that the stepsizes that ensure optimal statistical error rates for the agnostic case must be a function of the noise properties. The central analysis tools used by this paper are obtained by generalizing the operator view of averaged SGD, introduced by Defossez and Bach (2015), and then developing a novel analysis that bounds these operators to characterize the generalization error. These techniques may be of broader interest in analyzing various computational aspects of stochastic approximation.
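The two averaging schemes can be sketched for least squares regression as follows. This is a minimal illustration, not the algorithm or the step-size schedule analyzed in the paper: the function name `minibatch_sgd_tail_average`, the constant step size, the batch size, and the choice to average the last half of the iterates are all assumptions.

```python
import numpy as np


def minibatch_sgd_tail_average(X, y, step_size, batch_size, n_steps,
                               tail_fraction=0.5, seed=0):
    """Mini-batch SGD on the squared loss, returning the tail-averaged iterate."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    tail_start = int(n_steps * (1.0 - tail_fraction))
    tail_sum = np.zeros(d)
    for t in range(n_steps):
        # Mini-batching: average the gradient over batch_size sampled rows.
        idx = rng.integers(0, n, size=batch_size)
        residual = X[idx] @ w - y[idx]
        grad = X[idx].T @ residual / batch_size
        w = w - step_size * grad
        # Tail-averaging: accumulate only the final iterates.
        if t >= tail_start:
            tail_sum += w
    return tail_sum / (n_steps - tail_start)


# Synthetic least squares problem (hypothetical data).
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true + 0.1 * rng.normal(size=2000)

w_hat = minibatch_sgd_tail_average(X, y, step_size=0.05, batch_size=16,
                                   n_steps=2000)
print(np.linalg.norm(w_hat - w_true))
```

The per-row gradient computations inside a mini-batch are independent of one another, which is what makes that step straightforward to parallelize.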

artificial intelligence, generalization error, machine learning, (16 more...)

arXiv.org Machine Learning

1610.03774

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
