AITopics | Accuracy

Collaborating Authors

Accuracy

News Overviews Instructional Materials AI-Alerts Classics

Minimizing The Misclassification Error Rate Using a Surrogate Convex Loss

Ben-David, Shai, Loker, David, Srebro, Nathan, Sridharan, Karthik

arXiv.org Machine LearningJun-27-2012

We carefully study how well minimizing convex surrogate loss functions, corresponds to minimizing the misclassification error rate for the problem of binary classification with linear predictors. In particular, we show that amongst all convex surrogate losses, the hinge loss gives essentially the best possible bound, of all convex loss functions, for the misclassification error rate of the resulting linear predictor in terms of the best possible margin error rate. We also provide lower bounds for specific convex surrogates that show how different commonly used losses qualitatively differ from each other.

artificial intelligence, hinge loss, machine learning, (14 more...)

arXiv.org Machine Learning

1206.6442

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Fast classification using sparse decision DAGs

Benbouzid, Djalel, Busa-Fekete, Robert, Kegl, Balazs

arXiv.org Machine LearningJun-27-2012

In this paper we propose an algorithm that builds sparse decision DAGs (directed acyclic graphs) from a list of base classifiers provided by an external learning method such as AdaBoost. The basic idea is to cast the DAG design task as a Markov decision process. Each instance can decide to use or to skip each base classifier, based on the current state of the classifier being built. The result is a sparse decision DAG where the base classifiers are selected in a data-dependent way. The method has a single hyperparameter with a clear semantics of controlling the accuracy/speed trade-off. The algorithm is competitive with state-of-the-art cascade detectors on three object-detection benchmarks, and it clearly outperforms them when there is a small number of base classifiers. Unlike cascades, it is also readily applicable for multi-class classification. Using the multi-class setup, we show on a benchmark web page ranking data set that we can significantly improve the decision speed without harming the performance of the ranker.

classifier, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

1206.6387

Country: Europe > United Kingdom (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Add feedback

Feature Selection via Probabilistic Outputs

Danyluk, Andrea, Arnosti, Nicholas

arXiv.org Machine LearningJun-27-2012

This paper investigates two feature-scoring criteria that make use of estimated class probabilities: one method proposed by \citet{shen} and a complementary approach proposed below. We develop a theoretical framework to analyze each criterion and show that both estimate the spread (across all values of a given feature) of the probability that an example belongs to the positive class. Based on our analysis, we predict when each scoring technique will be advantageous over the other and give empirical results validating our predictions.

artificial intelligence, classifier, machine learning, (15 more...)

arXiv.org Machine Learning

1206.6383

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Demand-Driven Clustering in Relational Domains for Predicting Adverse Drug Events

Davis, Jesse, Costa, Vitor Santos, Peissig, Peggy, Caldwell, Michael, Berg, Elizabeth, Page, David

arXiv.org Artificial IntelligenceJun-27-2012

Learning from electronic medical records (EMR) is challenging due to their relational nature and the uncertain dependence between a patient's past and future health status. Statistical relational learning is a natural fit for analyzing EMRs but is less adept at handling their inherent latent structure, such as connections between related medications or diseases. One way to capture the latent structure is via a relational clustering of objects. We propose a novel approach that, instead of pre-clustering the objects, performs a demand-driven clustering during learning. We evaluate our algorithm on three real-world tasks where the goal is to use EMRs to predict whether a patient will have an adverse reaction to a medication. We find that our approach is more accurate than performing no clustering, pre-clustering, and using expert-constructed medical heterarchies.

artificial intelligence, latent structure, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1206.6399

Country:

Europe (0.93)
North America > United States > Wisconsin (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Predictive Approaches For Gaussian Process Classifier Model Selection

Sellamanickam, Sundararajan, Selvaraj, Sathiya Keerthi

arXiv.org Machine LearningJun-26-2012

In this paper we consider the problem of Gaussian process classifier (GPC) model selection with different Leave-One-Out (LOO) Cross Validation (CV) based optimization criteria and provide a practical algorithm using LOO predictive distributions with such criteria to select hyperparameters. Apart from the standard average negative logarithm of predictive probability (NLP), we also consider smoothed versions of criteria such as F-measure and Weighted Error Rate (WER), which are useful for handling imbalanced data. Unlike the regression case, LOO predictive distributions for the classifier case are intractable. We use approximate LOO predictive distributions arrived from Expectation Propagation (EP) approximation. We conduct experiments on several real world benchmark datasets. When the NLP criterion is used for optimizing the hyperparameters, the predictive approaches show better or comparable NLP generalization performance with existing GPC approaches. On the other hand, when the F-measure criterion is used, the F-measure generalization performance improves significantly on several datasets. Overall, the EP-based predictive algorithm comes out as an excellent choice for GP classifier model selection with different optimization criteria.

artificial intelligence, hyperparameter, machine learning, (18 more...)

arXiv.org Machine Learning

1206.6038

Country: North America (0.14)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Add feedback

Semi-Supervised Learning of Class Balance under Class-Prior Change by Distribution Matching

Plessis, Marthinus Du, Sugiyama, Masashi

arXiv.org Machine LearningJun-18-2012

In real-world classification problems, the class balance in the training dataset does not necessarily reflect that of the test dataset, which can cause significant estimation bias. If the class ratio of the test dataset is known, instance re-weighting or resampling allows systematical bias correction. However, learning the class ratio of the test dataset is challenging when no labeled data is available from the test domain. In this paper, we propose to estimate the class ratio in the test dataset by matching probability distributions of training and test input data. We demonstrate the utility of the proposed approach through experiments.

artificial intelligence, estimator, machine learning, (17 more...)

arXiv.org Machine Learning

1206.4677

Country:

North America > United States (0.48)
Europe > United Kingdom (0.46)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Add feedback

Learning to Identify Regular Expressions that Describe Email Campaigns

Prasse, Paul, Sawade, Christoph, Landwehr, Niels, Scheffer, Tobias

arXiv.org Machine LearningJun-18-2012

This paper addresses the problem of inferring a regular expression from a given set of strings that resembles, as closely as possible, the regular expression that a human expert would have written to identify the language. This is motivated by our goal of automating the task of postmasters of an email service who use regular expressions to describe and blacklist email spam campaigns. Training data contains batches of messages and corresponding regular expressions that an expert postmaster feels confident to blacklist. We model this task as a learning problem with structured output spaces and an appropriate loss function, derive a decoder and the resulting optimization problem, and a report on a case study conducted with an email service.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Machine Learning

1206.4637

Country: Europe > Germany (0.15)

Genre: Research Report (0.64)

Technology:

Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.53)

Add feedback

Machine Learning that Matters

Wagstaff, Kiri

arXiv.org Artificial IntelligenceJun-18-2012

Much of current machine learning (ML) research has lost its connection to problems of import to the larger world of science and society. From this perspective, there exist glaring limitations in the data sets we investigate, the metrics we employ for evaluation, and the degree to which results are communicated back to their originating domains. What changes are needed to how we conduct research to increase the impact that ML has? We present six Impact Challenges to explicitly focus the field's energy and attention, and we discuss existing obstacles that must be addressed. We aim to inspire ongoing discussion and focus on ML that matters.

application, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1206.4656

Country: North America > United States > California (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry:

Health & Medicine (1.00)
Government (0.69)
Education (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

Add feedback

Multiple Operator-valued Kernel Learning

Kadri, Hachem, Rakotomamonjy, Alain, Bach, Francis, Preux, Philippe

arXiv.org Machine LearningJun-14-2012

Positive definite operator-valued kernels generalize the well-known notion of reproducing kernels, and are naturally adapted to multi-output learning situations. This paper addresses the problem of learning a finite linear combination of infinite-dimensional operator-valued kernels which are suitable for extending functional data analysis methods to nonlinear contexts. We study this problem in the case of kernel ridge regression for functional responses with an lr-norm constraint on the combination coefficients. The resulting optimization problem is more involved than those of multiple scalar-valued kernel learning since operator-valued kernels pose more technical and theoretical issues. We propose a multiple operator-valued kernel learning algorithm based on solving a system of linear operator equations by using a block coordinatedescent procedure. We experimentally validate our approach on a functional regression task in the context of finger movement prediction in brain-computer interfaces.

artificial intelligence, kernel, machine learning, (17 more...)

arXiv.org Machine Learning

1203.1596

Country:

Europe > France (0.46)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)

Add feedback

Small Sample Inference for Generalization Error in Classification Using the CUD Bound

Laber, Eric B., Murphy, Susan A.

arXiv.org Machine LearningJun-13-2012

Confidence measures for the generalization error are crucial when small training samples are used to construct classifiers. A common approach is to estimate the generalization error by resampling and then assume the resampled estimator follows a known distribution to form a confidence set [Kohavi 1995, Martin 1996,Yang 2006]. Alternatively, one might bootstrap the resampled estimator of the generalization error to form a confidence set. Unfortunately, these methods do not reliably provide sets of the desired confidence. The poor performance appears to be due to the lack of smoothness of the generalization error as a function of the learned classifier. This results in a non-normal distribution of the estimated generalization error. We construct a confidence set for the generalization error by use of a smooth upper bound on the deviation between the resampled estimate and generalization error. The confidence set is formed by bootstrapping this upper bound. In cases in which the approximation class for the classifier can be represented as a parametric additive model, we provide a computationally efficient algorithm. This method exhibits superior performance across a series of test and simulated data sets.

artificial intelligence, generalization error, machine learning, (16 more...)

arXiv.org Machine Learning

1206.3274

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback