AITopics | Accuracy

Accuracy

Cost-Sensitive Support Vector Machines

Masnadi-Shirazi, Hamed, Vasconcelos, Nuno, Iranmehr, Arya

arXiv.org Machine Learning, Feb-15-2015

A new procedure for learning cost-sensitive SVM (CS-SVM) classifiers is proposed. The SVM hinge loss is extended to the cost-sensitive setting, and the CS-SVM is derived as the minimizer of the associated risk. The extension of the hinge loss draws on recent connections between risk minimization and probability elicitation. These connections are generalized to cost-sensitive classification, in a manner that guarantees consistency with the cost-sensitive Bayes risk and the associated Bayes decision rule. This ensures that optimal decision rules, under the new hinge loss, implement the Bayes-optimal cost-sensitive classification boundary. Minimization of the new hinge loss is shown to be a generalization of the classic SVM optimization problem, and can be solved by identical procedures. The dual problem of CS-SVM is carefully scrutinized by means of regularization theory and sensitivity analysis, and the CS-SVM algorithm is substantiated. The proposed algorithm is also extended to cost-sensitive learning with example-dependent costs. The minimum cost-sensitive risk is proposed as the performance measure and is connected to ROC analysis through vector optimization. The resulting algorithm avoids the shortcomings of previous approaches to cost-sensitive SVM design, and is shown to have superior experimental performance on a large number of cost-sensitive and imbalanced datasets.
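To make the idea of a cost-sensitive hinge loss concrete, here is a minimal sketch of the simplest common variant, a class-weighted hinge loss. It is not the loss derived in the paper, which this listing does not reproduce; the function names and the cost parameters `c_pos` and `c_neg` are assumptions chosen for illustration.

```python
def weighted_hinge_loss(y, score, c_pos=2.0, c_neg=1.0):
    """Class-weighted hinge loss for a label y in {-1, +1} and a real-valued score.

    c_pos and c_neg are hypothetical misclassification costs for the
    positive and negative class; they are not taken from the paper.
    """
    cost = c_pos if y == 1 else c_neg
    return cost * max(0.0, 1.0 - y * score)


def empirical_risk(labels, scores, c_pos=2.0, c_neg=1.0):
    """Average weighted hinge loss over a sample."""
    losses = [weighted_hinge_loss(y, s, c_pos, c_neg)
              for y, s in zip(labels, scores)]
    return sum(losses) / len(losses)
```

With `c_pos > c_neg`, a margin violation on a positive example is penalized more than the same violation on a negative one, which pushes the learned boundary toward the cheaper kind of error.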

artificial intelligence, dataset, machine learning, (15 more...)

1212.0975

Country: North America > United States > California > San Francisco County > San Francisco (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

ggRandomForests: Visually Exploring a Random Forest for Regression

Ehrlinger, John

arXiv.org Machine Learning, Feb-13-2015

Random Forests [Breiman:2001] (RF) are a fully non-parametric statistical method requiring no distributional assumptions on covariate relation to the response. RF are a robust, nonlinear technique that optimizes predictive accuracy by fitting an ensemble of trees to stabilize model estimates. The randomForestSRC package (http://cran.r-project.org/package=randomForestSRC) is a unified treatment of Breiman's random forests for survival, regression and classification problems. Predictive accuracy makes RF an attractive alternative to parametric models, though complexity and interpretability of the forest hinder wider application of the method. We introduce the ggRandomForests package (http://cran.r-project.org/package=ggRandomForests) for visually understanding random forest models grown in R with the randomForestSRC package. The vignette is a tutorial for using the ggRandomForests package with the randomForestSRC package for building and post-processing a regression random forest. In this tutorial, we explore a random forest model for the Boston Housing Data, available in the MASS package. We grow a random forest for regression and demonstrate how ggRandomForests can be used when determining variable associations, interactions and how the response depends on predictive variables within the model. The tutorial demonstrates the design and usage of many of the ggRandomForests functions and features, and shows how to modify and customize the resulting ggplot2 graphic objects along the way. A development version of the ggRandomForests package is available on GitHub. We invite comments, feature requests and bug reports for this package at https://github.com/ehrlinger/ggRandomForests.

artificial intelligence, machine learning, random forest, (15 more...)

1501.07196

Country: North America > United States > California (0.46)

Genre: Research Report > Experimental Study (0.46)

Industry: Banking & Finance > Real Estate (0.37)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.54)

Massively Multitask Networks for Drug Discovery

Ramsundar, Bharath, Kearnes, Steven, Riley, Patrick, Webster, Dale, Konerding, David, Pande, Vijay

arXiv.org Machine Learning, Feb-6-2015

Massively multitask neural architectures provide a learning framework for drug discovery that synthesizes information from many distinct biological sources. To train these architectures at scale, we gather large amounts of data from public sources to create a dataset of nearly 40 million measurements across more than 200 biological targets. We investigate several aspects of the multitask framework by performing a series of empirical studies and obtain some interesting results: (1) massively multitask networks obtain predictive accuracies significantly better than single-task methods, (2) the predictive power of multitask networks improves as additional tasks and data are added, (3) the total amount of data and the total number of tasks both contribute significantly to multitask improvement, and (4) multitask networks afford limited transferability to tasks not in the training set. Our results underscore the need for greater data sharing and further algorithmic innovation to accelerate the drug discovery process.

artificial intelligence, dataset, machine learning, (12 more...)

1502.02072

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Generalized Dantzig Selector: Application to the k-support norm

Chatterjee, Soumyadeep, Chen, Sheng, Banerjee, Arindam

arXiv.org Machine Learning, Feb-2-2015

We propose a Generalized Dantzig Selector (GDS) for linear models, in which any norm encoding the parameter structure can be leveraged for estimation. We investigate both computational and statistical aspects of the GDS. Based on the conjugate proximal operator, a flexible inexact ADMM framework is designed for solving the GDS, and non-asymptotic high-probability bounds are established on the estimation error, which rely on the Gaussian widths of the unit norm ball and of a suitable set encompassing the estimation error. Further, we consider a non-trivial example of the GDS using the $k$-support norm. We derive an efficient method to compute the proximal operator for the $k$-support norm, since existing methods are inapplicable in this setting. For statistical analysis, we provide upper bounds for the Gaussian widths needed in the GDS analysis, yielding the first statistical recovery guarantee for estimation with the $k$-support norm. The experimental results confirm our theoretical analysis.
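For orientation, the classical Dantzig selector and one natural way to generalize it to an arbitrary norm can be written as follows. The symbols $X$ (design matrix), $y$ (response), $\theta$ (parameter), $\lambda$ (tuning parameter), $R$ (a norm) and $R^{*}$ (its dual norm) are notation assumed here; the abstract itself does not state the formulation, so the second program is a plausible reading of it and not a quotation.

```latex
% Classical Dantzig selector: l1 objective, l-infinity constraint
\hat{\theta} = \arg\min_{\theta} \|\theta\|_{1}
\quad \text{s.t.} \quad \|X^{\top}(y - X\theta)\|_{\infty} \le \lambda

% Generalization to a norm R with dual norm R^{*} (assumed form)
\hat{\theta} = \arg\min_{\theta} R(\theta)
\quad \text{s.t.} \quad R^{*}\!\left(X^{\top}(y - X\theta)\right) \le \lambda
```

Taking $R = \|\cdot\|_{1}$ gives $R^{*} = \|\cdot\|_{\infty}$, so the second program reduces to the first.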

artificial intelligence, k-support norm, machine learning, (14 more...)

1406.5291

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Cascading Randomized Weighted Majority: A New Online Ensemble Learning Algorithm

Zamani, Mohammadzaman, Beigy, Hamid, Shaban, Amirreza

arXiv.org Machine Learning, Feb-2-2015

With the increasing volume of data in the world, the best approach for learning from this data is to exploit an online learning algorithm. Online ensemble methods are online algorithms which take advantage of an ensemble of classifiers to predict labels of data. Prediction with expert advice is a well-studied problem in the online ensemble learning literature. The Weighted Majority algorithm and the randomized weighted majority (RWM) are the most well-known solutions to this problem, aiming to converge to the best expert. Since, among a set of experts, the best one does not necessarily have the minimum error in all regions of the data space, defining specific regions and converging to the best expert in each of these regions will lead to a better result. In this paper, we aim to resolve this defect of RWM algorithms by proposing a novel online ensemble algorithm for the problem of prediction with expert advice. We propose a cascading version of RWM to achieve not only better experimental results but also a better error bound for sufficiently large datasets.
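The baseline that the abstract builds on, plain randomized weighted majority, can be sketched in a few lines. This is the standard textbook RWM, not the cascading variant the paper proposes; the function names, the penalty factor `beta` and the fixed `seed` are assumptions made for illustration.

```python
import random


def rwm_predict(weights, advice, rng):
    """Follow one expert drawn with probability proportional to its weight."""
    chosen = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return advice[chosen]


def rwm_update(weights, advice, outcome, beta=0.5):
    """Multiply the weight of every expert that was wrong by beta (0 < beta < 1)."""
    return [w * beta if a != outcome else w for w, a in zip(weights, advice)]


def run_rwm(advice_rounds, outcomes, beta=0.5, seed=0):
    """Run RWM over a sequence of rounds; return final weights and mistake count."""
    rng = random.Random(seed)
    weights = [1.0] * len(advice_rounds[0])
    mistakes = 0
    for advice, outcome in zip(advice_rounds, outcomes):
        if rwm_predict(weights, advice, rng) != outcome:
            mistakes += 1
        weights = rwm_update(weights, advice, outcome, beta)
    return weights, mistakes
```

Because every expert keeps a single global weight, the procedure converges toward the expert that is best overall, which is the limitation the abstract points to when different experts are best in different regions of the data space.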

algorithm, artificial intelligence, machine learning, (16 more...)

1403.0388

Country: Asia > Middle East (0.28)

Genre:

Research Report > New Finding (0.68)
Instructional Material > Online (0.61)

Industry:

Education > Educational Setting > Online (0.48)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Feature Selection with Redundancy-complementariness Dispersion

Chen, Zhijun, Wu, Chaozhong, Zhang, Yishi, Huang, Zhen, Ran, Bin, Zhong, Ming, Lyu, Nengchao

arXiv.org Machine Learning, Feb-1-2015

Feature selection has attracted significant attention in data mining and machine learning in the past decades. Many existing feature selection methods eliminate redundancy by measuring pairwise inter-correlation of features, whereas the complementariness of features and higher inter-correlation among more than two features are ignored. In this study, a modification item concerning the complementariness of features is introduced in the evaluation criterion of features. Additionally, in order to identify the interference effect of already-selected False Positives (FPs), the redundancy-complementariness dispersion is also taken into account to adjust the measurement of pairwise inter-correlation of features. To illustrate the effectiveness of the proposed method, classification experiments are conducted with four frequently used classifiers on ten datasets. Classification results verify the superiority of the proposed method compared with five representative feature selection methods.

Keywords: Classification, Feature selection, Relevance, Redundancy, Complementariness, Redundancy-complementariness dispersion

1. Introduction

The dimensionality and size of data are growing rapidly in most fields, which challenges data mining and machine learning techniques. Feature selection is an important and useful method that can effectively reduce the dimensionality of feature space while retaining a relatively high accuracy in representing the original data. Feature selection [9] has been widely recognized for its ability to facilitate data interpretation, reduce acquisition and storage requirements, increase learning speeds, improve generalization performance, etc.
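The kind of baseline the abstract contrasts itself with, greedy selection that trades relevance against pairwise redundancy, can be sketched as below. This is a generic relevance-minus-redundancy criterion using absolute Pearson correlation, not the redundancy-complementariness dispersion method of the paper; `pearson`, `greedy_select` and the scoring rule are assumptions for illustration, and every feature column is assumed to be non-constant.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def greedy_select(features, target, k):
    """Pick k feature indices, one at a time, by relevance minus mean redundancy.

    features is a list of columns; relevance is |corr(feature, target)| and
    redundancy is the mean |corr| with the features already selected.
    """
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            relevance = abs(pearson(features[j], target))
            if not selected:
                return relevance
            redundancy = sum(abs(pearson(features[j], features[s]))
                             for s in selected) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

A criterion of this form only ever looks at pairs of features, so it cannot reward two features that are individually weak but informative together, which is the gap the abstract's complementariness term is meant to address.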

artificial intelligence, correlation, machine learning, (17 more...)

1502.00231

Country:

North America > United States > California (0.68)
North America > United States > Wisconsin > Dane County > Madison (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.93)
Health & Medicine > Therapeutic Area (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)

A New Intelligence Based Approach for Computer-Aided Diagnosis of Dengue Fever

Rao, Vadrevu Sree Hari, Kumar, Mallenahalli Naresh

arXiv.org Artificial Intelligence, Jan-30-2015

Identification of the influential clinical symptoms and laboratory features that help in the diagnosis of dengue fever in the early phase of the illness would aid in designing effective public health management and virological surveillance strategies. Keeping this as our main objective, we develop in this paper a new computational-intelligence-based methodology that predicts the diagnosis in real time, minimizing the number of false positives and false negatives. Our methodology consists of three major components: (i) a novel missing value imputation procedure that can be applied to any data set consisting of categorical (nominal) and/or numeric (real or integer) attributes; (ii) a wrapper-based feature selection method with genetic search for extracting a subset of the most influential symptoms that can diagnose the illness; and (iii) an alternating decision tree method that employs boosting for generating highly accurate decision rules. The predictive models developed using our methodology are found to be more accurate than the state-of-the-art methodologies used in the diagnosis of dengue fever.

artificial intelligence, diagnosis, machine learning, (15 more...)

doi: 10.1109/TITB.2011.2171978

1502.00062

Country:

North America > United States (0.46)
Asia > India (0.28)

Genre:

Research Report > New Finding (0.94)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

High-Dimensional Longitudinal Classification with the Multinomial Fused Lasso

Adhikari, Samrachana, Lecci, Fabrizio, Becker, James T., Junker, Brian W., Kuller, Lewis H., Lopez, Oscar L., Tibshirani, Ryan J.

arXiv.org Machine Learning, Jan-29-2015

We study regularized estimation in high-dimensional longitudinal classification problems, using the lasso and fused lasso regularizers. The constructed coefficient estimates are piecewise constant across the time dimension in the longitudinal problem, with adaptively selected change points (break points). We present an efficient algorithm for computing such estimates, based on proximal gradient descent. We apply our proposed technique to a longitudinal data set on Alzheimer's disease from the Cardiovascular Health Study Cognition Study, and use this data set to motivate and demonstrate several practical considerations such as the selection of tuning parameters, and the assessment of model stability.
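The piecewise-constant behaviour described above comes from the shape of the penalty. Below is a minimal sketch for a single coefficient tracked over time, assuming the usual fused lasso form (an absolute-value term on each coefficient plus one on each successive difference). The multinomial model and the proximal gradient solver from the paper are not reproduced, and the names `lam_sparse`, `lam_fuse` and `tol` are assumptions.

```python
def fused_lasso_penalty(beta, lam_sparse, lam_fuse):
    """Lasso term on the coefficients plus a fusion term on successive differences."""
    sparsity = sum(abs(b) for b in beta)
    fusion = sum(abs(b - a) for a, b in zip(beta, beta[1:]))
    return lam_sparse * sparsity + lam_fuse * fusion


def change_points(beta, tol=1e-8):
    """Time indices where the coefficient path jumps by more than tol."""
    return [t for t in range(1, len(beta))
            if abs(beta[t] - beta[t - 1]) > tol]
```

For the path `[0.0, 0.0, 2.0, 2.0, 2.0, 0.0]` the fusion term is charged only at the two jumps, so a larger `lam_fuse` favours estimates with fewer change points while `lam_sparse` favours coefficients that are exactly zero.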

artificial intelligence, coefficient, machine learning, (17 more...)

1501.07518

Country: North America > United States (0.68)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
(2 more...)

Understanding Kernel Ridge Regression: Common behaviors from simple functions to density functionals

Vu, Kevin, Snyder, John, Li, Li, Rupp, Matthias, Chen, Brandon F., Khelif, Tarek, Müller, Klaus-Robert, Burke, Kieron

arXiv.org Machine Learning, Jan-28-2015

Machine learning (ML) is a powerful data-driven method for learning patterns in high-dimensional spaces via induction, and has had widespread success in many fields including more recent applications in quantum chemistry and materials science [1-9]. Here we are interested in applications of ML to construction of density functionals [10-14], which have focused so far on approximating the kinetic energy (KE) of non-interacting electrons. An accurate, general approximation to this could make orbital-free DFT a practical reality. However, ML methods have been developed within the areas of statistics and computer science, and have been applied to a huge variety of data, including neuroscience, image and text processing, and robotics [15]. Thus, they are quite general and have not been tailored to account for specific details of the quantum problem.

artificial intelligence, machine learning, natural language, (20 more...)

1501.03854

Country:

Europe (0.68)
North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.40)

MACHINE INTELLIGENCE 13

AI Classics, Jan-25-2015, 22:20:22 GMT

The two outstanding figures in the history of computer science are Alan Turing and John von Neumann, and they shared the view that logic was the key to understanding and automating computation. In particular, it was Turing who gave us in the mid-1930s the fundamental analysis, and the logical definition, of the concept of 'computability by machine' and who discovered the surprising and beautiful basic fact that there exist universal machines which by suitable programming can be made to t

This essay is an expanded and revised version of one entitled The Role of Logic in Computer Science and Artificial Intelligence, which was completed in January 1992 (and was later published in the Proceedings of the Fifth Generation Computer Systems 1992 Conference). Since completing that essay I have had the benefit of extremely helpful discussions on many of the details with Professor Donald Michie and Professor I. J. Good, both of whom knew Turing well during the war years at Bletchley Park. Professor J. A. N. Lee, whose knowledge of the literature and archives of the history of computing is encyclopedic, also provided additional information, some of which is still unpublished. Further light has very recently been shed on the von Neumann side of the story by Norman Macrae's excellent biography John von Neumann (Macrae 1992). Accordingly, it seemed appropriate to undertake a more complete and thorough version of the FGCS'92 essay, focussing somewhat more on the interesting historical and biographical issues. I am grateful to Donald Michie and Stephen Muggleton for inviting me to contribute such a 'second edition' to the present volume, and I would also like to thank the Institute for New Computer Technology (ICOT) for kind permission to make use of the FGCS'92 essay in this way.

1 LOGIC, COMPUTERS, TURING, AND VON NEUMANN

canada government, hitachi, ltd., university of pittsburgh, (83 more...)

Country:

Asia (1.00)
Europe > Germany (0.92)
North America > Canada (0.92)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
(3 more...)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Transportation > Air (1.00)
(11 more...)

Technology:

Information Technology > Artificial Intelligence > History (1.80)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.02)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.02)
(16 more...)
