AITopics | Accuracy

Appropriateness of Performance Indices for Imbalanced Data Classification: An Analysis

Mullick, Sankha Subhra, Datta, Shounak, Dhekane, Sourish Gunesh, Das, Swagatam

arXiv.org Machine Learning, Aug-26-2020

Indices quantifying the performance of classifiers under class imbalance often suffer from distortions depending on the constitution of the test set or the class-specific classification accuracy, creating difficulties in assessing the merit of the classifier. We identify two fundamental conditions that a performance index must satisfy to be resilient, respectively, to changes in the number of testing instances from each class and in the number of classes in the test set. In light of these conditions, under the effect of class imbalance, we theoretically analyze four indices commonly used for evaluating binary classifiers and five popular indices for multi-class classifiers. For indices violating any of the conditions, we also suggest remedial modification and normalization. We further investigate the capability of the indices to retain information about the classification performance over all the classes, even when the classifier exhibits extreme performance on some classes. Simulation studies are performed on high-dimensional deep representations of a subset of the ImageNet dataset using four state-of-the-art classifiers tailored for handling class imbalance. Finally, based on our theoretical findings and empirical evidence, we recommend the appropriate indices that should be used to evaluate the performance of classifiers in the presence of class imbalance.

artificial intelligence, classifier, machine learning, (16 more...)

doi: 10.1016/j.patcog.2020.107197

arXiv: 2008.11752

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
Asia > India > West Bengal > Kolkata (0.04)
Asia > India > Assam > Guwahati (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Improving Fairness in Criminal Justice Algorithmic Risk Assessments Using Conformal Prediction Sets

Berk, Richard A., Kuchibhotla, Arun Kumar

arXiv.org Machine Learning, Aug-26-2020

Risk assessment algorithms have been correctly criticized for potential unfairness, and there is an active cottage industry trying to make repairs. In this paper, we adopt a framework from conformal prediction sets to remove unfairness from risk algorithms themselves and the covariates used for forecasting. From a sample of 300,000 offenders at their arraignments, we construct a confusion table and its derived measures of fairness that are effectively free of any meaningful differences between Black and White offenders. We also produce fair forecasts for individual offenders coupled with valid probability guarantees that the forecasted outcome is the true outcome. We see our work as a demonstration of concept for application in a wide variety of criminal justice decisions. The procedures provided can be routinely implemented in jurisdictions with the usual criminal justice datasets used by administrators. The requisite procedures can be found in the scripting software R. However, whether stakeholders will accept our approach as a means to achieve risk assessment fairness is unknown. There also are legal issues that would need to be resolved, although we offer a Pareto improvement.
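For readers unfamiliar with conformal prediction sets, the generic split-conformal recipe for classification can be sketched as follows. This is not the authors' procedure (theirs is implemented in R and includes fairness adjustments); it is a minimal illustration that assumes a classifier has already produced class probabilities, and every name, label, and number in it is hypothetical.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold from a held-out calibration set (split conformal).

    The nonconformity score is 1 minus the probability assigned to the true label.
    """
    scores = sorted(1.0 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the finite-sample quantile
    if k > n:
        return float("inf")  # too few calibration points: every label is kept
    return scores[k - 1]

def prediction_set(probs, threshold):
    """All labels whose nonconformity score does not exceed the threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= threshold}

# Hypothetical calibration data: predicted probabilities and true outcomes.
cal_probs = [
    {"fail": 0.9, "no_fail": 0.1},
    {"fail": 0.4, "no_fail": 0.6},
    {"fail": 0.7, "no_fail": 0.3},
]
cal_labels = ["fail", "no_fail", "fail"]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
result = prediction_set({"fail": 0.75, "no_fail": 0.25}, threshold)
print(result)
```

Under exchangeability of calibration and new cases, sets built this way contain the true label with probability at least 1 - alpha; a set may hold one label, several, or none.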

artificial intelligence, data mining, machine learning, (18 more...)

arXiv: 2008.11664

Country:

North America > United States > New York (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (0.91)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Security & Privacy (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Plotting a Confusion Matrix- Machine Learning in Python

#artificialintelligence, Aug-25-2020, 06:05:48 GMT

In this blog post, I will be explaining how to plot confusion matrices in Python. This is my second blog post on the Confusion Matrix. If you want to understand what a confusion matrix is and how to get insights from the confusion matrix, check out my first blog post. I have attached the link below. Now, without further ado, let's dive into how to plot a confusion matrix.
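The linked post's own code is not reproduced in this listing. As a minimal sketch of the idea, assuming plain Python lists of labels and a text table in place of a plotting library, a confusion matrix can be counted and displayed like this (the labels and helper names are made up for illustration):

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def render(matrix, labels):
    """Return the matrix as an aligned text table with row and column labels."""
    width = max(len(str(x)) for row in matrix for x in row)
    width = max(width, max(len(str(label)) for label in labels))
    header = " " * (width + 1) + " ".join(str(label).rjust(width) for label in labels)
    lines = [header]
    for label, row in zip(labels, matrix):
        cells = " ".join(str(x).rjust(width) for x in row)
        lines.append(str(label).rjust(width) + " " + cells)
    return "\n".join(lines)

# Hypothetical labels, for illustration only.
y_true = ["cat", "cat", "dog", "dog", "dog", "cat"]
y_pred = ["cat", "dog", "dog", "dog", "cat", "cat"]
labels = ["cat", "dog"]

cm = confusion_matrix(y_true, y_pred, labels)
print(render(cm, labels))
```

The same nested list could be passed to a heatmap routine in a plotting library; the counting step stays the same.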

artificial intelligence, confusion matrix, machine learning, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

CareCall: a Call-Based Active Monitoring Dialog Agent for Managing COVID-19 Pandemic

Lee, Sang-Woo, Jung, Hyunhoon, Ko, SukHyun, Kim, Sunyoung, Kim, Hyewon, Doh, Kyoungtae, Park, Hyunjung, Yeo, Joseph, Ok, Sang-Houn, Lee, Joonhaeng, Lim, Sungsoon, Jeong, Minyoung, Choi, Seongjae, Hwang, SeungTae, Park, Eun-Young, Ma, Gwang-Ja, Han, Seok-Joo, Cha, Kwang-Seung, Sung, Nako, Ha, Jung-Woo

arXiv.org Artificial Intelligence, Aug-25-2020

Tracking suspected cases of COVID-19 is crucial to suppressing the spread of the COVID-19 pandemic. Active monitoring and proactive inspection are indispensable to mitigate the spread of COVID-19, though these require considerable social and economic expense. To address this issue, we introduce CareCall, a call-based dialog agent which is deployed for active monitoring in Korea and Japan. We describe our system through a case study with statistics to show how the system works. Finally, we discuss a simple idea which uses CareCall to support proactive inspection.

carecall, machine learning, natural language, (14 more...)

arXiv: 2007.02642

Country:

Asia > Japan (0.25)
Europe > United Kingdom (0.04)
Asia > South Korea (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.32)

Context-Dependent Implicit Authentication for Wearable Device User

Cheung, William, Vhaduri, Sudip

arXiv.org Machine Learning, Aug-25-2020

As market wearables become popular and provide a range of services based on a user's private information, including making financial transactions and accessing cars, the security of this information is becoming very important. However, users are often flooded with PINs and passwords in this internet of things (IoT) world. Additionally, hard-biometric-based authentication, such as facial or finger recognition, is not adaptable for market wearables due to their limited sensing and computation capabilities. Therefore, there is a timely need to develop a burden-free implicit authentication mechanism for wearables using the less-informative soft-biometric data that are easily obtainable from market wearables. In this work, we present a context-dependent soft-biometric-based wearable authentication system utilizing heart rate, gait, and breathing audio signals. From our detailed analysis, we find that a binary support vector machine (SVM) with a radial basis function (RBF) kernel can achieve an average accuracy of $0.94 \pm 0.07$, an $F_1$ score of $0.93 \pm 0.08$, and an equal error rate (EER) of about $0.06$ at a lower confidence threshold of 0.52, which shows the promise of this work.
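The equal error rate quoted above is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine users rejected). The sketch below shows one simple way to estimate it from confidence scores; it is not the paper's evaluation code, and the scores and function names are invented for illustration.

```python
def far_frr(genuine, impostor, threshold):
    """False acceptance and false rejection rates when accepting scores >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr

def equal_error_rate(genuine, impostor):
    """Scan observed scores as thresholds; return (EER estimate, threshold).

    The EER is approximated by the mean of FAR and FRR at the threshold
    where the two rates are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

# Hypothetical classifier confidence scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]    # scores for the legitimate wearer
impostor = [0.1, 0.2, 0.3, 0.6]   # scores for other people

eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)
```

With only a handful of scores the two rates move in coarse steps, so the estimate is approximate; real evaluations typically interpolate along the full error curve.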

artificial intelligence, authentication, machine learning, (18 more...)

arXiv: 2008.12145

Country:

North America > United States (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre: Research Report (0.64)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.86)

SOAR: Simultaneous Or of And Rules for Classification of Positive & Negative Classes

Khusainova, Elena, Dodwell, Emily, Mitra, Ritwik

arXiv.org Machine Learning, Aug-25-2020

Algorithmic decision making has proliferated and now impacts our daily lives in both mundane and consequential ways. Machine learning practitioners make use of a myriad of algorithms for predictive models in applications as diverse as movie recommendations, medical diagnoses, and parole recommendations without delving into the reasons driving specific predictive decisions. Machine learning algorithms in such applications are often chosen for their superior performance; however, popular choices such as random forests and deep neural networks fail to provide an interpretable understanding of the predictive model. In recent years, rule-based algorithms have been used to address this issue. Wang et al. (2017) presented an or-of-and (disjunctive normal form) based classification technique that allows for classification rule mining of a single class in a binary classification; this method is also shown to perform comparably to other modern algorithms. In this work, we extend this idea to provide classification rules for both classes simultaneously. That is, we provide a distinct set of rules for both positive and negative classes. In describing this approach, we also present a novel and complete taxonomy of classifications that clearly capture and quantify the inherent ambiguity in noisy binary classifications in the real world. We show that this approach leads to a more granular formulation of the likelihood model and that a simulated-annealing based optimization achieves classification performance competitive with comparable techniques. We apply our method to synthetic as well as real-world data sets and compare it with other related methods to demonstrate the utility of our proposal.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv: 2008.11249

Country:

Europe (0.14)
North America > United States (0.14)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

NFL has 77 apparently false positive COVID-19 tests from lab

FOX News, Aug-24-2020, 05:46:32 GMT

NEW YORK (AP) -- The NFL had 77 positive COVID-19 tests from 11 teams re-examined by a New Jersey lab after false positives, and all those tests came back negative. The league asked the New Jersey lab BioReference to investigate the results, and those 77 tests are being re-tested once more to make sure they were false positives. Among teams reporting false positives, the Minnesota Vikings said they had 12, the New York Jets 10 and the Chicago Bears nine.

artificial intelligence, false positive, machine learning, (13 more...)

Country:

North America > United States > New York (0.77)
North America > United States > New Jersey (0.48)
North America > United States > Minnesota (0.25)
(2 more...)

Industry:

Leisure & Entertainment > Sports > Football (1.00)
Health & Medicine > Therapeutic Area (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Variable selection for Gaussian process regression through a sparse projection

Park, Chiwoo, Borth, David J., Wilson, Nicholas S., Hunter, Chad N.

arXiv.org Machine Learning, Aug-24-2020

This paper presents a new variable selection approach integrated with Gaussian process (GP) regression. We consider a sparse projection of input variables and a general stationary covariance model that depends on the Euclidean distance between the projected features. The sparse projection matrix is considered as an unknown parameter. We propose a forward stagewise approach with embedded gradient descent steps to co-optimize the parameter with other covariance parameters based on the maximization of a non-convex marginal likelihood function with a concave sparsity penalty, and some convergence properties of the algorithm are provided. The proposed model covers a broader class of stationary covariance functions than the existing automatic relevance determination approaches, and the solution approach is more computationally feasible than the existing MCMC sampling procedures for automatic relevance parameter estimation with a sparsity prior. The approach is evaluated for a large number of simulated scenarios. The choice of tuning parameters and the accuracy of the parameter estimation are evaluated in the simulation study. In comparison to some chosen benchmark approaches, the proposed approach provides better accuracy in variable selection. It is applied to the important problem of identifying environmental factors that affect the atmospheric corrosion of metal alloys.

bayesian inference, regression, upstream oil & gas, (21 more...)

arXiv: 2008.10769

Country: North America > United States (0.67)

Genre: Research Report (0.82)

Industry:

Government > Military (0.67)
Energy > Oil & Gas > Upstream (0.48)
Materials > Metals & Mining (0.48)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)
(2 more...)

Towards Stable Imbalanced Data Classification via Virtual Big Data Projection

Mansourifar, Hadi, Shi, Weidong

arXiv.org Machine Learning, Aug-23-2020

Virtual Big Data (VBD) was very recently shown to be effective at alleviating mode collapse and vanishing generator gradients, two major problems of Generative Adversarial Neural Networks (GANs). In this paper, we investigate the capability of VBD to address two other major challenges in Machine Learning: deep autoencoder training and imbalanced data classification. First, we prove that VBD can significantly decrease the validation loss of autoencoders by providing them with a huge, diversified training set, which is the key to reaching better generalization and minimizing the over-fitting problem. Second, we use VBD to propose the first projection-based method, called cross-concatenation, to balance skewed class distributions without over-sampling. We prove that cross-concatenation can solve the uncertainty problem of data-driven methods for imbalanced classification.

artificial intelligence, autoencoder, machine learning, (15 more...)

arXiv: 2009.08387

Country:

North America > United States > Texas > Harris County > Houston (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.94)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Upsampling Minority Classes in Imbalanced Text Classification Problems Using Markov Chains

#artificialintelligence, Aug-22-2020, 18:56:13 GMT

Classification problems in supervised machine learning are often troubled by the issue of imbalanced class sizes. Given binary classified data, an imbalanced stratification of the two classes will bias the predictions of a model fit to it. A model trained on data made up of 1,000 samples labeled class "0" and 100 samples labeled class "1" could naively predict class "0" for every test instance and report 90% accuracy. Such an accuracy score is deceptive, as the model is not actually "learning" any trends from the data. This can cause serious problems in deployment.
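The arithmetic behind the deceptive score can be checked directly. The sketch below assumes the test set has the same 1,000-to-100 split as the training data described above, in which case the always-predict-"0" model scores about 90.9% accuracy. It also computes balanced accuracy (the mean of per-class recall), a metric the excerpt itself does not mention, to show how the same predictions look when each class counts equally.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class carries equal weight."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

# Assumed test set mirroring the 1,000 / 100 class split in the text.
y_true = [0] * 1000 + [1] * 100
y_pred = [0] * len(y_true)  # naive model: always predict the majority class

acc = accuracy(y_true, y_pred)
bal = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.3f} balanced_accuracy={bal:.3f}")
```

Recall on class "1" is zero, so balanced accuracy falls to 0.5, the level of random guessing between two classes.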

artificial intelligence, machine learning, minority class, (15 more...)

Country: North America > United States (0.15)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.49)
