AITopics | Accuracy

Accuracy

Fair Adversarial Gradient Tree Boosting

Grari, Vincent, Ruf, Boris, Lamprier, Sylvain, Detyniecki, Marcin

arXiv.org Artificial Intelligence, Nov-18-2019

Fair classification has become an important topic in machine learning research. While most bias mitigation strategies focus on neural networks, we noticed a lack of work on fair classifiers based on decision trees even though they have proven very efficient. In an up-to-date comparison of state-of-the-art classification algorithms in tabular data, tree boosting outperforms deep learning [1]. For this reason, we have developed a novel approach of adversarial gradient tree boosting. The objective of the algorithm is to predict the output Y with gradient tree boosting while minimizing the ability of an adversarial neural network to predict the sensitive attribute S. The approach incorporates at each iteration the gradient of the neural network directly in the gradient tree boosting. We empirically assess our approach on 4 popular data sets and compare against state-of-the-art algorithms. The results show that our algorithm achieves a higher accuracy while obtaining the same level of fairness, as measured using a set of different common fairness definitions. I. INTRODUCTION. Machine learning models are increasingly used in decision making processes. In many fields of application, they generally deliver superior performance compared with conventional, deterministic algorithms. However, those models are mostly black boxes which are hard, if not impossible, to interpret.
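The abstract above measures fairness "using a set of different common fairness definitions" without naming them here. As a minimal sketch, assuming demographic parity is one of them (an assumption, not something the abstract states), the gap between positive prediction rates across the two values of the sensitive attribute S could be computed as follows. The function name and the toy data are hypothetical.

```python
def demographic_parity_gap(y_pred, s):
    """Absolute difference in positive-prediction rate between S=1 and S=0.

    y_pred: binary predictions (0 or 1)
    s: binary sensitive attribute (0 or 1), same length as y_pred
    Assumption: demographic parity is used purely as an illustration;
    the abstract does not name its fairness definitions.
    """
    rates = {}
    for group in (0, 1):
        preds = [p for p, g in zip(y_pred, s) if g == group]
        if not preds:
            raise ValueError(f"no examples with S={group}")
        rates[group] = sum(preds) / len(preds)
    return abs(rates[1] - rates[0])


# Toy data: group S=1 is predicted positive twice out of three times,
# group S=0 once out of three times.
y_pred = [1, 0, 1, 1, 0, 0]
s = [1, 1, 1, 0, 0, 0]
print(demographic_parity_gap(y_pred, s))
```

A gap of zero would mean both groups receive positive predictions at the same rate; larger values indicate a larger disparity under this particular definition.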

algorithm, classifier, fairness, (15 more...)

arXiv.org Artificial Intelligence

1911.05369

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.93)
Banking & Finance (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)

Artificial intelligence examining ECGs may predict mortality, AF

#artificialintelligence, Nov-17-2019, 16:36:49 GMT

Deep neural networks identified potential adverse outcomes and atrial fibrillation from 12-lead ECGs that were originally interpreted as normal, according to new research presented at the American Heart Association Scientific Sessions. "Applications of machine learning and artificial intelligence techniques to problems in health care are increasingly common, but generally focus on diagnostic problems such as detecting features in an image or classifying a current diagnosis based on present features," Christopher M. Haggerty, PhD, assistant professor in the department of imaging science and innovation, and Brandon K. Fornwalt, MD, PhD, associate professor and director of the department of imaging science and innovation, both at Geisinger in Danville, Pennsylvania, told Healio. "Few studies have been able to apply machine learning to the task of predicting future events or patient outcomes. This work is among the first to demonstrate proof of concept for predicting a future patient event -- 1-year mortality -- with good performance based solely on 12-lead electrocardiography data." Sushravya M. Raghunath, PhD, math and computational scientist in the department of imaging science and innovation at Geisinger, and colleagues analyzed 1,775,926 12-lead resting ECGs of 397,840 patients from 34 years of archived medical records.

imaging science and innovation, mortality, predict mortality, (13 more...)

#artificialintelligence

Country: North America > United States > Pennsylvania (0.26)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.65)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)

6 Metrics You Need to Optimize for Performance in Machine Learning - DZone AI

#artificialintelligence, Nov-17-2019, 06:08:53 GMT

There are many metrics to measure the performance of your machine learning model depending on the type of machine learning you are looking to conduct. In this article, we take a look at performance measures for classification and regression models and discuss which is better-optimized. Sometimes the metric to look at will vary according to the problem that is initially being solved. The True Positive Rate, also called Recall, is the go-to performance measure in binary/non-binary classification problems. Most of the time -- if not all of the time -- we are only interested in correctly predicting one class.
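The true positive rate (recall) mentioned above is the fraction of actual positives that the model labels as positive, TP / (TP + FN). A minimal sketch in plain Python follows; the function name and the toy labels are illustrative and not taken from the article.

```python
def recall(y_true, y_pred, positive=1):
    """True positive rate: TP / (TP + FN) for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("y_true contains no positive examples")
    return tp / (tp + fn)


# Four actual positives, three of them found by the model.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
print(recall(y_true, y_pred))  # 0.75
```

Note that the false positive at the fifth position does not affect recall; it would lower precision instead.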

accuracy, classification problem, diabetes, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

A Configuration-Space Decomposition Scheme for Learning-based Collision Checking

Han, Yiheng, Zhao, Wang, Pan, Jia, Ye, Zipeng, Yi, Ran, Liu, Yong-Jin

arXiv.org Machine Learning, Nov-17-2019

Motion planning for robots of high degrees-of-freedom (DOFs) is an important problem in robotics, with sampling-based methods in configuration space C as one popular solution. Recently, machine learning methods have been introduced into sampling-based motion planning methods, which train a classifier to distinguish collision-free subspace from in-collision subspace in C. In this paper, we propose a novel configuration space decomposition method and show two nice properties resulting from this decomposition. Using these two properties, we build a composite classifier that works compatibly with previous machine learning methods by using them as the elementary classifiers. Experimental results are presented, showing that our composite classifier outperforms state-of-the-art single-classifier methods by a large margin. A real application of motion planning in a multi-robot system in plant phenotyping using three UR5 robotic arms is also presented. I. INTRODUCTION. Motion planning, which finds a collision-free path to move a robot from a source to a target position, plays an important role in robotics.

classifier, deep learning, upstream oil & gas, (20 more...)

arXiv.org Machine Learning

1911.08581

Country:

Asia > China (0.28)
Asia > Singapore (0.14)
Oceania > Australia (0.14)
(3 more...)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas > Upstream (0.72)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Predicting colorectal polyp recurrence using time-to-event analysis of medical records

Harrington, Lia X., Wei, Jason W., Suriawinata, Arief A., Mackenzie, Todd A., Hassanpour, Saeed

arXiv.org Machine Learning, Nov-17-2019

Identifying patient characteristics that influence the rate of colorectal polyp recurrence can provide important insights into which patients are at higher risk for recurrence. We used natural language processing to extract polyp morphological characteristics from 953 polyp-presenting patients' electronic medical records. We used subsequent colonoscopy reports to examine how the time to polyp recurrence (731 patients experienced recurrence) is influenced by these characteristics as well as anthropometric features using Kaplan-Meier curves, Cox proportional hazards modeling, and random survival forest models. We found that the rate of recurrence differed significantly by polyp size, number, and location, and by patient smoking status. Additionally, right-sided colon polyps increased recurrence risk by 30% compared to left-sided polyps. History of tobacco use increased polyp recurrence risk by 20% compared to never-users. A random survival forest model showed an AUC of 0.65 and identified several other predictive variables, which can inform development of personalized polyp surveillance plans.
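The Kaplan-Meier curves mentioned above estimate the probability of remaining recurrence-free over time while accounting for patients whose follow-up ends before any recurrence (censoring). The following is a minimal sketch of the standard product-limit estimator in plain Python. The function name and the toy durations are hypothetical and unrelated to the study's data.

```python
def kaplan_meier(durations, events):
    """Product-limit estimate of the survival function.

    durations: time to recurrence or to end of follow-up
    events: 1 if recurrence was observed, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each event time.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # recurrences at time t
        removed = 0   # everyone leaving the risk set at time t
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed > 0:
            survival *= 1 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


# Toy data: five patients, two of them censored.
for time, prob in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(time, round(prob, 3))
```

Censored patients reduce the number at risk at later times but never produce a drop in the curve themselves, which is what distinguishes this estimator from a simple proportion of recurrences.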

polyp, polyp recurrence, recurrence, (13 more...)

arXiv.org Machine Learning

1911.07368

Country:

Asia > Middle East > Lebanon (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Oregon > Washington County > Beaverton (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.94)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (0.96)
Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Overcoming Practical Issues of Deep Active Learning and its Applications on Named Entity Recognition

Chang, Haw-Shiuan, Vembu, Shankar, Mohan, Sunil, Uppaal, Rheeya, McCallum, Andrew

arXiv.org Machine Learning, Nov-17-2019

Existing deep active learning algorithms achieve impressive sampling efficiency on natural language processing tasks. However, they exhibit several weaknesses in practice, including (a) inability to use uncertainty sampling with black-box models, (b) lack of robustness to noise in labeling, and (c) lack of transparency. In response, we propose a transparent batch active sampling framework by estimating the error decay curves of multiple feature-defined subsets of the data. Experiments on four named entity recognition (NER) tasks demonstrate that the proposed methods significantly outperform diversification-based methods for black-box NER taggers and can make the sampling process more robust to labeling noise when combined with uncertainty-based methods. Furthermore, the analysis of experimental results sheds light on the weaknesses of different active sampling strategies, and when traditional uncertainty-based or diversification-based methods can be expected to work well.
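For context on the "traditional uncertainty-based" methods the abstract refers to, here is a minimal sketch of least-confidence sampling, one common form of uncertainty sampling. It illustrates the baseline idea only, not the batch framework proposed in the paper, and the function name and probabilities are hypothetical.

```python
def least_confident(prob_rows, batch_size):
    """Pick the indices of the unlabeled examples whose top predicted
    class probability is lowest, i.e. where the model is least sure.

    prob_rows: one list of class probabilities per unlabeled example
    """
    ranked = sorted(range(len(prob_rows)), key=lambda i: max(prob_rows[i]))
    return ranked[:batch_size]


probs = [
    [0.90, 0.10],  # confident
    [0.55, 0.45],  # very uncertain
    [0.70, 0.30],  # somewhat uncertain
]
print(least_confident(probs, batch_size=2))  # [1, 2]
```

This selection needs access to the model's predicted probabilities, which is why the abstract lists black-box models as a setting where uncertainty sampling cannot be applied directly.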

active learning, dataset, learning, (14 more...)

arXiv.org Machine Learning

1911.07335

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Defending Against Model Stealing Attacks with Adaptive Misinformation

Kariyappa, Sanjay, Qureshi, Moinuddin K

arXiv.org Machine Learning, Nov-16-2019

Deep Neural Networks (DNNs) are susceptible to model stealing attacks, which allow a data-limited adversary with no knowledge of the training dataset to clone the functionality of a target model, just by using black-box query access. Such attacks are typically carried out by querying the target model using inputs that are synthetically generated or sampled from a surrogate dataset to construct a labeled dataset. The adversary can use this labeled dataset to train a clone model, which achieves a classification accuracy comparable to that of the target model. We propose "Adaptive Misinformation" to defend against such model stealing attacks. We identify that all existing model stealing attacks invariably query the target model with Out-Of-Distribution (OOD) inputs. By selectively sending incorrect predictions for OOD queries, our defense substantially degrades the accuracy of the attacker's clone model (by up to 40%), while minimally impacting the accuracy (<0.5%) for benign users. Compared to existing defenses, our defense has a significantly better security vs accuracy trade-off and incurs minimal computational overhead.
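The idea of "selectively sending incorrect predictions for OOD queries" can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's implementation: it assumes the maximum class probability serves as the OOD score, uses a hypothetical threshold of 0.9, and returns the least likely class as the misleading answer. None of these choices is specified in the abstract.

```python
def defended_prediction(probs, threshold=0.9):
    """Return a class index for one query.

    probs: class probabilities from the target model for this query
    threshold: hypothetical confidence cutoff below which the query is
               treated as out-of-distribution (an assumption)
    """
    top = max(range(len(probs)), key=probs.__getitem__)
    if probs[top] >= threshold:
        # Treated as in-distribution: answer honestly.
        return top
    # Treated as OOD: return a deliberately wrong class
    # (here the least likely one, chosen only for illustration).
    return min(range(len(probs)), key=probs.__getitem__)


print(defended_prediction([0.95, 0.03, 0.02]))  # 0, honest answer
print(defended_prediction([0.50, 0.30, 0.20]))  # 2, misleading answer
```

The trade-off described in the abstract shows up in the threshold: raising it misleads more attacker queries but also risks giving wrong answers to benign users whose inputs the model is unsure about.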

accuracy, adversary, prediction, (15 more...)

arXiv.org Machine Learning

1911.071

Country: North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Media > News (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

An "outside the box" solution for imbalanced data classification

Jegierski, Hubert, Saganowski, Stanisław

arXiv.org Machine Learning, Nov-16-2019

A common problem of real-world data sets is class imbalance, which can significantly affect the classification abilities of classifiers. Numerous methods have been proposed to cope with this problem; however, even state-of-the-art methods offer a limited improvement (if any) for data sets with critically under-represented minority classes. For such problematic cases, an "outside the box" solution is required. Therefore, we propose a novel technique, called enrichment, which uses the information (observations) from the external data set(s). We present three approaches to implement the enrichment technique: (1) selecting observations randomly, (2) iteratively choosing observations that improve the classification result, (3) adding observations that help the classifier to determine the border between classes better. We then thoroughly analyze the developed solutions on ten real-world data sets to experimentally validate their usefulness. On average, our best approach improves the classification quality by 27%, and in the best case, by an outstanding 66%. We also compare our technique with the universally applicable state-of-the-art methods. We find that our technique surpasses the existing methods, performing, on average, 21% better. The advantage is especially noticeable for the smallest data sets, for which existing methods failed, while our solutions achieved the best results. Additionally, our technique applies to both the multi-class and binary classification tasks. It can also be combined with other techniques dealing with the class imbalance problem.
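The first enrichment approach listed above, selecting observations randomly from an external data set, might look roughly like the sketch below. It assumes that only external observations carrying the minority label are candidates, which the abstract does not state, and the function name, data layout, and seed are hypothetical.

```python
import random


def enrich_randomly(train, external, minority_label, k, seed=0):
    """Append up to k randomly chosen minority-class observations
    from an external data set to the training set.

    train, external: lists of (features, label) pairs
    Assumption: candidates are restricted to the minority class.
    """
    rng = random.Random(seed)
    candidates = [row for row in external if row[1] == minority_label]
    chosen = rng.sample(candidates, min(k, len(candidates)))
    return train + chosen


train = [((0.1,), 0), ((0.2,), 0), ((0.9,), 1)]
external = [((0.8,), 1), ((0.7,), 1), ((0.3,), 0), ((0.95,), 1)]
enriched = enrich_randomly(train, external, minority_label=1, k=2)
print(len(enriched))  # 5
```

The other two approaches in the abstract replace the random draw with a selection criterion (improvement in classification result, or proximity to the class border), so they would differ only in how `chosen` is computed.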

classification, external data, nan 0, (16 more...)

arXiv.org Machine Learning

1911.06965

Country: Europe > Poland > Lower Silesia Province > Wroclaw (0.04)

Genre:

Research Report > Promising Solution (1.00)
Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Causality-based Feature Selection: Methods and Evaluations

Yu, Kui, Guo, Xianjie, Liu, Lin, Li, Jiuyong, Wang, Hao, Ling, Zhaolong, Wu, Xindong

arXiv.org Artificial Intelligence, Nov-16-2019

Feature selection is a crucial preprocessing step in data analytics and machine learning. Classical feature selection algorithms select features based on the correlations between predictive features and the class variable and do not attempt to capture causal relationships between them. It has been shown that knowledge of the causal relationships between features and the class variable has potential benefits for building interpretable and robust prediction models, since causal relationships imply the underlying mechanism of a system. Consequently, causality-based feature selection has gradually attracted greater attention and many algorithms have been proposed. In this paper, we present a comprehensive review of recent advances in causality-based feature selection. To facilitate the development of new algorithms in the research area and to ease comparison between new methods and existing ones, we develop the first open-source package, called CausalFS, which consists of most of the representative causality-based feature selection algorithms (available at https://github.com/kuiy/CausalFS). Using CausalFS, we conduct extensive experiments to compare the representative algorithms with both synthetic and real-world data sets. Finally, we discuss some challenging problems to be tackled in future causality-based feature selection research.

algorithm, class variable, cpc, (16 more...)

arXiv.org Artificial Intelligence

1911.07147

Country:

Asia > China > Anhui Province > Hefei (0.04)
Oceania > Australia > South Australia (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Robust Anomaly Detection and Backdoor Attack Detection Via Differential Privacy

Du, Min, Jia, Ruoxi, Song, Dawn

arXiv.org Artificial Intelligence, Nov-16-2019

Outlier detection and novelty detection are two important topics for anomaly detection. Assuming the majority of a dataset is drawn from a certain distribution, outlier detection and novelty detection both aim to detect data samples that do not fit the distribution. Outliers refer to data samples within this dataset, while novelties refer to new samples. Meanwhile, backdoor poisoning attacks on machine learning models are achieved by injecting poisoning samples into the training dataset, which can be regarded as "outliers" intentionally added by attackers. Differential privacy has been proposed to avoid leaking any individual's information when aggregated analysis is performed on a given dataset. It is typically achieved by adding random noise, either directly to the input dataset, or to intermediate results of the aggregation mechanism. In this paper, we demonstrate that applying differential privacy can improve the utility of outlier detection and novelty detection, with an extension to detect poisoning samples in backdoor attacks. We first present a theoretical analysis on how differential privacy helps with the detection, and then conduct extensive experiments to validate the effectiveness of differential privacy in improving outlier detection, novelty detection, and backdoor attack detection.
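To illustrate what "adding random noise ... to intermediate results of the aggregation mechanism" can mean, here is a minimal sketch of the Laplace mechanism applied to a counting query. It is a generic textbook example and not the mechanism used in the paper; the function names and the toy data are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(values, predicate, epsilon, rng=None):
    """Noisy count of items satisfying predicate.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so the Laplace scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1 / epsilon, rng)


ages = [23, 35, 41, 52, 67]
print(private_count(ages, lambda a: a > 40, epsilon=1.0))
```

Smaller values of `epsilon` give stronger privacy but noisier answers; the true count here is 3, and each call returns a value scattered around it.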

detection, differential privacy, outlier, (14 more...)

arXiv.org Artificial Intelligence

1911.07116

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
