AITopics | Diagnosis

Diagnosis

On the limits of cross-domain generalization in automated X-ray prediction

Cohen, Joseph Paul, Hashir, Mohammad, Brooks, Rupert, Bertrand, Hadrien

arXiv.org Machine Learning, Feb-6-2020

This large-scale study focuses on quantifying which X-ray diagnostic prediction tasks generalize well across multiple different datasets. We present evidence that the issue of generalization is not due to a shift in the images but instead a shift in the labels. We study the cross-domain performance, agreement between models, and model representations. We find interesting discrepancies between performance and agreement: models that both achieve good performance can disagree in their predictions, while models that agree can still achieve poor performance. We also test for concept similarity by regularizing a network to group tasks across multiple datasets together and observe variation across the tasks.

dataset, kaggle, vector, (17 more...)

arXiv.org Machine Learning

2002.02497

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Maryland > Montgomery County > Bethesda (0.04)
North America > United States > Indiana (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area (0.76)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.68)

Inferring Individual Level Causal Models from Graph-based Relational Time Series

Rossi, Ryan, Sarkhel, Somdeb, Ahmed, Nesreen

arXiv.org Machine Learning, Jan-23-2020

In this work, we formalize the problem of causal inference over graph-based relational time-series data where each node in the graph has one or more time series associated with it. We propose causal inference models for this problem that leverage both the graph topology and the time series to accurately estimate local causal effects of nodes. Furthermore, the relational time-series causal inference models are able to estimate local effects for individual nodes by exploiting local node-centric temporal dependencies and topological/structural dependencies. We show that simpler causal models that do not consider the graph topology are recovered as special cases of the proposed relational time-series causal inference model. We describe the conditions under which the resulting estimate can be used to estimate a causal effect, and describe how the Durbin-Wu-Hausman test of specification can be used to test for the consistency of the proposed estimator from data. Empirically, we demonstrate the effectiveness of the causal inference models on both synthetic data with known ground truth and a large-scale observational relational time-series data set collected from Wikipedia.

causal effect, causal inference model, node, (13 more...)

arXiv.org Machine Learning

2001.05993

Country:

Oceania > Australia > Victoria (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.61)

Secure and Robust Machine Learning for Healthcare: A Survey

Qayyum, Adnan, Qadir, Junaid, Bilal, Muhammad, Al-Fuqaha, Ala

arXiv.org Machine Learning, Jan-21-2020

Recent years have witnessed widespread adoption of machine learning (ML)/deep learning (DL) techniques due to their superior performance for a variety of healthcare applications, ranging from the prediction of cardiac arrest from one-dimensional heart signals to computer-aided diagnosis (CADx) using multi-dimensional medical images. Notwithstanding the impressive performance of ML/DL, there are still lingering doubts regarding the robustness of ML/DL in healthcare settings (which are traditionally considered quite challenging due to the myriad security and privacy issues involved), especially in light of recent results showing that ML/DL are vulnerable to adversarial attacks. In this paper, we present an overview of various application areas in healthcare that leverage such techniques from a security and privacy point of view and present the associated challenges. In addition, we present potential methods to ensure secure and privacy-preserving ML for healthcare applications. Finally, we provide insight into the current research challenges and promising directions for future research.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

2001.08103

Country:

North America > United States > Massachusetts (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > Bristol (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
(5 more...)

Understanding Decision Tree Classification with Scikit-Learn

#artificialintelligence, Jan-15-2020, 09:07:15 GMT

Gini impurity is named after the Italian statistician Corrado Gini. It can be understood as a criterion to minimize the probability of misclassification. To understand the definition (as shown in the figure) and exactly how we can build up a decision tree, let's get started with a very simple data set where, depending on various weather conditions, we decide whether to play an outdoor game or not. From the definition, a data set containing only one class will have a Gini impurity of 0. In building up the decision tree, our idea is to choose the feature with the least Gini impurity as the root node, and so on. In this simple data set, the decision on whether to play tennis or not depends on 4 features (Outlook, Temperature, Humidity, Wind).
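The impurity computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own code: the helper names `gini` and `weighted_gini` are hypothetical, and because the article's table is not reproduced in this excerpt, the Outlook counts below are assumed from the commonly cited 14-row play-tennis example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def weighted_gini(feature_values, labels):
    """Size-weighted average impurity of the groups induced by one feature."""
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    n = len(labels)
    return sum(len(g) / n * gini(g) for g in groups.values())

# Assumed illustrative counts: sunny 2 yes / 3 no, overcast 4 yes, rain 3 yes / 2 no.
outlook = ["sunny"] * 5 + ["overcast"] * 4 + ["rain"] * 5
play = ["yes"] * 2 + ["no"] * 3 + ["yes"] * 4 + ["yes"] * 3 + ["no"] * 2

print("impurity before splitting:", gini(play))
print("weighted impurity after splitting on Outlook:", weighted_gini(outlook, play))
```

Repeating `weighted_gini` for each candidate feature and picking the smallest value corresponds to the root-node choice the text describes; a group containing a single class contributes an impurity of 0.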

decision tree classification, gini impurity, node, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.93)

Root Cause Detection Among Anomalous Time Series Using Temporal State Alignment

Chakraborty, Sayan, Shah, Smit, Soltani, Kiumars, Swigart, Anna

arXiv.org Machine Learning, Jan-4-2020

The recent increase in the scale and complexity of software systems has introduced new challenges to the time series monitoring and anomaly detection process. A major drawback of existing anomaly detection methods is that they lack contextual information to help stakeholders identify the cause of anomalies. This problem, known as root cause detection, is particularly challenging in today's complex distributed software systems, since the metrics under consideration generally have multiple internal and external dependencies. Significant manual analysis and strong domain expertise are required to isolate the correct cause of the problem. In this paper, we propose a method that isolates the root cause of an anomaly by analyzing the patterns in time series fluctuations. Our method considers the time series as observations from an underlying process passing through a sequence of discretized hidden states. The idea is to track the propagation of the effect when a given problem causes unaligned but homogeneous shifts of the underlying states. We evaluate our approach by finding the root cause of anomalies in Zillow's clickstream data by identifying causal patterns among a set of observed fluctuations.

anomaly, detection, time series, (14 more...)

arXiv.org Machine Learning

2001.01056

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > Texas > Bexar County > San Antonio (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
(2 more...)

Knowledge-Induced Learning with Adaptive Sampling Variational Autoencoders for Open Set Fault Diagnostics

Chao, Manuel Arias, Adey, Bryan T., Fink, Olga

arXiv.org Machine Learning, Dec-28-2019

The recent increase in the availability of system condition monitoring data has led to increases in the use of data-driven approaches for fault diagnostics. The accuracy of fault detection and classification using these approaches is generally good when abundant labeled data on healthy and faulty system conditions exist and the diagnosis problem is formulated as a supervised learning task, i.e. supervised fault diagnosis. It is, however, relatively common in real situations that only a small fraction of the system condition monitoring data are labeled as healthy and the rest are unlabeled due to the uncertainty of the number and type of faults that may occur. In this case, supervised fault diagnosis performs poorly. Fault diagnosis with an unknown number and nature of faults is an open set learning problem where the knowledge of the faulty system is incomplete during training and the number and extent of the faults, of different types, can evolve during testing. In this paper, we propose to formulate the open set diagnostics problem as a semi-supervised learning problem and we demonstrate how it can be solved using a knowledge-induced learning approach with adaptive sampling variational autoencoders (KIL-AdaVAE) in combination with a one-class classifier. The fault detection and segmentation capability of the proposed method is demonstrated on a simulated case study using the Advanced Geared Turbofan 30000 (AGTF30) dynamical model under real flight conditions and induced faults of 17 fault types. The performance of the method is compared to different learning strategies (supervised learning, supervised learning with embedding, and semi-supervised learning) and deep learning algorithms. The results demonstrate that the proposed method is able to significantly outperform all other tested methods in terms of fault detection and fault segmentation.

information, representation, system condition, (16 more...)

arXiv.org Machine Learning

1912.12502

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
North America > United States > Virginia > Fairfax County > Reston (0.04)
Europe > Italy > Sardinia (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Renewable (0.67)
Health & Medicine > Consumer Health (0.46)
Education > Focused Education > Special Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(2 more...)

mRMR-DNN with Transfer Learning for Intelligent Fault Diagnosis of Rotating Machines

Singh, Vikas, Verma, Nishchal K.

arXiv.org Machine Learning, Dec-25-2019

In recent years, intelligent condition-based monitoring of rotary machinery systems has become a major research focus of machine fault diagnosis. In condition-based monitoring, it is challenging to form a large-scale, well-annotated dataset due to the expense of data acquisition and costly annotation. In addition, the generated data have a large number of redundant features, which degrade the performance of machine learning models. To overcome this, we utilize the advantages of minimum redundancy maximum relevance (mRMR) and transfer learning with a deep learning model. In this work, mRMR is combined with deep learning and deep transfer learning frameworks to improve fault diagnostics performance in terms of accuracy and computational complexity. mRMR removes redundant information from the data and increases deep learning performance, whereas transfer learning reduces the amount of data needed to train the model. In the proposed work, two frameworks, i.e., mRMR with deep learning and mRMR with deep transfer learning, are explored and validated on the CWRU and IMS rolling element bearing datasets. The analysis shows that the proposed frameworks obtain better diagnostic accuracy than existing methods and also handle data with a large number of features more quickly.
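To make the mRMR idea concrete, here is a minimal sketch of greedy selection using the "relevance minus mean redundancy" difference criterion. It is not the paper's implementation: the function names are hypothetical, the features are assumed to be already discretized so that mutual information can be estimated from counts, and the toy data are made up for illustration.

```python
from collections import Counter
from math import log2

def mutual_information(a, b):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    # p(x,y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
    return sum((c / n) * log2(c * n / (pa[x] * pb[y])) for (x, y), c in pab.items())

def mrmr(features, target, k):
    """Greedily pick k feature indices maximizing relevance - mean redundancy."""
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            relevance = mutual_information(features[j], target)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(features[j], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy
        best = max(remaining, key=score)  # ties resolve to the lowest index
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data: f1 duplicates f0, f2 is weaker but less redundant.
y = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0, 0, 0, 1, 1, 1, 1, 1]
f1 = list(f0)
f2 = [0, 0, 1, 0, 1, 1, 0, 1]

print(mrmr([f0, f1, f2], y, k=2))
```

On this toy input the duplicate feature is skipped in the second round because its redundancy with the already selected feature outweighs its relevance, which is the behavior the abstract attributes to mRMR.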

diagnosis, fault diagnosis, ieee transaction, (15 more...)

arXiv.org Machine Learning

1912.11235

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Diagnostic Medicine (0.54)
Health & Medicine > Consumer Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

r/MachineLearning - [D] Decision Tree Splitting strategy

#artificialintelligence, Dec-24-2019, 13:39:34 GMT

I have a dataset with 4 categorical features (cholesterol, systolic blood pressure, diastolic blood pressure, and smoking rate). I use a decision tree classifier to find the probability of stroke. I am trying to verify my understanding of the splitting procedure used by Python's sklearn. Since it is a binary tree, there are three possible ways to split the first feature: group categories {0 and 1} in one leaf and {2} in the other, {0 and 2} versus {1}, or {0} versus {1 and 2}. What I know (please correct me here) is that the chosen split is the one with the highest information gain.

decision tree splitting strategy, information gain, machinelearning, (1 more...)

#artificialintelligence

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Communications > Social Media (0.76)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.71)

Efficient Partial Dependence Plots with decision trees

#artificialintelligence, Dec-21-2019, 22:40:42 GMT

Partial Dependence Plots (PDPs) are a standard inspection technique for machine learning models. This post will describe both techniques, and explain why the fast way is… well, faster. We will also see that they are not always equivalent. We will briefly describe partial dependence functions. For a more thorough introduction to PDPs, you can refer to the Bible, or to the Interpretable Machine Learning book.
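As a rough illustration of what a partial dependence function computes, here is a generic brute-force version that works with any prediction function. It is only a sketch under stated assumptions: the function name, the toy model, and the data are hypothetical, and the post's own faster tree-specific method is not reproduced here.

```python
def partial_dependence(predict, X, feature, grid):
    """Brute-force partial dependence of `predict` on one feature.

    For each grid value v, overwrite column `feature` with v in every row,
    predict, and average the predictions over the rows.
    """
    curve = []
    for v in grid:
        total = 0.0
        for row in X:
            modified = list(row)   # copy so the original data is untouched
            modified[feature] = v
            total += predict(modified)
        curve.append(total / len(X))
    return curve

# Hypothetical stand-in for a fitted model: f(x) = 2 * x0 + x1.
def toy_model(row):
    return 2.0 * row[0] + row[1]

X = [[0.0, 1.0], [1.0, 3.0]]
pd_curve = partial_dependence(toy_model, X, feature=0, grid=[0.0, 1.0])
print(pd_curve)  # [2.0, 4.0]
```

Each grid point costs one full pass over the data, so the total cost is proportional to the number of grid values times the number of rows, which is why a cheaper tree-specific computation is attractive.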

fake sample, proportion, training data, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.40)

Mislabel Detection of Finnish Publication Ranks

Akusok, Anton, Saarela, Mirka, Kärkkäinen, Tommi, Björk, Kaj-Mikael, Lendasse, Amaury

arXiv.org Machine Learning, Dec-19-2019

Finland, in the spirit of Norway and Denmark, introduced a ranking system for academic publication channels (scientific journals, conference series, book publishers, etc.) called Jufo ("Julkaisufoorumi" in Finnish, "Publication Forum" in English) in 2010, together with the renewed university legislation. The ranking of a publication channel, ranging from 0 (non-peer-reviewed) to 3 (most distinguished academic publication forums), is decided by a specially nominated panel for a particular scientific discipline. These panels decide the rankings based on their academic expertise in regular meetings. Because the rankings are directly linked to the allocated funding of the universities, there has been and is a lot of discussion about the fairness and objectivity of the ranks. A versatile analysis of the 2015 Jufo rankings was done in [10]. There, by using association rule mining, decision trees, and confusion matrices with respect to Norwegian and Danish ranks, it was shown that most of the expert-based rankings could be predicted and explained with machine learning methods. Moreover, it was found that those publication channels for which the Finnish expert-based rank is higher than the estimated one are characterized by higher publication activity or a recent upgrade of the rank. Hence, the outcomes of the system, the publication ranks, need to be assessed and evaluated regularly and rigorously.

binary variable, mislabeled sample, publication channel, (12 more...)

arXiv.org Machine Learning

1912.09094

Country:

Europe > Norway (0.24)
Europe > Denmark (0.24)
North America > United States > Iowa > Johnson County > Iowa City (0.05)
(2 more...)

Genre: Research Report (1.00)

Industry: Government (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.35)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.34)
