Diagnosis
r/MachineLearning - [R] Deep Structural Causal Models for Tractable Counterfactual Inference
Abstract: We formulate a general framework for building structural causal models (SCMs) with deep learning components. The proposed approach employs normalising flows and variational inference to enable tractable inference of exogenous noise variables - a crucial step for counterfactual inference that is missing from existing deep causal learning methods. Our framework is validated on a synthetic dataset built on MNIST as well as on a real-world medical dataset of brain MRI scans. Our experimental results indicate that we can successfully train deep SCMs that are capable of all three levels of Pearl's ladder of causation: association, intervention, and counterfactuals, giving rise to a powerful new approach for answering causal questions in imaging applications and beyond. The code for all our experiments is available at this https URL.
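The abstract's point that inferring exogenous noise is the crucial step for counterfactuals can be illustrated with Pearl's three-step procedure (abduction, action, prediction). The following is a minimal sketch on a toy, hand-written invertible mechanism, not the paper's normalising-flow model; the mechanism Y := 2*X + U_y and all function names are assumptions made for illustration.

```python
# Toy structural causal model (hypothetical): Y := 2*X + U_y.
# The mechanism is invertible in U_y, so the noise can be recovered exactly.

def f_y(x, u_y):
    """Structural assignment for Y given its parent X and noise U_y."""
    return 2.0 * x + u_y


def abduct_u_y(x, y):
    """Abduction: invert the mechanism to infer the exogenous noise."""
    return y - 2.0 * x


def counterfactual_y(x_obs, y_obs, x_new):
    """Answer: what would Y have been had X been x_new, given what we observed?"""
    u_y = abduct_u_y(x_obs, y_obs)  # 1. abduction: infer noise from the observation
    # 2. action: replace X by the intervened value, do(X = x_new)
    return f_y(x_new, u_y)          # 3. prediction: push the same noise through


cf = counterfactual_y(x_obs=1.0, y_obs=2.5, x_new=3.0)
```

In a deep SCM the hand-written inverse would be replaced by a learned invertible or amortised inference component, which is where the tractability question raised in the abstract arises.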
Learning Causal Models Online
Javed, Khurram, White, Martha, Bengio, Yoshua
Predictive models -- learned from observational data not covering the complete data distribution -- can rely on spurious correlations in the data for making predictions. These correlations make the models brittle and hinder generalization. One solution for achieving strong generalization is to incorporate causal structures in the models; such structures constrain learning by ignoring correlations that contradict them. However, learning these structures is a hard problem in itself. Moreover, it's not clear how to incorporate the machinery of causality with online continual learning. In this work, we take an indirect approach to discovering causal models. Instead of searching for the true causal model directly, we propose an online algorithm that continually detects and removes spurious features. Our algorithm works on the idea that the correlation of a spurious feature with a target is not constant over time. As a result, the weight associated with that feature is constantly changing. We show that by continually removing such features, our method converges to solutions that have strong generalization. Moreover, our method combined with random search can also discover non-spurious features from raw sensory data. Finally, our work highlights that the information present in the temporal structure of the problem -- destroyed by shuffling the data -- is essential for detecting spurious features online.
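The idea that a spurious feature's weight keeps changing can be sketched with a toy online regression. This is not the authors' algorithm: it only flags features rather than removing them, and the data generator, step size, block length, and variance threshold are all assumptions chosen for illustration. Feature 0 has a stable relation to the target, while feature 1 is correlated with the target with a sign that flips every block of steps.

```python
import random


def run_online(steps=4000, block=500, lr=0.02, seed=0):
    """Run online least-mean-squares regression; return each weight's variance over time."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    mean = [0.0, 0.0]  # running mean of each weight (Welford's method)
    m2 = [0.0, 0.0]    # running sum of squared deviations of each weight
    for t in range(1, steps + 1):
        # The spurious correlation changes sign every `block` steps.
        sign = 1.0 if ((t - 1) // block) % 2 == 0 else -1.0
        x0 = rng.gauss(0.0, 1.0)              # stable feature
        y = x0 + rng.gauss(0.0, 1.0)          # target
        x1 = sign * y + rng.gauss(0.0, 1.0)   # spurious feature
        x = (x0, x1)
        err = y - (w[0] * x[0] + w[1] * x[1])
        for j in range(2):
            w[j] += lr * err * x[j]           # stochastic gradient step
            delta = w[j] - mean[j]
            mean[j] += delta / t
            m2[j] += delta * (w[j] - mean[j])
    return [m / steps for m in m2]


def flag_spurious(variances, threshold=0.1):
    """Flag features whose weight varied more than a (hypothetical) threshold."""
    return [j for j, v in enumerate(variances) if v > threshold]


variances = run_online()
spurious = flag_spurious(variances)
```

Shuffling the stream would average the sign flips away and leave the weight of feature 1 near zero and nearly constant, which is consistent with the abstract's remark that temporal structure is essential for detection.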
Do All Good Actors Look The Same? Exploring News Veracity Detection Across The U.S. and The U.K
Horne, Benjamin D., Gruppi, Maurício, Adalı, Sibel
A major concern with text-based news veracity detection methods is that they may not generalize across countries and cultures. In this short paper, we explicitly test news veracity models across news data from the United States and the United Kingdom, demonstrating that there is reason for concern about generalizability. Through a series of testing scenarios, we show that text-based classifiers perform poorly when trained on one country's news data and tested on another. Furthermore, these same models have trouble classifying unseen, unreliable news sources. In conclusion, we discuss implications of these results and avenues for future work.
An Analysis of the Adaptation Speed of Causal Models
Priol, Rémi Le, Harikandeh, Reza Babanezhad, Bengio, Yoshua, Lacoste-Julien, Simon
We consider the problem of discovering the causal process that generated a collection of datasets. We assume that all these datasets were generated by unknown sparse interventions on a structural causal model (SCM) $G$, that we want to identify. Recently, Bengio et al. (2020) argued that among all SCMs, $G$ is the fastest to adapt from one dataset to another, and proposed a meta-learning criterion to identify the causal direction in a two-variable SCM. While the experiments were promising, the theoretical justification was incomplete. Our contribution is a theoretical investigation of the adaptation speed of simple two-variable SCMs. We use convergence rates from stochastic optimization to justify that a relevant proxy for adaptation speed is distance in parameter space after intervention. Using this proxy, we show that the SCM with the correct causal direction is advantaged for categorical and normal cause-effect datasets when the intervention is on the cause variable. When the intervention is on the effect variable, we provide a more nuanced picture which highlights that the fastest-to-adapt heuristic is not always valid. Code to reproduce experiments is available at https://github.com/remilepriol/causal-adaptation-speed
Adversarial Robustness Toolbox v1.2 releases: crafting and analysis of attacks and defense methods for machine learning models • Penetration Testing
Adversarial Robustness 360 Toolbox (ART) is a Python library that supports developers and researchers in defending Machine Learning models (Deep Neural Networks, Gradient Boosted Decision Trees, Support Vector Machines, Random Forests, Logistic Regression, Gaussian Processes, Decision Trees, Scikit-learn Pipelines, etc.) against adversarial threats and helps make AI systems more secure and trustworthy. Machine Learning models are vulnerable to adversarial examples, which are inputs (images, texts, tabular data, etc.) deliberately modified to produce a desired response by the Machine Learning model. ART provides the tools to build and deploy defenses and test them with adversarial attacks. Defending Machine Learning models involves certifying and verifying model robustness and model hardening with approaches such as pre-processing inputs, augmenting training data with adversarial samples, and leveraging runtime detection methods to flag any inputs that might have been modified by an adversary. The attacks implemented in ART make it possible to craft adversarial examples against Machine Learning models, which is required to test defenses against state-of-the-art threat models.
Machine Learning Diagnostic Algorithm Company Dascena Closes $50 Million IAM Network
Dascena, a machine learning diagnostic algorithm company that is targeting early disease intervention to improve patient care outcomes, announced it raised $50 million in Series B funding led by Frazier Healthcare Partners with participation from Longitude Capital, existing investor Euclidean Capital, and an undisclosed investor. This round of funding will enable Dascena to advance a suite of machine learning algorithms to inform patient care strategies and improve outcomes. Dascena's algorithms have been validated through eighteen peer-reviewed publications in several studies funded by the National Institutes of Health and the National Science Foundation. According to a randomized controlled trial of hospitalized patients in the intensive care unit (ICU), Dascena's InSight algorithm resulted in a 58% reduction in patient mortality and a 21% reduction in length of hospital stay. Data from this prospective study of InSight were published in BMJ Open Respiratory Research in 2017.
Machine Learning: An Introduction to Decision Trees
Machine Learning for trading is the new buzzword today, and some tech companies are doing remarkable things with it. Today, we're going to show you how you can predict stock movements (either up or down) with the help of 'Decision Trees', one of the most commonly used ML algorithms. Decision trees in Machine Learning are used for building classification and regression models for data mining and trading. A decision tree algorithm performs a set of recursive actions before it arrives at the end result, and when you plot these actions on a screen, the visual looks like a big tree, hence the name 'Decision Tree'. Basically, a decision tree is a flowchart to help you make decisions.
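The recursive, flowchart-like behaviour described above can be sketched in plain Python. This is a minimal illustration rather than a production model: the Gini-based splitting rule is one common choice, and the feature names (previous-day return, volume change) and the six data rows are invented for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return the (feature index, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (j, t), score
    return best


def build_tree(rows, labels, depth=0, max_depth=3):
    """Recursively split the data; a leaf is the majority label, a node is a tuple."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))


def predict(tree, row):
    """Follow the flowchart from the root until a leaf label is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree


# Hypothetical rows: (previous-day return, volume change) -> next-day direction.
rows = [(-0.02, 0.1), (-0.01, 0.3), (0.01, 0.2), (0.03, -0.1), (0.02, 0.4), (-0.03, -0.2)]
labels = ["down", "down", "up", "up", "up", "down"]
tree = build_tree(rows, labels)
```

Calling predict(tree, (0.015, 0.0)) walks the tree one question at a time, which is the flowchart behaviour the article describes. Real market data is far noisier than this toy set, so a tree this clean should not be expected in practice.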
A Data Set of 255,000 Randomly Selected and Manually Classified Extracted Ion Chromatograms for Evaluation of Peak Detection Methods
Non-targeted mass spectrometry (MS) has become an important method in recent years in the fields of metabolomics and environmental research. While more and more algorithms and workflows are becoming available to process large numbers of data sets in a non-targeted manner, few manually evaluated universal test data sets exist for refining and evaluating these methods. Peak detection (and its refinement), the first step of non-targeted screening, is arguably also its most important step. However, the absence of a model data set makes it harder for researchers to evaluate peak detection methods. In this Data Descriptor, we provide a manually checked data set consisting of 255,000 EICs (5000 peaks randomly sampled from across 51 samples) for the evaluation of peak detection and gap filling algorithms.
Japan lists 10,000 clinics offering online diagnoses for new patients
The health ministry has unveiled a list of more than 10,000 medical clinics accepting new patients for online diagnoses in an effort to curb the spread of the novel coronavirus among doctors and patients. In online meetings with patients, doctors provide recommendations and diagnoses remotely through technology such as smartphones. The method is said to be effective in protecting the medical system from the dangers of increased infections inside health facilities. The ministry said Friday that it will update the list of clinics providing telemedicine for first-time patients as it receives reports from local governments across the country. Amid the coronavirus pandemic, the ministry has modified its stance that the first consultation with each patient should be conducted face-to-face.