
Collaborating Authors: Subbaswamy, Adarsh


Detecting Dataset Bias in Medical AI: A Generalized and Modality-Agnostic Auditing Framework

arXiv.org Artificial Intelligence

Data-driven AI is establishing itself at the center of evidence-based medicine. However, reports of shortcomings and unexpected behavior are growing due to AI's reliance on association-based learning. A major reason for this behavior is that latent bias in machine learning datasets can be amplified during training and/or hidden during testing. We present a data modality-agnostic auditing framework for generating targeted hypotheses about sources of bias in datasets, which we refer to as Generalized Attribute Utility and Detectability-Induced bias Testing (G-AUDIT). Our method examines the relationship between task-level annotations and data properties, including protected attributes (e.g., race, age, sex) and environment and acquisition characteristics (e.g., clinical site, imaging protocols). G-AUDIT automatically quantifies the extent to which observed data attributes may enable shortcut learning or, in the case of testing data, hide predictions based on spurious associations. We demonstrate the broad applicability and value of our method by analyzing large-scale medical datasets for three distinct modalities and learning tasks: skin lesion classification in images, stigmatizing language classification in Electronic Health Records (EHR), and mortality prediction on ICU tabular data. In each setting, G-AUDIT identifies subtle biases commonly overlooked by traditional qualitative methods that focus primarily on social and ethical objectives, underscoring its practical value in exposing dataset-level risks and supporting the downstream development of reliable AI systems. Our method paves the way for a deeper understanding of machine learning datasets throughout the AI development life-cycle, from initial prototyping all the way to regulation, and creates opportunities to reduce model bias, enabling safer and more trustworthy AI systems.
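
As a rough illustration of the kind of audit the abstract describes, the sketch below scores a single metadata attribute on two axes: how predictive it is of the task label ("utility") and how recoverable it is from the raw features ("detectability"). The metric (cross-validated AUC), the function name, and the simulated data are illustrative choices, not the paper's exact G-AUDIT procedure.

```python
# Hypothetical audit sketch (illustrative metrics, not the paper's exact G-AUDIT
# scoring). For one metadata attribute we estimate:
#   utility       : how predictive the attribute alone is of the task label
#   detectability : how well the attribute can be recovered from the raw features
# Attributes that score high on both are candidate shortcut-learning risks.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def audit_attribute(features, labels, attribute):
    utility = cross_val_score(
        LogisticRegression(max_iter=1000),
        attribute.reshape(-1, 1), labels, cv=5, scoring="roc_auc").mean()
    detectability = cross_val_score(
        LogisticRegression(max_iter=1000),
        features, attribute, cv=5, scoring="roc_auc").mean()
    return {"utility": utility, "detectability": detectability}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    site = rng.integers(0, 2, size=2000)                    # e.g., clinical site
    x = rng.normal(size=(2000, 10)) + 0.8 * site[:, None]   # site leaks into the features
    y = (rng.random(2000) < 0.3 + 0.4 * site).astype(int)   # site correlates with the label
    print(pd.Series(audit_attribute(x, y, site)).round(3))
```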


A hierarchical decomposition for explaining ML performance discrepancies

arXiv.org Machine Learning

Machine learning (ML) algorithms can often differ in performance across domains. Understanding why their performance differs is crucial for determining what types of interventions (e.g., algorithmic or operational) are most effective at closing the performance gaps. Existing methods focus on aggregate decompositions of the total performance gap into the impact of a shift in the distribution of features p(X) versus the impact of a shift in the conditional distribution of the outcome p(Y|X); however, such coarse explanations offer only a few options for how one can close the performance gap. Detailed variable-level decompositions that quantify the importance of each variable to each term in the aggregate decomposition can provide a much deeper understanding and suggest much more targeted interventions. However, existing methods assume knowledge of the full causal graph or make strong parametric assumptions. We introduce a nonparametric hierarchical framework that provides both aggregate and detailed decompositions for explaining why the performance of an ML algorithm differs across domains, without requiring causal knowledge. We derive debiased, computationally efficient estimators and statistical inference procedures for asymptotically valid confidence intervals.
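
The paper's debiased, variable-level estimators are not reproduced here, but the aggregate decomposition it builds on can be illustrated with a simple plug-in estimate: reweight the source data toward the target feature distribution to isolate the part of the accuracy gap attributable to the shift in p(X), and attribute the remainder to the shift in p(Y|X). The simulation below is an invented construction for that purpose only.

```python
# Plug-in illustration of the aggregate decomposition only (not the paper's
# debiased nonparametric estimators): the source-to-target accuracy gap is split
# into a part explained by the shift in p(X), via density-ratio reweighting of
# the source data, and a remainder attributed to the shift in p(Y|X).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def sample(n, x_mean, flip_rate):
    x = rng.normal(loc=x_mean, size=(n, 2))
    y = (x[:, 0] + rng.normal(scale=0.5, size=n) > 0).astype(int)
    swap = rng.random(n) < flip_rate          # flip_rate > 0 perturbs p(Y|X)
    return x, np.where(swap, 1 - y, y)

x_src, y_src = sample(5000, 0.0, flip_rate=0.0)
x_tgt, y_tgt = sample(5000, 0.7, flip_rate=0.15)

model = LogisticRegression(max_iter=1000).fit(x_src, y_src)
acc_src, acc_tgt = model.score(x_src, y_src), model.score(x_tgt, y_tgt)

# Estimate density-ratio weights w(x) = p_tgt(x) / p_src(x) with a domain classifier.
dom = LogisticRegression(max_iter=1000).fit(
    np.vstack([x_src, x_tgt]), np.r_[np.zeros(len(x_src)), np.ones(len(x_tgt))])
p = dom.predict_proba(x_src)[:, 1]
acc_src_reweighted = np.average(model.predict(x_src) == y_src, weights=p / (1 - p))

print(f"total gap              : {acc_tgt - acc_src:+.3f}")
print(f"  from shift in p(X)   : {acc_src_reweighted - acc_src:+.3f}")
print(f"  from shift in p(Y|X) : {acc_tgt - acc_src_reweighted:+.3f}")
```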


Towards a Post-Market Monitoring Framework for Machine Learning-based Medical Devices: A case study

arXiv.org Machine Learning

After a machine learning (ML)-based system is deployed in clinical practice, performance monitoring is important to ensure the safety and effectiveness of the algorithm over time. The goal of this work is to highlight the complexity of designing a monitoring strategy and the need for a systematic framework that compares the multitude of monitoring options. One of the main decisions is choosing between using real-world (observational) versus interventional data. Although the former is the most convenient source of monitoring data, it exhibits well-known biases, such as confounding, selection, and missingness. In fact, when the ML algorithm interacts with its environment, the algorithm itself may be a primary source of bias. On the other hand, a carefully designed interventional study that randomizes individuals can explicitly eliminate such biases, but the ethics, feasibility, and cost of such an approach must be carefully considered. Beyond the decision of the data source, monitoring strategies vary in the performance criteria they track, the interpretability of the test statistics, the strength of their assumptions, and their speed at detecting performance decay. As a first step towards developing a framework that compares the various monitoring options, we consider a case study of an ML-based risk prediction algorithm for postoperative nausea and vomiting (PONV). Bringing together tools from causal inference and statistical process control, we walk through the basic steps of defining candidate monitoring criteria, describing potential sources of bias and the causal model, and specifying and comparing candidate monitoring procedures. We hypothesize that these steps can be applied more generally, as causal inference can address other sources of biases as well.
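
One concrete instance of the statistical-process-control tools the abstract mentions is a one-sided CUSUM chart on a tracked error rate. The sketch below is a toy example with invented error rates and thresholds, not the monitoring procedure developed in the case study.

```python
# Toy statistical-process-control sketch (illustrative numbers and thresholds,
# not the case study's procedure): a one-sided CUSUM statistic tracks a model's
# weekly error rate and raises an alarm once cumulative decay beyond an
# allowance k exceeds a control limit h.
import numpy as np

rng = np.random.default_rng(2)
baseline_error = 0.10                                  # error rate at deployment
weekly_error = np.r_[rng.normal(0.10, 0.01, 30),       # stable period
                     rng.normal(0.16, 0.01, 10)]       # performance decay begins

k, h = 0.01, 0.05                                      # allowance and control limit
cusum, first_alarm = 0.0, None
for week, err in enumerate(weekly_error):
    cusum = max(0.0, cusum + (err - baseline_error - k))
    if cusum > h and first_alarm is None:
        first_alarm = week

print("first alarm at week:", first_alarm)
```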


Machine Learning for Health symposium 2022 -- Extended Abstract track

arXiv.org Artificial Intelligence

A collection of the extended abstracts that were presented at the 2nd Machine Learning for Health symposium (ML4H 2022), which was held both virtually and in person on November 28, 2022, in New Orleans, Louisiana, USA. Machine Learning for Health (ML4H) is a longstanding venue for research into machine learning for health, including both theoretical and applied work. ML4H 2022 featured two submission tracks: a proceedings track, which encompassed full-length submissions of technically mature and rigorous work, and an extended abstract track, which accepted less mature but innovative research for discussion. All manuscripts submitted to the ML4H Symposium underwent a double-blind peer-review process. The extended abstracts included in this collection describe innovative machine learning research focused on relevant problems in health and biomedicine.


Evaluating Model Robustness to Dataset Shift

arXiv.org Machine Learning

The environments in which we deploy machine learning (ML) algorithms rarely look exactly like the environments in which we collected our training data. Unfortunately, we lack methodology for evaluating how well an algorithm will generalize to new environments that differ in a structured way from the training data (i.e., the case of dataset shift (Quiñonero-Candela et al., 2009)). Such methodology is increasingly important as ML systems are being deployed across a number of industries, such as health care and personal finance, in which system performance translates directly to real-world outcomes. Further, as regulation and product reviews become more common across industries, system developers will be expected to produce evidence of the validity and safety of their systems. For example, the United States Food and Drug Administration (FDA) currently regulates ML systems for medical applications, requiring evidence for the validity of such systems before approval is granted (US Food and Drug Administration, 2019). Evaluation methods for assessing model validity have typically focused on how the model performs on data from the training distribution, known as internal validity. Powerful tools, such as cross-validation and the bootstrap, rely on the assumption that the training and test data are drawn from the same distribution. However, these validation methods do not capture a model's ability to generalize to new environments, known as external validity (Campbell and Stanley, 1963). Currently, the main way to assess a model's external validity is to empirically evaluate performance on multiple, independently collected datasets (e.g.,


The Hierarchy of Stable Distributions and Operators to Trade Off Stability and Performance

arXiv.org Artificial Intelligence

Recent work addressing model reliability and generalization has resulted in a variety of methods that seek to proactively address differences between the training and unknown target environments. While most methods achieve this by finding distributions that will be invariant across environments, we show that they do not necessarily find the same distributions, which has implications for performance. In this paper, we unify existing work on prediction using stable distributions by relating environmental shifts to edges in the graph underlying a prediction problem, and we characterize stable distributions as those which effectively remove these edges. We then quantify the effect of edge deletion on performance in the linear case and corroborate the findings in simulated and real-data experiments.
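
The trade-off between stability and performance that the abstract describes can be sketched with a small linear simulation. The structural equations below are invented for illustration and are not the paper's experiments; they only show the qualitative effect of keeping versus deleting one unstable edge.

```python
# Small linear simulation (structural equations invented for illustration): the
# edge Y -> Z changes across environments. A predictor that uses Z keeps the
# unstable edge and degrades under the shift; a predictor restricted to the
# stable parent X gives up some in-domain accuracy but transfers unchanged.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)

def environment(n, beta_zy):
    x = rng.normal(size=n)
    y = 1.5 * x + rng.normal(scale=0.5, size=n)
    z = beta_zy * y + rng.normal(scale=0.5, size=n)    # mechanism that shifts
    return np.c_[x, z], y

X_tr, y_tr = environment(10000, beta_zy=2.0)
X_te, y_te = environment(10000, beta_zy=-2.0)

full = LinearRegression().fit(X_tr, y_tr)              # uses X and Z
stable = LinearRegression().fit(X_tr[:, :1], y_tr)     # uses X only (edge deleted)

mse = lambda m, X, y: float(np.mean((m.predict(X) - y) ** 2))
print("full   model MSE (train, test):", mse(full, X_tr, y_tr), mse(full, X_te, y_te))
print("stable model MSE (train, test):", mse(stable, X_tr[:, :1], y_tr), mse(stable, X_te[:, :1], y_te))
```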


Tutorial: Safe and Reliable Machine Learning

arXiv.org Artificial Intelligence

This document serves as a brief overview of the "Safe and Reliable Machine Learning" tutorial given at the 2019 ACM Conference on Fairness, Accountability, and Transparency (FAT* 2019). The talk slides can be found here: https://bit.ly/2Gfsukp, while a video of the talk is available here: https://youtu.be/FGLOCkC4KmE, and a complete list of references for the tutorial here: https://bit.ly/2GdLPme.


Learning Predictive Models That Transport

arXiv.org Artificial Intelligence

Classical supervised learning produces unreliable models when training and target distributions differ, with most existing solutions requiring samples from the target domain. We propose a proactive approach which learns a relationship in the training domain that will generalize to the target domain by incorporating prior knowledge of aspects of the data generating process that are expected to differ as expressed in a causal selection diagram. Specifically, we remove variables generated by unstable mechanisms from the joint factorization to yield the Graph Surgery Estimator---an interventional distribution that is invariant to the differences across domains. We prove that the surgery estimator finds stable relationships in strictly more scenarios than previous approaches which only consider conditional relationships, and demonstrate this in simulated experiments. We also evaluate on real world data for which the true causal diagram is unknown, performing competitively against entirely data-driven approaches.
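
A discrete toy construction, assuming a factorization p(X) p(Y|X) p(Z|Y) with p(Z|Y) treated as the unstable mechanism, conveys the flavor of deleting factors from the joint: the ordinary conditional p(Y|X, Z) relies on the unstable factor, while the surgery-style score simply drops it. This is an illustrative sketch, not the paper's estimator or experiments.

```python
# Discrete toy construction (not the paper's estimator), assuming the training
# joint factorizes as p(X) p(Y|X) p(Z|Y) with p(Z|Y) as the unstable mechanism.
# The ordinary conditional p(Y|X,Z) is proportional to p(Y|X) * p(Z|Y) and so
# relies on the unstable factor; the surgery-style score deletes that factor
# from the product and keeps only the stable p(Y|X).
import numpy as np

rng = np.random.default_rng(4)
n = 100_000
x = rng.integers(0, 2, n)
y = (rng.random(n) < np.where(x == 1, 0.8, 0.2)).astype(int)   # stable p(Y|X)
z = (rng.random(n) < np.where(y == 1, 0.9, 0.1)).astype(int)   # unstable p(Z|Y)

# Plug-in estimates of the two factors from training data.
p_y_given_x = np.array([[np.mean(y[x == i] == j) for j in (0, 1)] for i in (0, 1)])
p_z_given_y = np.array([[np.mean(z[y == j] == k) for k in (0, 1)] for j in (0, 1)])

def conditional(xi, zi):
    scores = p_y_given_x[xi] * p_z_given_y[:, zi]      # uses the unstable factor
    return scores / scores.sum()

def surgery(xi, zi):
    scores = p_y_given_x[xi]                           # unstable factor deleted
    return scores / scores.sum()

print("p(Y=1 | X=1, Z=0): conditional", round(conditional(1, 0)[1], 3),
      "vs surgery", round(surgery(1, 0)[1], 3))
```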


Counterfactual Normalization: Proactively Addressing Dataset Shift and Improving Reliability Using Causal Mechanisms

arXiv.org Machine Learning

Predictive models can fail to generalize from training to deployment environments because of dataset shift, posing a threat to model reliability and the safety of downstream decisions made in practice. Instead of using samples from the target distribution to reactively correct dataset shift, we use graphical knowledge of the causal mechanisms relating variables in a prediction problem to proactively remove relationships that do not generalize across environments, even when these relationships may depend on unobserved variables (violations of the "no unobserved confounders" assumption). To accomplish this, we identify variables with unstable paths of statistical influence and remove them from the model. We also augment the causal graph with latent counterfactual variables that isolate unstable paths of statistical influence, allowing us to retain stable paths that would otherwise be removed. Our experiments demonstrate that models that remove vulnerable variables and use estimates of the latent variables transfer better, often outperforming in the target domain despite some accuracy loss in the training domain.
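
A simplified linear example conveys the flavor of the idea. The structural equations are invented, and the practice variable T is assumed observed here, unlike the paper's more general setting with latent counterfactual variables: a vulnerable feature is replaced by an estimate of its counterfactual value under a reference practice level, removing the unstable influence while keeping the stable signal.

```python
# Simplified linear illustration (invented structural equations; T is assumed
# observed): the feature Z is driven by the stable target Y and by a practice
# variable T whose distribution shifts across sites. Replacing Z with a plug-in
# estimate of its counterfactual value under the reference level T = 0 removes
# the unstable influence while keeping the stable Y -> Z signal.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(5)

def site(n, p_treat):
    y = rng.integers(0, 2, n)
    t = (rng.random(n) < p_treat).astype(float)        # unstable practice pattern
    z = 2.0 * y - 3.0 * t + rng.normal(scale=0.5, size=n)
    return y, t, z

y_tr, t_tr, z_tr = site(20000, p_treat=0.2)
y_te, t_te, z_te = site(20000, p_treat=0.8)

# Estimate the effect of T on Z and subtract it: a plug-in counterfactual Z_{T=0}.
effect = LinearRegression().fit(t_tr.reshape(-1, 1), z_tr).coef_[0]
z_tr_cf, z_te_cf = z_tr - effect * t_tr, z_te - effect * t_te

raw = LogisticRegression().fit(z_tr.reshape(-1, 1), y_tr)
cf = LogisticRegression().fit(z_tr_cf.reshape(-1, 1), y_tr)

print(f"raw feature        acc (train/test): "
      f"{raw.score(z_tr.reshape(-1, 1), y_tr):.3f} / {raw.score(z_te.reshape(-1, 1), y_te):.3f}")
print(f"normalized feature acc (train/test): "
      f"{cf.score(z_tr_cf.reshape(-1, 1), y_tr):.3f} / {cf.score(z_te_cf.reshape(-1, 1), y_te):.3f}")
```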


Treatment-Response Models for Counterfactual Reasoning with Continuous-time, Continuous-valued Interventions

arXiv.org Artificial Intelligence

Treatment effects can be estimated from observational data as the difference in potential outcomes. In this paper, we address the challenge of estimating the potential outcome when treatment-dose levels can vary continuously over time. Further, the outcome variable may not be measured at a regular frequency. Our proposed solution represents the treatment response curves using linear time-invariant dynamical systems; this provides a flexible means for modeling response over time to highly variable dose curves. Moreover, for multivariate data, the proposed method uncovers shared structure in treatment response and the baseline across multiple markers, and flexibly models challenging correlation structure both across and within signals over time. For this, we build upon the framework of multiple-output Gaussian Processes. On simulated data and a challenging clinical dataset, we show significant gains in accuracy over state-of-the-art models.
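
The linear time-invariant response component can be sketched by convolving a continuous-valued dose curve with a first-order impulse response. The example below omits the multi-output Gaussian Process machinery and shared structure across markers, and all parameter values are invented.

```python
# Minimal sketch of the LTI response component only (the multi-output Gaussian
# Process baseline and shared structure across markers are omitted; parameter
# values are invented): the effect of a continuously varying dose on a marker is
# the convolution of the dose curve with a first-order, exponential-decay
# impulse response.
import numpy as np

dt = 0.1                                      # hours between grid points
t = np.arange(0, 24, dt)

dose = np.zeros_like(t)                       # continuous-valued, time-varying dose
dose[(t > 2) & (t < 5)] = 1.0                 # constant infusion
dose[(t > 12) & (t < 12.5)] = 4.0             # short bolus

gain, tau = 2.0, 1.5                          # first-order LTI impulse response
impulse = (gain / tau) * np.exp(-t / tau)

response = np.convolve(dose, impulse)[: len(t)] * dt
marker = 5.0 + 0.02 * t + response            # slowly drifting baseline plus response

print(f"peak marker value {marker.max():.2f} at t = {t[marker.argmax()]:.1f} h")
```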