Causal Perspective


Invariant Anomaly Detection under Distribution Shifts: A Causal Perspective

Neural Information Processing Systems

Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples by relying solely on the consistency of the normal training samples. Under a distribution shift, the assumption that training and test samples are drawn from the same distribution breaks down. In this work, by leveraging tools from causal inference, we attempt to increase the resilience of anomaly detection models to different kinds of distribution shifts. We begin by elucidating a simple yet necessary statistical property that ensures invariant representations, which is critical for robust AD under both domain and covariate shifts. From this property, we derive a regularization term which, when minimized, leads to partial distribution invariance across environments. Through extensive experimental evaluation on both synthetic and real-world tasks, covering a range of six different AD methods, we demonstrate significant improvements in out-of-distribution performance. Under both covariate and domain shift, models regularized with our proposed term show markedly increased robustness.
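The abstract does not spell out the regularization term itself. As a minimal, hypothetical sketch (not the paper's implementation), one way an environment-invariance penalty of this kind can be attached to an anomaly-detection objective is to penalize the discrepancy between representation distributions across training environments, e.g. with a kernel MMD term; the encoder, score_head, and rbf_mmd names below are assumptions made for illustration.

```python
import torch

def rbf_mmd(x, y, sigma=1.0):
    """Squared maximum mean discrepancy between two batches of representations."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def invariant_ad_loss(encoder, score_head, batches_per_env, lam=1.0):
    """Base AD objective plus a cross-environment invariance penalty on representations."""
    zs = [encoder(x) for x in batches_per_env]           # one batch per environment
    task_loss = sum(score_head(z).mean() for z in zs)     # placeholder anomaly objective
    penalty = sum(rbf_mmd(zs[i], zs[j])                    # match distributions pairwise
                  for i in range(len(zs)) for j in range(i + 1, len(zs)))
    return task_loss + lam * penalty
```

Here lam trades off the base anomaly objective against how closely representations from different environments match in distribution; the paper's actual term may differ.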


Preference Learning for AI Alignment: a Causal Perspective

Kobalczyk, Katarzyna, van der Schaar, Mihaela

arXiv.org Machine Learning

Reward modelling from preference data is a crucial step in aligning large language models (LLMs) with human values, requiring robust generalisation to novel prompt-response pairs. In this work, we propose to frame this problem in a causal paradigm, bringing the rich toolbox of causality to bear on persistent challenges such as causal misidentification, preference heterogeneity, and confounding due to user-specific factors. Drawing on the causal inference literature, we identify key assumptions necessary for reliable generalisation and contrast them with common data collection practices. We illustrate failure modes of naive reward models and demonstrate how causally-inspired approaches can improve model robustness. Finally, we outline desiderata for future research and practices, advocating targeted interventions to address inherent limitations of observational data.


Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making

Neural Information Processing Systems

As society increasingly relies on AI-based tools for decision-making in socially sensitive domains, investigating the fairness and equity of such automated systems has become a critical field of inquiry. Most of the literature in fair machine learning focuses on defining and achieving fairness criteria in the context of prediction, without explicitly focusing on how these predictions may be used later in the pipeline. For instance, if commonly used criteria, such as independence or sufficiency, are satisfied for a prediction score S used for binary classification, they need not be satisfied after a simple thresholding operation is applied to S (as is common in practice). In this paper, we take an important step to address this issue for numerous statistical and causal notions of fairness. We introduce the notion of a margin complement, which measures how much a prediction score S changes due to a thresholding operation. We then demonstrate that the marginal difference in the optimal 0/1 predictor \widehat{Y} between groups, written P(\hat{y} \mid x_1) - P(\hat{y} \mid x_0), can be causally decomposed into the influences of X on the L_2-optimal prediction score S and the influences of X on the margin complement M, along different causal pathways (direct, indirect, spurious).
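Taking the abstract's definitions at face value (an illustrative reconstruction, not the paper's own derivation): if the margin complement M records how much the score changes under thresholding, then \widehat{Y} = S + M, and the group disparity splits additively before any pathway analysis,

P(\hat{y} \mid x_1) - P(\hat{y} \mid x_0) = \big( E[S \mid x_1] - E[S \mid x_0] \big) + \big( E[M \mid x_1] - E[M \mid x_0] \big),

after which each bracketed term can be decomposed further along the direct, indirect, and spurious pathways from X.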


Rethinking Misalignment in Vision-Language Model Adaptation from a Causal Perspective

Neural Information Processing Systems

Foundational Vision-Language models such as CLIP have exhibited impressive generalization in downstream tasks. However, CLIP suffers from a two-level misalignment issue, i.e., task misalignment and data misalignment, when adapting to specific tasks. Soft prompt tuning has mitigated the task misalignment, yet the data misalignment remains a challenge. To analyze the impacts of the data misalignment, we revisit the pre-training and adaptation processes of CLIP and develop a structural causal model. We discover that while we expect to capture task-relevant information for downstream tasks accurately, the task-irrelevant knowledge impacts the prediction results and hampers the modeling of the true relationships between the images and the predicted classes.


Towards Robust Trajectory Representations: Isolating Environmental Confounders with Causal Learning

Luo, Kang, Zhu, Yuanshao, Chen, Wei, Wang, Kun, Zhou, Zhengyang, Ruan, Sijie, Liang, Yuxuan

arXiv.org Artificial Intelligence

Trajectory modeling refers to characterizing human movement behavior and is a pivotal step in understanding mobility patterns. Nevertheless, existing studies typically ignore the confounding effects of geospatial context, leading to spurious correlations and limited generalization capabilities. To bridge this gap, we first formulate a Structural Causal Model (SCM) to decipher the trajectory representation learning process from a causal perspective. Building upon the SCM, we further present a Trajectory modeling framework based on Causal Learning (TrajCL), which leverages backdoor adjustment as an intervention tool to eliminate the spurious correlations between geospatial context and trajectories. Extensive experiments on two real-world datasets verify that TrajCL markedly enhances performance on trajectory classification tasks while showcasing superior generalization and interpretability.
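The abstract names backdoor adjustment as the intervention tool without spelling it out. A minimal sketch of the adjustment over a discretized geospatial context follows; the discretization into context clusters and all names below are assumptions for illustration, not TrajCL's actual implementation.

```python
import numpy as np

def backdoor_adjusted_prediction(p_y_given_t_c, p_c):
    """Backdoor adjustment P(y | do(t)) = sum_c P(y | t, c) P(c).

    p_y_given_t_c: (num_contexts, num_classes) class probabilities for a fixed
                   trajectory t, stratified by geospatial context cluster c.
    p_c:           (num_contexts,) marginal probability of each context cluster.
    """
    return (p_y_given_t_c * p_c[:, None]).sum(axis=0)

# Toy usage: 3 context clusters, 2 trajectory classes.
p_y_given_t_c = np.array([[0.9, 0.1],
                          [0.6, 0.4],
                          [0.2, 0.8]])
p_c = np.array([0.5, 0.3, 0.2])
print(backdoor_adjusted_prediction(p_y_given_t_c, p_c))  # -> [0.67 0.33]
```

Marginalizing over the context, rather than conditioning on whatever context happened to co-occur with the trajectory, is what removes the spurious context-trajectory correlation.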


A Causal Perspective on Loan Pricing: Investigating the Impacts of Selection Bias on Identifying Bid-Response Functions

Bockel-Rickermann, Christopher, Verboven, Sam, Verdonck, Tim, Verbeke, Wouter

arXiv.org Artificial Intelligence

In lending, where prices are specific to both customers and products, having a well-functioning personalized pricing policy in place is essential to doing business effectively. Typically, such a policy must be derived from observational data, which introduces several challenges. While the problem of "endogeneity" is prominently studied in the established pricing literature, the problem of selection bias (or, more precisely, bid selection bias) is not. We take a step towards understanding the effects of selection bias by posing pricing as a problem of causal inference. Specifically, we consider the reaction of a customer to price as a treatment effect. In our experiments, we simulate varying levels of selection bias on a semi-synthetic dataset of mortgage loan applications in Belgium. We investigate the potential of parametric and nonparametric methods for the identification of individual bid-response functions. Our results illustrate how conventional methods such as logistic regression and neural networks suffer adversely from selection bias. In contrast, we implement state-of-the-art methods from causal machine learning and show their capability to overcome selection bias in pricing data.
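Neither the simulation details nor the causal estimators are given in the abstract. The sketch below is only a hedged illustration of the general framing of price as a treatment, on a toy data-generating process of my own: a naive bid-response fit on selectively observed bids versus a confounder-adjusted, selection-reweighted fit (all variable and function names are hypothetical).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 20_000

# Toy process: acceptance depends on offered price and customer risk; the offered
# price and whether a bid is recorded at all also depend on risk (bid selection bias).
risk = rng.normal(size=n)
price = 2.0 + 0.8 * risk + rng.normal(size=n)
accept = (rng.random(n) < 1 / (1 + np.exp(-(1.5 - 0.7 * price - 0.5 * risk)))).astype(int)
observed = rng.random(n) < 1 / (1 + np.exp(2.0 * risk))   # riskier customers rarely get bids

# Naive bid-response model: acceptance regressed on price alone, observed bids only.
naive = LogisticRegression().fit(price[observed].reshape(-1, 1), accept[observed])

# Causally-motivated alternative: adjust for the confounder and reweight the observed
# bids by the inverse probability of having been observed.
sel = LogisticRegression().fit(risk.reshape(-1, 1), observed.astype(int))
w = 1.0 / sel.predict_proba(risk[observed].reshape(-1, 1))[:, 1]
adjusted = LogisticRegression().fit(
    np.c_[price, risk][observed], accept[observed], sample_weight=w)

print("naive price coefficient:   ", naive.coef_[0][0])    # confounded by risk and selection
print("adjusted price coefficient:", adjusted.coef_[0][0])  # close to the true -0.7
```

The paper's actual estimators (and the Belgian mortgage data) are more involved; the point here is only that the naive model's price coefficient drifts away from the causal bid response once selection enters.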


No Fair Lunch: A Causal Perspective on Dataset Bias in Machine Learning for Medical Imaging

Jones, Charles, Castro, Daniel C., Ribeiro, Fabio De Sousa, Oktay, Ozan, McCradden, Melissa, Glocker, Ben

arXiv.org Artificial Intelligence

As machine learning methods gain prominence within clinical decision-making, addressing fairness concerns becomes increasingly urgent. Despite considerable work dedicated to detecting and ameliorating algorithmic bias, today's methods remain deficient, with potentially harmful consequences. Our causal perspective sheds new light on algorithmic bias, highlighting how different sources of dataset bias may appear indistinguishable yet require substantially different mitigation strategies. We introduce three families of causal bias mechanisms stemming from disparities in prevalence, presentation, and annotation. Our causal analysis underscores how current mitigation methods tackle only a narrow and often unrealistic subset of scenarios. We provide a practical three-step framework for reasoning about fairness in medical imaging, supporting the development of safe and equitable AI prediction models.


Causal Disentanglement with Network Information for Debiased Recommendations

Sheth, Paras, Guo, Ruocheng, Cheng, Lu, Liu, Huan, Candan, K. Selçuk

arXiv.org Machine Learning

Recommender systems aim to recommend new items to users by learning user and item representations. In practice, these representations are highly entangled, as they mix information about multiple factors: the user's interests and item attributes, along with confounding factors such as user conformity and item popularity. Inferring user preference from these entangled representations may lead to biased recommendations (e.g., when the recommender model recommends popular items even if they do not align with the user's interests). Recent research proposes to debias by modeling a recommender system from a causal perspective. The exposure and the ratings are analogous to the treatment and the outcome in the causal inference framework, respectively. The critical challenge in this setting is accounting for the hidden confounders. These confounders are unobserved, making them hard to measure. On the other hand, since these confounders affect both the exposure and the ratings, it is essential to account for them when generating debiased recommendations. To better approximate hidden confounders, we propose to leverage network information (i.e., user-social and user-item networks), which is shown to influence how users discover and interact with items. Aside from user conformity, aspects of confounding such as item popularity that are present in the network information are also captured in our method with the aid of causal disentanglement, which unravels the learned representations into independent factors responsible for (a) modeling the exposure of an item to the user, (b) predicting the ratings, and (c) controlling the hidden confounders. Experiments on real-world datasets validate the effectiveness of the proposed model for debiasing recommender systems.
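The abstract describes the disentanglement only at the level of which factor drives which quantity. As a purely illustrative sketch (the module layout, the names, and the omission of the network-information encoders are all my assumptions, not the paper's architecture), the three-way factor split might look like:

```python
import torch
import torch.nn as nn

class DisentangledRecEncoder(nn.Module):
    """Splits a joint user-item feature into three blocks intended to drive
    (a) exposure modeling, (b) rating prediction, (c) a hidden-confounder proxy."""
    def __init__(self, in_dim, block_dim):
        super().__init__()
        self.net = nn.Linear(in_dim, 3 * block_dim)
        self.block_dim = block_dim
        self.exposure_head = nn.Linear(2 * block_dim, 1)  # exposure uses blocks (a) + (c)
        self.rating_head = nn.Linear(2 * block_dim, 1)    # rating uses blocks (b) + (c)

    def forward(self, features):
        z_exp, z_rate, z_conf = self.net(features).split(self.block_dim, dim=-1)
        exposure_logit = self.exposure_head(torch.cat([z_exp, z_conf], dim=-1))
        rating_pred = self.rating_head(torch.cat([z_rate, z_conf], dim=-1))
        return exposure_logit, rating_pred, (z_exp, z_rate, z_conf)
```

A training objective in this spirit would combine exposure and rating losses with an independence penalty between the three blocks; how the user-social and user-item networks feed the confounder proxy is exactly the part this sketch leaves out.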


On Pitfalls of Identifiability in Unsupervised Learning. A Note on: "Desiderata for Representation Learning: A Causal Perspective"

Ghosh, Shubhangi, Gresele, Luigi, von Kügelgen, Julius, Besserve, Michel, Schölkopf, Bernhard

arXiv.org Machine Learning

Model identifiability is a desirable property in the context of unsupervised representation learning. In its absence, different models may be observationally indistinguishable while yielding representations that are nontrivially related to one another, making the recovery of a ground-truth generative model fundamentally impossible, as often shown through suitably constructed counterexamples. In this note, we discuss one such construction, illustrating a potential failure case of an identifiability result presented in "Desiderata for Representation Learning: A Causal Perspective" by Wang & Jordan (2021). The construction is based on the theory of nonlinear independent component analysis. We comment on the implications of this and other counterexamples for identifiable representation learning.


UC Berkeley Uses a Causal Perspective to Formalise the Desiderata for Representation Learning

#artificialintelligence

Representation learning is used to summarize essential features of high-dimensional data and turn them into lower-dimensional representations with desirable properties. A popular method for this is the heuristic approach of fitting a neural network that maps from the high-dimensional data to a set of labels and taking the top layer of the network as the representation of the inputs. However, such heuristic approaches often end up capturing spurious features that do not transfer well, or finding entangled dimensions that are uninterpretable. And while non-spuriousness and disentanglement are natural desiderata of representations, they are difficult to evaluate and optimize over algorithmically. To address this issue, a new study by UC Berkeley researchers Yixin Wang and Michael I. Jordan takes a causal perspective on representation learning, which enables the formalization of the non-spuriousness, efficiency, and disentanglement desiderata using causal notions.