Silva, Ricardo
Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models
Li, Kaican, Xie, Weiyan, Huang, Yongxiang, Deng, Didan, Hong, Lanqing, Li, Zhenguo, Silva, Ricardo, Zhang, Nevin L.
Fine-tuning foundation models often compromises their robustness to distribution shifts. To remedy this, most robust fine-tuning methods aim to preserve the pre-trained features. However, not all pre-trained features are robust, and these methods are largely indifferent to which ones they preserve. We propose dual risk minimization (DRM), which combines empirical risk minimization with worst-case risk minimization, to better preserve the core features of downstream tasks. In particular, we utilize core-feature descriptions generated by LLMs to induce core-based zero-shot predictions, which then serve as proxies to estimate the worst-case risk. DRM balances two crucial aspects of model robustness, expected performance and worst-case performance, establishing a new state of the art on various real-world benchmarks. DRM significantly improves the out-of-distribution performance of CLIP ViT-L/14@336 on ImageNet (75.9 to 77.1), WILDS-iWildCam (47.1 to 51.8), and WILDS-FMoW (50.7 to 53.1), opening up new avenues for robust fine-tuning. Our code is available at https://github.com/vaynexie/DRM.
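To make the objective concrete, here is a minimal sketch of a dual-risk loss in PyTorch. The exact loss form, the KL term against the core-based zero-shot predictions, and the weight lam are simplifying assumptions for illustration, not the released implementation (see the repository linked above for that).

    import torch.nn.functional as F

    def dual_risk_loss(logits, labels, core_zero_shot_logits, lam=0.5):
        # Empirical risk: standard cross-entropy on ground-truth labels.
        erm = F.cross_entropy(logits, labels)
        # Worst-case risk proxy: discrepancy from the core-feature-based
        # zero-shot predictions induced by LLM-generated core descriptions.
        core_targets = core_zero_shot_logits.softmax(dim=-1)
        worst_case = F.kl_div(logits.log_softmax(dim=-1), core_targets,
                              reduction="batchmean")
        # Dual risk: trade off expected and worst-case performance.
        return (1 - lam) * erm + lam * worst_case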
Fine-Tuning Pre-trained Language Models for Robust Causal Representation Learning
Yu, Jialin, Zhou, Yuxiang, He, Yulan, Zhang, Nevin L., Silva, Ricardo
The fine-tuning of pre-trained language models (PLMs) has been shown to be effective across various domains. By using domain-specific supervised data, the general-purpose representation derived from PLMs can be transformed into a domain-specific representation. However, these methods often fail to generalize to out-of-domain (OOD) data due to their reliance on non-causal representations, often described as spurious features. Existing methods either rely on adjustments requiring strong assumptions about the absence of hidden common causes, or mitigate the effect of spurious features using multi-domain data. In this work, we investigate how fine-tuned pre-trained language models can aid generalization from a single domain under mild assumptions, targeting the more general and practical scenarios found in real-world applications. We show that a robust representation can be derived through a so-called causal front-door adjustment, based on a decomposition assumption, using fine-tuned representations as a source of data augmentation. Comprehensive experiments in both synthetic and real-world settings demonstrate the superior generalizability of the proposed method compared to existing approaches. Our work thus sheds light on the domain generalization problem by introducing links between fine-tuning and causal mechanisms into representation learning.
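For reference, the causal front-door adjustment invoked here identifies the effect of an input $X$ on an outcome $Y$ through a mediator $M$, even under hidden common causes of $X$ and $Y$, provided $M$ is shielded from that confounding:

\[ P(y \mid do(x)) \;=\; \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x'). \]

In the proposed method, a decomposition of the learned representation plays the role of the mediator, with fine-tuned representations supplying the augmentation data used to estimate the terms above; the precise decomposition assumption is given in the paper.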
Structured Learning of Compositional Sequential Interventions
Yu, Jialin, Koukorinis, Andreas, Colombo, Nicolò, Zhu, Yuchen, Silva, Ricardo
We consider sequential treatment regimes where each unit is exposed to combinations of interventions over time. When interventions are described by qualitative labels, such as "close schools for a month due to a pandemic" or "promote this podcast to this user during this week", it is unclear which structural assumptions allow us to generalize behavioral predictions to previously unseen combinatorial sequences. Standard black-box approaches mapping sequences of categorical variables to outputs are applicable, but they rely on poorly understood assumptions about when reliable generalization can be obtained, and may underperform under sparse sequences, temporal variability, and large action spaces. To address this, we pose an explicit model for \emph{composition}, that is, how the effect of sequential interventions can be isolated into modules, clarifying which data conditions allow for the identification of their combined effect at different units and time steps. We show the identification properties of our compositional model, which is inspired by advances in causal matrix factorization methods but focuses on predictive models for novel compositions of interventions rather than on matrix completion tasks and causal effect estimation. We compare our approach to flexible but generic black-box models to illustrate how structure aids prediction in sparse data conditions.
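As a toy illustration of isolating effects into modules, one can assign each intervention label its own module and compose the modules along a unit's sequence; the low-rank linear modules below are purely an assumption for illustration, not the paper's model, which also spells out when such compositions are identifiable.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 8  # latent state dimension
    # One module per qualitative intervention label (illustrative only).
    modules = {a: rng.normal(size=(d, d)) / np.sqrt(d)
               for a in ["close_schools", "promote_podcast", "no_op"]}
    readout = rng.normal(size=d)

    def predict(unit_state, sequence):
        # Compose the modules of the received interventions in temporal order.
        h = unit_state
        for action in sequence:
            h = modules[action] @ h
        return float(readout @ h)

    u = rng.normal(size=d)
    print(predict(u, ["no_op", "close_schools"]))            # seen combination
    print(predict(u, ["close_schools", "promote_podcast"]))  # novel composition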
Bounding Causal Effects with Leaky Instruments
Watson, David S., Penn, Jordan, Gunderson, Lee M., Bravo-Hermsdorff, Gecia, Mastouri, Afsaneh, Silva, Ricardo
Instrumental variables (IVs) are a popular and powerful tool for estimating causal effects in the presence of unobserved confounding. However, classical approaches rely on strong assumptions such as the $\textit{exclusion criterion}$, which states that instrumental effects must be entirely mediated by treatments. This assumption often fails in practice. When IV methods are improperly applied to data that do not meet the exclusion criterion, estimated causal effects may be badly biased. In this work, we propose a novel solution that provides $\textit{partial}$ identification in linear systems given a set of $\textit{leaky instruments}$, which are allowed to violate the exclusion criterion to some limited degree. We derive a convex optimization objective that provides provably sharp bounds on the average treatment effect under some common forms of information leakage, and implement inference procedures to quantify the uncertainty of resulting estimates. We demonstrate our method in a set of experiments with simulated data, where it performs favorably against the state of the art. An accompanying $\texttt{R}$ package, $\texttt{leakyIV}$, is available from $\texttt{CRAN}$.
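In the simplest scalar linear case, the shape of the resulting bounds can be seen directly; this is a stylized special case, whereas the paper's convex program handles more general leakage structures. With $Y = \beta X + \gamma Z + \varepsilon$, $\mathrm{Cov}(Z, \varepsilon) = 0$, and leakage budget $|\gamma| \le \tau$, the moment condition $\mathrm{Cov}(Z,Y) = \beta\,\mathrm{Cov}(Z,X) + \gamma\,\mathrm{Var}(Z)$ yields, for $\mathrm{Cov}(Z,X) > 0$,

\[ \beta \;\in\; \left[ \frac{\mathrm{Cov}(Z,Y) - \tau\,\mathrm{Var}(Z)}{\mathrm{Cov}(Z,X)}, \; \frac{\mathrm{Cov}(Z,Y) + \tau\,\mathrm{Var}(Z)}{\mathrm{Cov}(Z,X)} \right], \]

with $\tau = 0$ recovering the classical point-identified IV estimand.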
Counterfactual Fairness Is Not Demographic Parity, and Other Observations
Silva, Ricardo
This manuscript is motivated by [21], published at AAAI 2023. That paper is based on some misunderstandings about counterfactual fairness, leading to the incorrect conclusion that counterfactual fairness and demographic parity are equivalent. Emphatically, the goal of this manuscript is not to criticize any particular paper. Instead, it follows from the fact that none of the AAAI reviewers were able to help the authors of [21]. This suggests to me that it would be helpful to provide some notes on possible common missteps in understanding counterfactual fairness, using [21] just as a springboard for broader comments. In a nutshell, counterfactual fairness is a notion of individual fairness that lies on Rung 3 of Pearl's ladder of causality [19], while demographic parity is a non-causal notion of group fairness occurring at the purely probabilistic Rung 1. We illustrate that there are scenarios of strong causal assumptions where the two notions "coincide", in an inconsequential equivalence that relies on narrow Rung 3 conditions.
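For concreteness, counterfactual fairness requires, for all outcome values $y$, evidence $(x, a)$, and counterfactual levels $a'$,

\[ P\big(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a\big) \;=\; P\big(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a\big), \]

a Rung 3 statement about counterfactual predictions for the same individual, whereas demographic parity requires only the purely observational Rung 1 condition

\[ P(\hat{Y} = y \mid A = a) \;=\; P(\hat{Y} = y \mid A = a'). \]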
Intervention Generalization: A View from Factor Graph Models
Bravo-Hermsdorff, Gecia, Watson, David S., Yu, Jialin, Zeitler, Jakob, Silva, Ricardo
One of the goals of causal inference is to generalize from past experiments and observational data to novel conditions. While it is in principle possible to eventually learn a mapping from a novel experimental condition to an outcome of interest, provided a sufficient variety of experiments is available in the training data, coping with a large combinatorial space of possible interventions is hard. Under a typical sparse experimental design, this mapping is ill-posed without relying on heavy regularization or prior distributions. Such assumptions may or may not be reliable, and can be hard to defend or test. In this paper, we take a close look at how to warrant a leap from past experiments to novel conditions based on minimal assumptions about the factorization of the distribution of the manipulated system, communicated in the well-understood language of factor graph models. A postulated $\textit{interventional factor model}$ (IFM) may not always be informative, but it conveniently abstracts away a need for explicitly modeling unmeasured confounding and feedback mechanisms, leading to directly testable claims. Given an IFM and datasets from a collection of experimental regimes, we derive conditions for identifiability of the expected outcomes of new regimes never observed in these training data. We implement our framework using several efficient algorithms, and apply them on a range of semi-synthetic experiments.
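Schematically, an IFM posits that under any regime $\sigma$ the distribution of the manipulated system factorizes as

\[ p_\sigma(x) \;\propto\; \prod_{j=1}^{m} f_j\big(x_{S_j};\, \sigma_{I_j}\big), \]

where each factor $f_j$ touches only a subset $x_{S_j}$ of the variables and depends only on the intervention components $\sigma_{I_j}$ that act on it. This is a schematic rendering; the paper states the precise conditions under which expected outcomes of unseen regimes are determined by factors estimated across the training regimes.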
Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases
Lynch, Aengus, Dovonon, Gbètondji J-S, Kaddour, Jean, Silva, Ricardo
The problem of spurious correlations (SCs) arises when a classifier relies on non-predictive features that happen to be correlated with the labels in the training data. For example, a classifier may misclassify dog breeds based on the background of dog images if backgrounds are correlated with certain breeds in the training data, leading to misclassifications at test time. Previous SC benchmark datasets suffer from varying issues, e.g., over-saturation or only containing one-to-one (O2O) SCs, but no many-to-many (M2M) SCs arising between groups of spurious attributes and classes. In this paper, we present Spawrious-{O2O, M2M}-{Easy, Medium, Hard}, an image classification benchmark suite containing spurious correlations between classes and backgrounds. To create this dataset, we employ a text-to-image model to generate photo-realistic images and an image captioning model to filter out unsuitable ones. The resulting dataset is of high quality and contains approximately 152k images. Our experimental results demonstrate that state-of-the-art group robustness methods struggle with Spawrious, most notably on the Hard-splits, with none of them getting over $70\%$ accuracy on the hardest split using a ResNet50 pretrained on ImageNet. By examining model misclassifications, we detect reliances on spurious backgrounds, demonstrating that our dataset provides a significant challenge.
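As a sketch of the kind of evaluation the benchmark supports, the worst-group accuracy over (class, background) cells drops sharply when a classifier leans on spurious backgrounds; the array names and grouping scheme below are illustrative assumptions, not the benchmark's official interface.

    import numpy as np

    def worst_group_accuracy(preds, labels, backgrounds):
        # Partition examples into (class, background) groups and report
        # the accuracy of the worst one, a standard robustness metric.
        accs = []
        for y, b in set(zip(labels.tolist(), backgrounds.tolist())):
            mask = (labels == y) & (backgrounds == b)
            accs.append(float((preds[mask] == labels[mask]).mean()))
        return min(accs)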
Stochastic Causal Programming for Bounding Treatment Effects
Padh, Kirtan, Zeitler, Jakob, Watson, David, Kusner, Matt, Silva, Ricardo, Kilbertus, Niki
Causal effect estimation is important for many tasks in the natural and social sciences. We design algorithms for the continuous partial identification problem: bounding the effects of multivariate, continuous treatments when unmeasured confounding makes point identification impossible. Specifically, we cast causal effects as objective functions within a constrained optimization problem, and minimize/maximize these functions to obtain bounds. We combine flexible learning algorithms with Monte Carlo methods to implement a family of solutions under the name of stochastic causal programming. In particular, we show how the generic framework can be efficiently formulated in settings where auxiliary variables are clustered into pre-treatment and post-treatment sets, and where no fine-grained causal graph can be easily specified. In these settings, we can avoid the need to fully specify the distribution family of hidden common causes. Monte Carlo computation is also much simplified, leading to algorithms that are more computationally stable than alternatives.
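Schematically, the programs solved here take the form

\[ \min_{\theta \in \Theta} \big/ \max_{\theta \in \Theta} \;\; \mathbb{E}_{\theta}\big[Y \mid do(X = x)\big] \quad \text{s.t.} \quad p_\theta \text{ is consistent with the observed data distribution}, \]

a high-level rendering in which $\Theta$ ranges over structural models compatible with the assumed causal structure; the min and max give the lower and upper bounds, and both the objective and the constraints are approximated by Monte Carlo with flexible learners.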
Proximal Causal Learning with Kernels: Two-Stage Estimation and Moment Restriction
Mastouri, Afsaneh, Zhu, Yuchen, Gultchin, Limor, Korba, Anna, Silva, Ricardo, Kusner, Matt J., Gretton, Arthur, Muandet, Krikamol
We address the problem of causal effect estimation in the presence of unobserved confounding, but where proxies for the latent confounder(s) are observed. We propose two kernel-based methods for nonlinear causal effect estimation in this setting: (a) a two-stage regression approach, and (b) a maximum moment restriction approach. We focus on the proximal causal learning setting, but our methods can be used to solve a wider class of inverse problems characterised by a Fredholm integral equation. In particular, we provide a unifying view of two-stage and moment restriction approaches for solving this problem in a nonlinear setting. We provide consistency guarantees for each algorithm, and we demonstrate that these approaches achieve competitive results on synthetic data and data simulating a real-world task. Notably, our approach outperforms earlier methods that are not suited to leveraging proxy variables.
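The Fredholm integral equation in question takes the following form in the proximal setting, with treatment $A$, treatment-side proxy $Z$, and outcome-side proxy $W$: find a bridge function $h$ satisfying

\[ \mathbb{E}[Y \mid A = a, Z = z] \;=\; \int h(a, w)\, dP(w \mid a, z), \]

after which the causal dose-response is recovered as $\mathbb{E}[Y \mid do(a)] = \int h(a, w)\, dP(w)$. The two proposed methods differ in how they solve this inverse problem: stage-wise kernel ridge regressions versus a single maximum-moment-restriction objective.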
Pragmatic Fairness: Developing Policies with Outcome Disparity Control
Gultchin, Limor, Guo, Siyuan, Malek, Alan, Chiappa, Silvia, Silva, Ricardo
We introduce a causal framework for designing optimal policies that satisfy fairness constraints. We take a pragmatic approach, asking what we can do with the action space available to us and with access only to historical data. We propose two different fairness constraints: a moderation breaking constraint, which aims to block moderation paths from the action and sensitive attribute to the outcome, thereby reducing disparity in outcome levels as much as the given action space permits; and an equal benefit constraint, which aims to distribute the gains from the new, optimized policy equally across sensitive attribute levels, thus keeping pre-existing preferential treatment in place or avoiding the introduction of new disparity. We introduce practical methods for implementing the constraints and illustrate their use in experiments with semi-synthetic models.
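One plausible rendering of the equal benefit constraint, given only as an illustration since the paper's exact formulation may differ: with historical outcomes $Y$ and a candidate policy $\pi$,

\[ \max_{\pi} \; \mathbb{E}\big[Y \mid do(\pi)\big] \quad \text{s.t.} \quad \mathbb{E}\big[Y \mid do(\pi), A = a\big] - \mathbb{E}\big[Y \mid A = a\big] \;\text{ equal across all levels } a, \]

i.e., each sensitive-attribute group gains the same amount from moving to the optimized policy.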