Improving generalisation via anchor multivariate analysis

Homer Durand, Gherardo Varando, Nathan Mankovich, Gustau Camps-Valls

Mar-11-2024–arXiv.org Machine Learning

Data sources in contemporary machine learning applications are often heterogeneous, which can lead to distribution shifts [Sugiyama and Kawanabe, 2012, Shen et al., 2021]. The problem is particularly relevant in computer vision [Csurka, 2017], healthcare [Zhang et al., 2021], finance, Earth and climate sciences [Tuia et al., 2016, Kellenberger et al., 2021] and the social sciences, because variations in data patterns can significantly affect model performance and generalisation in the out-of-distribution (OOD) setting, also referred to as domain generalisation [Shen et al., 2021, Zhou et al., 2023]. Various frameworks have been proposed to formally address distribution shifts that emerge at test time [Peters et al., 2016, Arjovsky et al., 2020]. When the data distribution is entailed by a Structural Causal Model (SCM) [Peters et al., 2017], one can consider distribution shifts arising from interventions on specific variables of the SCM. Notably, Instrumental Variable (IV) regression is robust to arbitrarily strong interventions [Bowden and Turkington, 1990]. However, pursuing robustness to arbitrarily strong interventions may be overly conservative, especially when prior knowledge is available about the strength of the intervention that generates the distribution shift. Anchor Regression (AR) addresses this challenge by explicitly considering interventions on exogenous variables up to a specified strength [Rothenhäusler et al., 2018].
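
To make the role of the intervention strength concrete, the following is a minimal NumPy sketch of anchor regression as formulated by Rothenhäusler et al. [2018]: the residual is split into its projection onto the anchor variables and the orthogonal remainder, and the projected part is weighted by a parameter gamma. The function name `anchor_regression`, the toy data-generating process and the chosen gamma values are illustrative assumptions, not taken from the paper summarised here. The sketch assumes centred data and a single confounded predictor.

```python
import numpy as np

def anchor_regression(X, Y, A, gamma):
    """Hypothetical helper: least squares on anchor-transformed data.

    Minimises ||(I - P_A)(Y - X b)||^2 + gamma * ||P_A (Y - X b)||^2,
    where P_A projects onto the column space of the anchors A.
    Assumes X, Y and A have been centred.
    """
    def project(M):
        # Apply P_A to M without forming the n x n projection matrix.
        coef, *_ = np.linalg.lstsq(A, M, rcond=None)
        return A @ coef

    s = np.sqrt(gamma)
    # (I - P_A) M + sqrt(gamma) P_A M  ==  M - (1 - sqrt(gamma)) P_A M
    X_t = X - (1.0 - s) * project(X)
    Y_t = Y - (1.0 - s) * project(Y)
    b, *_ = np.linalg.lstsq(X_t, Y_t, rcond=None)
    return b

# Toy data (an assumption for illustration): anchor A shifts X,
# and a hidden confounder H affects both X and Y.
rng = np.random.default_rng(0)
n = 2000
A = rng.normal(size=(n, 1))
H = rng.normal(size=n)
X = (A[:, 0] + H + rng.normal(size=n)).reshape(-1, 1)
Y = 2.0 * X[:, 0] + H + rng.normal(size=n)
A, X, Y = A - A.mean(0), X - X.mean(0), Y - Y.mean()

b_ols = anchor_regression(X, Y, A, gamma=1.0)   # gamma = 1 recovers OLS
b_mid = anchor_regression(X, Y, A, gamma=5.0)   # moderate robustness
b_iv = anchor_regression(X, Y, A, gamma=1e8)    # large gamma approaches IV
print(b_ols, b_mid, b_iv)
```

With gamma = 1 the transformation is the identity and the estimate coincides with ordinary least squares; as gamma grows, the estimate approaches the IV solution in this just-identified setting. Intermediate values trade predictive accuracy on the training distribution for robustness to anchor-induced shifts of bounded strength.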

algorithm, regularisation, robustness
