Causal Balancing for Domain Generalization

Wang, Xinyi; Saxon, Michael; Li, Jiachen; Zhang, Hongyang; Zhang, Kun; Wang, William Yang

Feb-19-2023–arXiv.org Artificial Intelligence

While machine learning models rapidly advance the state of the art on various real-world tasks, out-of-domain (OOD) generalization remains challenging because these models are vulnerable to spurious correlations. We propose a balanced mini-batch sampling strategy that transforms a biased data distribution into a spurious-free balanced distribution, based on the invariance of the causal mechanisms underlying the data generation process. We argue that Bayes optimal classifiers trained on such a balanced distribution are minimax optimal across a sufficiently diverse environment space. We also provide an identifiability guarantee for the latent variable model of the proposed data generation process, given enough training environments. Experiments on DomainBed demonstrate empirically that our method obtains the best performance compared with 20 baselines reported on the benchmark.
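
As an illustration only, the sketch below shows one simple way a balanced mini-batch could be drawn when a discrete spurious attribute is observed for each example. That setting is an assumption made here for brevity: the abstract describes a latent variable model, and this is not the authors' implementation. The names `balanced_weights` and `sample_balanced_batch` are hypothetical. Each example is weighted by 1 / p(y | z), so that in the resampled data the label y is approximately uniform within each value of the attribute z.

```python
import random
from collections import Counter


def balanced_weights(labels, spurious):
    """Return one sampling weight per example, equal to 1 / p(y | z).

    Assumption (not from the source): `spurious` holds an observed,
    discrete spurious attribute z for every example.
    """
    joint = Counter(zip(labels, spurious))  # count(y, z)
    marg_z = Counter(spurious)              # count(z)
    # Empirical p(y | z) = count(y, z) / count(z); the weight is its inverse.
    return [marg_z[z] / joint[(y, z)] for y, z in zip(labels, spurious)]


def sample_balanced_batch(labels, spurious, batch_size, rng):
    """Draw example indices with replacement using the balancing weights."""
    weights = balanced_weights(labels, spurious)
    return rng.choices(range(len(labels)), weights=weights, k=batch_size)


if __name__ == "__main__":
    # Toy biased data: label 0 mostly co-occurs with z = 0, label 1 with z = 1.
    labels = [0] * 90 + [1] * 10 + [0] * 10 + [1] * 90
    spurious = [0] * 100 + [1] * 100
    batch = sample_balanced_batch(labels, spurious, 8, random.Random(0))
    print([(labels[i], spurious[i]) for i in batch])
```

With these weights, the total mass of each (label, attribute) cell equals the count of that attribute value, so when every label co-occurs with every attribute value the label carries no information about the attribute in the resampled batches.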

artificial intelligence, generalization, machine learning

Country:
- North America
  - United States
    - Pennsylvania > Allegheny County
      - Pittsburgh (0.04)
    - California > Santa Barbara County
      - Santa Barbara (0.04)
  - Canada > Ontario
    - Waterloo Region > Waterloo (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East
  - UAE (0.04)

Genre:
- Research Report (1.00)

Industry:
- Health & Medicine (0.92)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Uncertainty
    - Bayesian Inference (0.46)
  - Machine Learning
    - Neural Networks (0.93)
    - Learning Graphical Models (0.87)
    - Statistical Learning (0.67)
