Goto

Collaborating Authors

 shift factor


Rethinking Robustness in Machine Learning: A Posterior Agreement Approach

arXiv.org Artificial Intelligence

The robustness of algorithms against covariate shifts is a fundamental problem with critical implications for the deployment of machine learning algorithms in the real world. Current evaluation methods predominantly match the robustness definition to that of standard generalization, relying on standard metrics like accuracy-based scores, which, while designed for performance assessment, lack a theoretical foundation encompassing their application in estimating robustness to distribution shifts. In this work, we set the desiderata for a robustness metric, and we propose a novel principled framework for the robustness assessment problem that directly follows the Posterior Agreement (PA) theory of model validation. Specifically, we extend the PA framework to the covariate shift setting by proposing a PA metric for robustness evaluation in supervised classification tasks. We assess the soundness of our metric in controlled environments and through an empirical robustness analysis in two different covariate shift scenarios: adversarial learning and domain generalization. We illustrate the suitability of PA by evaluating several models under different nature and magnitudes of shift, and proportion of affected observations. The results show that the PA metric provides a sensible and consistent analysis of the vulnerabilities in learning algorithms, even in the presence of few perturbed observations. Equal contribution, authors listed in alphabetical order.


Bias Reduction via End-to-End Shift Learning: Application to Citizen Science

arXiv.org Artificial Intelligence

Citizen science projects are successful at gathering rich datasets for various applications. However, the data collected by citizen scientists are often biased --- in particular, aligned more with the citizens' preferences than with scientific objectives. We propose the Shift Compensation Network (SCN), an end-to-end learning scheme which learns the shift from the scientific objectives to the biased data while compensating for the shift by re-weighting the training data. Applied to bird observational data from the citizen science project eBird, we demonstrate how SCN quantifies the data distribution shift and outperforms supervised learning models that do not address the data bias. Compared with competing models in the context of covariate shift, we further demonstrate the advantage of SCN in both its effectiveness and its capability of handling massive high-dimensional data.