- North America > United States > District of Columbia (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Hawaii (0.04)
- Research Report > Experimental Study (0.94)
- Research Report > New Finding (0.93)
- Questionnaire & Opinion Survey (0.93)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
- Information Technology > Data Science (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- North America > United States > District of Columbia (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > North Carolina (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (0.67)
- Law (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Public Health (1.00)
Benchmarking Distribution Shift in Tabular Data with TableShift
Josh Gardner, Zoran Popovic, Ludwig Schmidt
Robustness to distribution shift has become a growing concern for text and image models as they move from research subjects to real-world deployment. However, high-quality benchmarks for distribution shift in tabular machine learning tasks are still lacking, despite the widespread real-world use of tabular data and the differences between the models used for tabular data and those used for text and images. As a consequence, the robustness of tabular models to distribution shift is poorly understood. To address this issue, we introduce TableShift, a distribution shift benchmark for tabular data. TableShift contains 15 binary classification tasks, each with an associated shift, and includes a diverse set of data sources, prediction targets, and distribution shifts. The benchmark covers domains including finance, education, public policy, healthcare, and civic participation, and is accessible using only a few lines of Python code via the TableShift API. We conduct a large-scale study comparing several state-of-the-art tabular data models alongside robust learning and domain generalization methods on the benchmark tasks. Our study demonstrates (1) a linear trend between in-distribution (ID) and out-of-distribution (OOD) accuracy; (2) that domain robustness methods can reduce shift gaps, but at the cost of reduced ID accuracy; and (3) a strong relationship between the shift gap (the difference between ID and OOD performance) and shifts in the label distribution. The benchmark data, Python package, model implementations, and more information about TableShift are available at https://github.com/mlfoundations/tableshift and https://tableshift.org.
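The shift gap mentioned in the abstract (the difference between ID and OOD performance) and a simple measure of label distribution shift can be illustrated with a minimal sketch. This is not the TableShift API: the function names are hypothetical, the label lists are toy values invented purely for illustration, and using the absolute difference in positive-class rate as the label-shift measure is an assumption (for binary labels it equals the total variation distance between the two label distributions).

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true binary labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def shift_gap(id_acc: float, ood_acc: float) -> float:
    """ID accuracy minus OOD accuracy; positive means OOD performance is worse."""
    return id_acc - ood_acc


def positive_rate(y: Sequence[int]) -> float:
    """Share of examples with label 1 in a binary label vector."""
    return sum(y) / len(y)


def label_shift(y_id: Sequence[int], y_ood: Sequence[int]) -> float:
    """Absolute difference in positive-class rate (an assumed, illustrative measure)."""
    return abs(positive_rate(y_id) - positive_rate(y_ood))


# Toy labels and predictions, invented for illustration only (not benchmark results).
y_id_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_id_pred = [1, 0, 1, 0, 1, 0, 0, 0]
y_ood_true = [1, 1, 1, 0, 1, 1, 1, 0]
y_ood_pred = [1, 0, 1, 0, 0, 1, 1, 0]

id_acc = accuracy(y_id_true, y_id_pred)
ood_acc = accuracy(y_ood_true, y_ood_pred)
gap = shift_gap(id_acc, ood_acc)
shift = label_shift(y_id_true, y_ood_true)

print(f"ID accuracy:  {id_acc:.3f}")
print(f"OOD accuracy: {ood_acc:.3f}")
print(f"Shift gap:    {gap:.3f}")
print(f"Label shift:  {shift:.3f}")
```

Computing both quantities per task is what allows a comparison like the one described in finding (3); the sketch shows only the bookkeeping, not the study's methodology.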
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > North Carolina (0.04)
- North America > United States > Hawaii (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Overview (0.87)
- Law (1.00)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
- Health & Medicine > Public Health (1.00)