Multi-distribution or collaborative learning involves learning a single predictor that works well across multiple data distributions, using samples from each during training.
This makes it easily extensible and much more expressive than existing toolkits. It supports all 9 and all 10 of the decision-based group metrics of two popular review articles.
Sharpness-A ware Minimization (SAM) is a recently proposed gradient-based optimizer (Foret et al., ICLR 2021) that greatly improves the prediction performance of deep neural networks.