The multi-modal universe of fast-fashion: the Visuelle 2.0 benchmark