The Neglected Tails of Vision-Language Models