LongProLIP: A Probabilistic Vision-Language Model with Long Context Text

Open in new window