Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling
