I2DFormer: Learning Image to Document Attention for Zero-Shot Image Classification