FLAIR: VLM with Fine-grained Language-informed Image Representations