Text Descriptions are Compressive and Invariant Representations for Visual Learning