Canonical Correlation Inference for Mapping Abstract Scenes to Text