Grounded Semantic Composition for Visual Scenes