Multi-modal Situated Reasoning in 3D Scenes