Object Scene Representation Transformer

Neural Information Processing Systems 

A compositional understanding of the world in terms of objects and their geometry in 3D space is considered a cornerstone of human cognition.