Fully Distributed, Flexible Compositional Visual Representations via Soft Tensor Products

Neural Information Processing Systems 

Since the inception of the classicalist vs. connectionist debate, it has been argued that the ability to systematically combine symbol-like entities into compositional representations is crucial for human intelligence. In connectionist systems, the field of disentanglement has gained prominence for its ability to produce explicitly compositional representations; however, it relies on a fundamentally symbolic representation of compositional structure that clashes with the distributed foundations of deep learning. To resolve this tension, we extend Smolensky's Tensor Product Representation (TPR) and introduce the Soft TPR, a representational form that encodes compositional structure in an inherently distributed, flexible manner, along with a theoretically-principled architecture designed specifically to learn Soft TPRs. Comprehensive evaluations in the visual representation learning domain demonstrate that the Soft TPR framework consistently outperforms conventional disentanglement alternatives -- achieving state-of-the-art disentanglement, boosting representation learner convergence, and delivering superior sample efficiency and low-sample-regime performance in downstream tasks.
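For readers unfamiliar with Smolensky's TPR, the following is a minimal sketch of the classical binding operation that the Soft TPR relaxes: each filler vector is bound to a role vector via an outer product, and the bindings are summed into a single distributed tensor. This is an illustration of the standard TPR only, not of the paper's Soft TPR; the dimensions and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration
d_filler, n_roles = 4, 3

# Orthonormal role vectors (rows of an orthogonal matrix) permit exact unbinding
roles = np.linalg.qr(rng.standard_normal((n_roles, n_roles)))[0]
fillers = rng.standard_normal((n_roles, d_filler))

# Classical TPR: T = sum_i  filler_i (outer product) role_i
T = sum(np.outer(fillers[i], roles[i]) for i in range(n_roles))

# Unbinding: because roles are orthonormal, T @ role_j recovers filler_j
recovered = T @ roles[1]
assert np.allclose(recovered, fillers[1])
```

Because the summed tensor superimposes all role-filler bindings in one continuous object, the structure is distributed across every coordinate rather than stored in dedicated slots, which is the property the abstract contrasts with slot-based disentangled representations.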