SRNN: Spatiotemporal Relational Neural Network for Intuitive Physics Understanding
arXiv.org Artificial Intelligence
Human prowess in intuitive physics remains unmatched by machines. To bridge this gap, we argue for a fundamental shift towards brain-inspired computational principles. This paper introduces the Spatiotemporal Relational Neural Network (SRNN), a model that establishes a unified neural representation for object attributes, relations, and timeline, with computations governed by a Hebbian "Fire Together, Wire Together" mechanism across dedicated What and How pathways. This unified representation is directly used to generate structured linguistic descriptions of the visual scene, bridging perception and language within a shared neural substrate. On the CLEVRER benchmark, SRNN achieves competitive performance, thereby confirming its capability to represent essential spatiotemporal relations from the visual stream. Cognitive ablation analysis further reveals a benchmark bias, outlining a path for a more holistic evaluation. Finally, the white-box nature of SRNN enables precise pinpointing of error root causes. Our work provides a proof-of-concept that confirms the viability of translating key principles of biological intelligence into engineered systems for intuitive physics understanding in constrained environments.
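For readers unfamiliar with the Hebbian principle named in the abstract, the textbook rule strengthens a connection in proportion to the co-activation of the two units it joins. The sketch below illustrates only that generic rule. The abstract does not specify SRNN's actual update, so the function name `hebbian_update`, the learning rate `eta`, and the toy activations are all illustrative assumptions rather than details of the paper.

```python
def hebbian_update(weights, pre, post, eta=0.1):
    """Return new weights after one textbook Hebbian step.

    dw[i][j] = eta * post[i] * pre[j]
    Generic illustration only (assumed form); not the update rule used by SRNN.
    """
    return [
        [w + eta * post_i * pre_j for w, pre_j in zip(row, pre)]
        for row, post_i in zip(weights, post)
    ]


# Toy example: 2 post-synaptic units, 3 pre-synaptic units, weights start at zero.
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
pre = [1.0, 0.0, 1.0]   # pre-synaptic units 0 and 2 fire
post = [1.0, 0.0]       # only post-synaptic unit 0 fires

weights = hebbian_update(weights, pre, post)
print(weights)  # [[0.1, 0.0, 0.1], [0.0, 0.0, 0.0]]
```

Only the connections whose two endpoints were active together (post unit 0 with pre units 0 and 2) are strengthened; every other weight stays at zero, which is the "fire together, wire together" behaviour in its simplest form.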
Nov-20-2025
- Country:
- Asia > Japan
- Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > United Kingdom
- England > Oxfordshire > Oxford (0.04)
- Genre:
- Research Report (0.64)
- Industry:
- Health & Medicine > Therapeutic Area > Neurology (0.68)
- Technology:
- Information Technology > Artificial Intelligence
- Cognitive Science (1.00)
- Machine Learning > Neural Networks
- Deep Learning (0.68)
- Natural Language > Large Language Model (0.97)
- Representation & Reasoning (1.00)
- Vision (1.00)