A General Protocol to Probe Large Vision Models for 3D Physical Understanding
Neural Information Processing Systems
Our objective in this paper is to probe large vision models to determine to what extent they 'understand' different physical properties of the 3D scene depicted in an image. To this end, we make the following contributions: (i) We introduce a general and lightweight protocol to evaluate whether features of an off-the-shelf large vision model encode a number of physical 'properties' of the 3D scene, by training discriminative classifiers on the features for these properties. The probes are applied to datasets of real images with annotations for each property.
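The probing idea described above, training a discriminative classifier on frozen features, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the probe is a logistic-regression classifier fitted by gradient descent, the "features" are synthetic vectors standing in for the output of a frozen vision model, and the names `train_linear_probe`, `probe_accuracy` and `make_synthetic_features` are hypothetical helpers.

```python
import numpy as np


def train_linear_probe(features, labels, lr=0.1, epochs=500, l2=1e-3):
    """Fit a logistic-regression probe on fixed features.

    features: (n, d) array, assumed to come from a frozen model.
    labels:   (n,) array of binary property annotations in {0, 1}.
    Only the probe weights are trained; the features are never updated.
    """
    n, d = features.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        logits = features @ w + b
        probs = 1.0 / (1.0 + np.exp(-logits))
        err = probs - labels  # gradient of the log loss w.r.t. the logits
        w -= lr * (features.T @ err / n + l2 * w)
        b -= lr * err.mean()
    return w, b


def probe_accuracy(w, b, features, labels):
    """Fraction of held-out samples whose property the probe predicts correctly."""
    preds = (features @ w + b > 0).astype(int)
    return float((preds == labels).mean())


def make_synthetic_features(n, d, rng, direction):
    """Hypothetical stand-in for frozen-model features (not real data).

    The binary property is encoded as a shift along `direction`,
    on top of unit-variance Gaussian noise.
    """
    labels = rng.integers(0, 2, size=n)
    feats = rng.normal(size=(n, d)) + np.outer(2 * labels - 1, direction)
    return feats, labels


rng = np.random.default_rng(0)
direction = np.zeros(16)
direction[0] = 2.0  # the property is linearly encoded in one dimension

train_x, train_y = make_synthetic_features(400, 16, rng, direction)
test_x, test_y = make_synthetic_features(200, 16, rng, direction)

w, b = train_linear_probe(train_x, train_y)
acc = probe_accuracy(w, b, test_x, test_y)
print(f"held-out probe accuracy: {acc:.3f}")
```

In a setting like the one the abstract describes, the synthetic arrays would be replaced by features extracted once from the off-the-shelf model on annotated real images. Held-out probe accuracy then serves as a rough indicator of how well the features encode the property.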
Dec-25-2025, 17:26:17 GMT