CNN²: Viewpoint Generalization via a Binocular Vision

Oct-11-2024, 01:44:59 GMT–Neural Information Processing Systems

Convolutional Neural Networks (CNNs) have laid the foundation for many techniques in various applications. Although CNNs achieve remarkable performance in some tasks, their 3D viewpoint generalizability is still far behind human visual capabilities. Recent efforts, such as Capsule Networks, have been made to address this issue, but these new models are either hard to train or incompatible with existing CNN-based techniques specialized for different applications. Observing that humans use binocular vision to understand the world, we study in this paper whether the 3D viewpoint generalizability of CNNs can be achieved via binocular vision. We propose CNN², a CNN that takes two images as input, resembling the process of an object being viewed from the left eye and the right eye.
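To make the two-image input idea concrete, the following is a minimal pure-Python sketch of a "binocular" feature extractor. It is not the CNN² architecture, which the abstract does not specify. Every name here (`conv2d_valid`, `relu`, `binocular_features`) and the choice to share one kernel across both views are assumptions made for illustration only.

```python
# Hypothetical sketch of a two-input convolutional feature extractor.
# NOT the paper's CNN² model: function names, the shared kernel, and the
# channel-stacking step are all assumptions made for illustration.

def conv2d_valid(image, kernel):
    """2D cross-correlation without padding on list-of-lists inputs."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Element-wise rectifier: negative responses are clipped to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def binocular_features(left, right, kernel):
    """Apply the same kernel to both views and return them as two channels."""
    if len(left) != len(right) or len(left[0]) != len(right[0]):
        raise ValueError("left and right images must have the same shape")
    return [relu(conv2d_valid(left, kernel)), relu(conv2d_valid(right, kernel))]


# Toy stereo pair: the same vertical edge, shifted by one column between views.
left = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 1]]
right = [[0, 1, 1, 1],
         [0, 1, 1, 1],
         [0, 1, 1, 1]]

edge_kernel = [[-1.0, 1.0]]  # responds to a dark-to-bright horizontal step

features = binocular_features(left, right, edge_kernel)
for name, channel in zip(("left", "right"), features):
    print(name, channel)
```

In this toy pair the edge response lands one column apart in the two channels. That horizontal offset is the kind of cue a downstream layer could exploit when it sees both views at once.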

cnn, viewpoint generalizability, viewpoint generalization, (3 more...)

Conferences Web Page

Industry:
- Media > Television (0.80)
- Leisure & Entertainment (0.80)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.82)