Volumetric Correspondence Networks for Optical Flow

Oct-10-2024, 20:21:14 GMT–Neural Information Processing Systems

Many classic tasks in vision, such as the estimation of optical flow or stereo disparities, can be cast as dense correspondence matching. Well-known techniques for doing so make use of a cost volume, typically a 4D tensor of match costs between all pixels in a 2D image and their potential matches in a 2D search window. However, layers that process such volumes require significant amounts of memory and compute, making them cumbersome to use in practice. As a result, state-of-the-art (SOTA) networks also employ various heuristics designed to limit volumetric processing, leading to limited accuracy and overfitting. Instead, we introduce several simple modifications that dramatically simplify the use of volumetric layers: (1) volumetric encoder-decoder architectures that efficiently capture large receptive fields, (2) multi-channel cost volumes that capture multi-dimensional notions of pixel similarities, and (3) separable volumetric filtering that significantly reduces computation and parameters while preserving accuracy. Our innovations dramatically improve accuracy over SOTA on standard benchmarks while being significantly easier to work with: training converges in 10x fewer iterations and, most importantly, our networks generalize across correspondence tasks.
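To make the notion of a 4D cost volume concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes per-pixel feature vectors stored as nested lists, a sum-of-squared-differences match cost, and an infinite cost for candidates that fall outside the image; the function names (`cost_volume`, `best_displacement`, `kernel_params`) are hypothetical. The parameter comparison further assumes that "separable" means replacing one k x k x k x k kernel with two k x k kernels, which the abstract does not spell out.

```python
import math


def cost_volume(feat1, feat2, radius):
    """Build volume[y][x][v][u]: the cost of matching pixel (y, x) in feat1
    to pixel (y + v - radius, x + u - radius) in feat2.

    Assumption: cost is the sum of squared feature differences, and
    out-of-image candidates get infinite cost.
    """
    h, w = len(feat1), len(feat1[0])
    d = 2 * radius + 1  # side length of the 2D search window
    volume = [[[[math.inf] * d for _ in range(d)] for _ in range(w)]
              for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for v in range(d):
                for u in range(d):
                    ty, tx = y + v - radius, x + u - radius
                    if 0 <= ty < h and 0 <= tx < w:
                        volume[y][x][v][u] = sum(
                            (a - b) ** 2
                            for a, b in zip(feat1[y][x], feat2[ty][tx])
                        )
    return volume


def best_displacement(volume, y, x, radius):
    """Return the (dy, dx) with the lowest match cost at pixel (y, x)."""
    d = 2 * radius + 1
    _, v, u = min(
        (volume[y][x][v][u], v, u) for v in range(d) for u in range(d)
    )
    return v - radius, u - radius


def kernel_params(k):
    """Weights in one full 4D kernel vs. two 2D kernels (assumed factoring)."""
    return k ** 4, 2 * k ** 2


# Toy example: the second "image" is the first shifted one pixel to the right.
H, W, RADIUS = 3, 4, 1
feat1 = [[[float(y), float(x)] for x in range(W)] for y in range(H)]
feat2 = [[feat1[y][x - 1] if x >= 1 else [-1.0, -1.0] for x in range(W)]
         for y in range(H)]

volume = cost_volume(feat1, feat2, RADIUS)
print(best_displacement(volume, 1, 1, RADIUS))  # lowest-cost (dy, dx)
print(kernel_params(3))                         # (full 4D, separable)
```

The volume holds H x W x (2r+1) x (2r+1) entries, so memory grows with both image size and search window, which is the cost the abstract refers to. Under the factoring assumed here, a k = 3 kernel drops from 81 weights to 18.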



