E-MoFlow: Learning Egomotion and Optical Flow from Event Data via Implicit Regularization

Neural Information Processing Systems 

The estimation of optical flow and 6-DoF egomotion, two fundamental tasks in 3D vision, has typically been addressed independently. For neuromorphic vision (e.g., event cameras), however, the lack of robust data association makes solving the two problems separately ill-posed, especially in the absence of ground-truth supervision. Existing works mitigate this ill-posedness either by enforcing smoothness of the flow field via an explicit variational regularizer or by leveraging explicit structure-and-motion priors in the parametrization to improve event alignment. The former introduces bias into the results and adds computational overhead, while the latter, which parametrizes the optical flow in terms of scene depth and camera motion, often converges to suboptimal local minima. To address these issues, we propose an unsupervised pipeline that jointly optimizes egomotion and flow via implicit spatial-temporal and geometric regularization.
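
To make the structure-and-motion parametrization mentioned above concrete, the sketch below evaluates the classical motion-field equation for a calibrated pinhole camera, which expresses the flow at a pixel in terms of scene depth and camera velocity. It is an illustration under stated assumptions, not the formulation used in this work: the function name `motion_field`, the use of normalized image coordinates, and the sign convention (a static scene point seen from a camera moving with linear velocity `v` and angular velocity `w`) are all choices made here for the example.

```python
def motion_field(x, y, depth, v, w):
    """Classical motion field at normalized image point (x, y).

    Assumptions (not taken from the paper): a static scene point at the
    given depth, observed by a camera with linear velocity v = (vx, vy, vz)
    and angular velocity w = (wx, wy, wz), both in the camera frame.
    Returns the image-plane velocity (dx/dt, dy/dt).
    """
    vx, vy, vz = v
    wx, wy, wz = w
    inv_z = 1.0 / depth

    # The translational part scales with inverse depth;
    # the rotational part does not depend on depth at all.
    flow_x = inv_z * (-vx + x * vz) + x * y * wx - (1.0 + x * x) * wy + y * wz
    flow_y = inv_z * (-vy + y * vz) + (1.0 + y * y) * wx - x * y * wy - x * wz
    return flow_x, flow_y


# Forward translation: flow points away from the image center.
forward = motion_field(0.5, 0.0, 2.0, (0.0, 0.0, 1.0), (0.0, 0.0, 0.0))

# Pure rotation about the optical axis: same flow at any depth.
spin_near = motion_field(1.0, 0.0, 1.0, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
spin_far = motion_field(1.0, 0.0, 5.0, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
```

Because depth enters only through the translational term, flow and depth are tightly coupled under this parametrization, which is one way to see why it constrains the flow field so strongly.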