Reviews: Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation
–Neural Information Processing Systems
The paper proposes an approach for joint 3D layout estimation, 3D object detection, camera pose estimation, and 'holistic scene understanding' (as defined in Song et al. (2015)). More specifically, deep nets, functional mappings (e.g., projections from 3D to 2D points), and loss functions are combined to obtain a holistic interpretation of a scene depicted in a single RGB image. The proposed approach is shown to outperform 3DGP (Choi et al. (2013)) and IM2CAD (Izadinia et al. (2017)) on the SUN RGB-D dataset.

Review Summary: The paper is well written and presents an intuitive approach that is shown to work well when compared to two baselines. For some of the tasks, e.g., 3D layout estimation, stronger baselines exist, and as a reviewer/reader I can't assess how the proposed approach compares to them.
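To make the "functional mappings" mentioned in the summary concrete, the following is a minimal sketch of a 3D-to-2D projection under a simple pinhole camera model. It is not the paper's implementation: the function names (`project_points`, `box_2d_from_corners`), the parameters (`focal`, `cx`, `cy`), and the assumption that points are already expressed in the camera frame are all illustrative choices made here.

```python
def project_points(points_3d, focal, cx, cy):
    """Project camera-frame 3D points (x, y, z) to pixel coordinates.

    Assumes a pinhole camera with a single focal length (in pixels) and
    principal point (cx, cy); these parameters are hypothetical.
    """
    pixels = []
    for x, y, z in points_3d:
        if z <= 0:
            raise ValueError("point must lie in front of the camera (z > 0)")
        # Perspective division followed by a shift to the principal point.
        pixels.append((focal * x / z + cx, focal * y / z + cy))
    return pixels


def box_2d_from_corners(corners_3d, focal, cx, cy):
    """Axis-aligned 2D box (u_min, v_min, u_max, v_max) that encloses
    the projections of the given 3D box corners."""
    pts = project_points(corners_3d, focal, cx, cy)
    us = [u for u, _ in pts]
    vs = [v for _, v in pts]
    return (min(us), min(vs), max(us), max(vs))
```

A mapping of this kind lets a 3D estimate be compared against 2D image evidence, which is one plausible way such projections could enter a loss function.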
Oct-7-2024, 16:02:27 GMT