–Neural Information Processing Systems
Humans spend most of their daily lives indoors. Yet computationally creating digital twins of these 3D spaces from captured images remains challenging. Factors such as the difficulty of accurate camera pose estimation from indoor images [28, 11, 1] and structural distortions in the resulting 3D reconstructions [22, 12, 21] hinder the development of robust, accurate, and user-friendly solutions for replicating indoor scenes in the digital world. Because indoor scenes are typically rich in planar structures such as floors, ceilings, and walls, as well as planar furniture like tables and cabinets, planar primitives are a well-suited representation for accurate 3D reconstruction of indoor scenes. As a result, planar 3D reconstruction has attracted significant interest from the research community in recent years. Existing approaches include feedforward solutions in monocular [40, 16, 27, 24, 18, 42] and two-view [11, 1, 28] settings, as well as per-scene optimization approaches [29, 38, 3, 9] that leverage posed multi-view inputs with the assistance of feedforward models. However, these approaches face two key limitations:

Annotation dependence for feedforward methods: Learning feedforward models [36, 24, 28] typically requires accurate plane masks and 3D plane annotations from monocular or binocular inputs.
Neural Information Processing Systems
Jun-23-2026, 04:09:48 GMT
- Genre:
- Research Report > Experimental Study (1.00)
- Technology:
- Information Technology > Artificial Intelligence
- Vision (1.00)
- Machine Learning (1.00)
- Representation & Reasoning > Optimization (0.48)
- Natural Language > Large Language Model (0.47)