Goto

Collaborating Authors

 nice-slam


Towards Open World NeRF-Based SLAM

Lisus, Daniil, Holmes, Connor, Waslander, Steven

arXiv.org Artificial Intelligence

Neural Radiance Fields (NeRFs) offer versatility and robustness in map representations for Simultaneous Localization and Mapping (SLAM) tasks. This paper extends NICE-SLAM, a recent state-of-the-art NeRF-based SLAM algorithm capable of producing high quality NeRF maps. However, depending on the hardware used, the required number of iterations to produce these maps often makes NICE-SLAM run at less than real time. Additionally, the estimated trajectories fail to be competitive with classical SLAM approaches. Finally, NICE-SLAM requires a grid covering the considered environment to be defined prior to runtime, making it difficult to extend into previously unseen scenes. This paper seeks to make NICE-SLAM more open-world-capable by improving the robustness and tracking accuracy, and generalizing the map representation to handle unconstrained environments. This is done by improving measurement uncertainty handling, incorporating motion information, and modelling the map as having an explicit foreground and background. It is shown that these changes are able to improve tracking accuracy by 85% to 97% depending on the available resources, while also improving mapping in environments with visual information extending outside of the predefined grid.


Vox-Fusion: Dense Tracking and Mapping with Voxel-based Neural Implicit Representation

Yang, Xingrui, Li, Hai, Zhai, Hongjia, Ming, Yuhang, Liu, Yuqian, Zhang, Guofeng

arXiv.org Artificial Intelligence

In this work, we present a dense tracking and mapping system named Vox-Fusion, which seamlessly fuses neural implicit representations with traditional volumetric fusion methods. Our approach is inspired by the recently developed implicit mapping and positioning system and further extends the idea so that it can be freely applied to practical scenarios. Specifically, we leverage a voxel-based neural implicit surface representation to encode and optimize the scene inside each voxel. Furthermore, we adopt an octree-based structure to divide the scene and support dynamic expansion, enabling our system to track and map arbitrary scenes without knowing the environment like in previous works. Moreover, we proposed a high-performance multi-process framework to speed up the method, thus supporting some applications that require real-time performance. The evaluation results show that our methods can achieve better accuracy and completeness than previous methods. We also show that our Vox-Fusion can be used in augmented reality and virtual reality applications. Our source code is publicly available at https://github.com/zju3dv/Vox-Fusion.


Dense RGB SLAM with Neural Implicit Maps

Li, Heng, Gu, Xiaodong, Yuan, Weihao, Yang, Luwei, Dong, Zilong, Tan, Ping

arXiv.org Artificial Intelligence

There is an emerging trend of using neural implicit functions for map representation in Simultaneous Localization and Mapping (SLAM). Some pioneer works have achieved encouraging results on RGB-D SLAM. In this paper, we present a dense RGB SLAM method with neural implicit map representation. To reach this challenging goal without depth input, we introduce a hierarchical feature volume to facilitate the implicit map decoder. This design effectively fuses shape cues across different scales to facilitate map reconstruction. Our method simultaneously solves the camera motion and the neural implicit map by matching the rendered and input video frames. To facilitate optimization, we further propose a photometric warping loss in the spirit of multi-view stereo to better constrain the camera pose and scene geometry. We evaluate our method on commonly used benchmarks and compare it with modern RGB and RGB-D SLAM systems. Our method achieves favorable results than previous methods and even surpasses some recent RGB-D SLAM methods.The code is at poptree.github.io/DIM-SLAM/.


Orbeez-SLAM: A Real-time Monocular Visual SLAM with ORB Features and NeRF-realized Mapping

Chung, Chi-Ming, Tseng, Yang-Che, Hsu, Ya-Ching, Shi, Xiang-Qian, Hua, Yun-Hung, Yeh, Jia-Fong, Chen, Wen-Chin, Chen, Yi-Ting, Hsu, Winston H.

arXiv.org Artificial Intelligence

A spatial AI that can perform complex tasks through visual signals and cooperate with humans is highly anticipated. To achieve this, we need a visual SLAM that easily adapts to new scenes without pre-training and generates dense maps for downstream tasks in real-time. None of the previous learning-based and non-learning-based visual SLAMs satisfy all needs due to the intrinsic limitations of their components. In this work, we develop a visual SLAM named Orbeez-SLAM, which successfully collaborates with implicit neural representation and visual odometry to achieve our goals. Moreover, Orbeez-SLAM can work with the monocular camera since it only needs RGB inputs, making it widely applicable to the real world. Results show that our SLAM is up to 800x faster than the strong baseline with superior rendering outcomes. Code link: https://github.com/MarvinChung/Orbeez-SLAM.