Large-Scale Gaussian Splatting SLAM
Zhe Xin,
Chenyang Wu,
Penghui Huang,
Yanyong Zhang,
Yinian Mao,
Guoquan Huang
Under Review
Abstract
The recently developed Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) have shown encouraging and impressive results for visual SLAM. However, most representative methods rely on RGB-D sensors and are designed for indoor environments, and the robustness of reconstruction in large-scale outdoor scenarios remains unexplored. This paper introduces a large-scale 3DGS-based visual SLAM system with stereo cameras, termed LSG-SLAM. LSG-SLAM employs a multi-modality strategy to estimate prior poses under large view changes. In tracking, we introduce feature-alignment warping constraints to alleviate the adverse effects of appearance similarity on rendering losses. To scale to large scenes, we introduce continuous Gaussian Splatting submaps that handle unbounded environments with limited memory. Loops are detected between GS submaps via place recognition, and the relative pose between looped keyframes is optimized using rendering and feature warping losses. After global optimization of camera poses and Gaussian points, a structure refinement module further enhances reconstruction quality. With extensive evaluations on the EuRoC and KITTI datasets, LSG-SLAM achieves superior performance over existing neural, 3DGS-based, and even traditional approaches.
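To make the feature-alignment warping constraint concrete, the sketch below warps features from a reference keyframe into the current frame using the depth rendered from the Gaussian map and the current relative pose estimate, then penalizes the feature discrepancy. This is a minimal PyTorch illustration under assumed conventions (pinhole intrinsics K, relative pose T_ref_cur, the function name feature_warping_loss, and externally produced feature maps); it is not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def feature_warping_loss(feat_cur, feat_ref, depth_cur, T_ref_cur, K):
    """Warp reference-keyframe features into the current frame via the rendered
    depth and the current pose estimate, then compare features (hypothetical
    signature, shown only to illustrate the idea of a feature warping loss).

    feat_cur, feat_ref: (1, C, H, W) feature maps of current / reference frame
    depth_cur:          (1, 1, H, W) depth rendered from the Gaussian map
    T_ref_cur:          (4, 4) transform from current-frame to reference-frame coords
    K:                  (3, 3) pinhole camera intrinsics
    """
    _, C, H, W = feat_cur.shape
    device = feat_cur.device

    # Pixel grid of the current frame in homogeneous coordinates.
    v, u = torch.meshgrid(torch.arange(H, device=device),
                          torch.arange(W, device=device), indexing="ij")
    pix = torch.stack([u, v, torch.ones_like(u)], dim=0).float().reshape(3, -1)

    # Back-project with the rendered depth and move points into the reference frame.
    pts_cur = torch.linalg.inv(K) @ pix * depth_cur.reshape(1, -1)
    pts_ref = T_ref_cur[:3, :3] @ pts_cur + T_ref_cur[:3, 3:4]

    # Project into the reference image and normalise to [-1, 1] for grid_sample.
    proj = K @ pts_ref
    uv = proj[:2] / proj[2:].clamp(min=1e-6)
    grid = torch.stack([2.0 * uv[0] / (W - 1) - 1.0,
                        2.0 * uv[1] / (H - 1) - 1.0], dim=-1).reshape(1, H, W, 2)

    # Sample reference features at the warped locations and penalise the mismatch,
    # normalised by the number of valid (in-image) correspondences.
    feat_warped = F.grid_sample(feat_ref, grid, align_corners=True)
    valid = (grid.abs() <= 1.0).all(dim=-1, keepdim=True).permute(0, 3, 1, 2)
    return (valid * (feat_warped - feat_cur).abs()).sum() / (valid.sum() * C).clamp(min=1)
```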
Overview
The overview of LSG-SLAM. For each incoming frame, we apply the multi-modality prior pose estimation strategy and then optimize the current pose with a rendering loss and a feature warping loss. Selected keyframes are used to refine the scene and add new Gaussian points. Loop detection and loop-constraint estimation are performed on the GS submaps, after which all keyframe poses are globally optimized. All Gaussian points are then adjusted according to the relative transformation of their associated keyframes. Once the poses of all frames are optimized, a structure refinement step enhances the reconstruction quality for novel view synthesis.
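The step that adjusts Gaussian points after global pose optimization can be pictured as a rigid re-anchoring of each submap: every Gaussian moves with the correction applied to its associated keyframe. Below is a minimal sketch under assumed conventions (world-frame means, 3x3 world-frame covariances, and hypothetical names T_kf_old / T_kf_new for the keyframe pose before and after optimization); the actual parameterization in LSG-SLAM may differ.

```python
import torch

def apply_keyframe_correction(means, covs, T_kf_old, T_kf_new):
    """Rigidly re-anchor one submap's Gaussians after global pose optimisation
    (an illustrative sketch, not the authors' implementation).

    means:    (N, 3) Gaussian centres in the world frame
    covs:     (N, 3, 3) Gaussian covariance matrices in the world frame
    T_kf_old: (4, 4) world pose of the submap's anchor keyframe before optimisation
    T_kf_new: (4, 4) world pose of the same keyframe after optimisation
    """
    # Correction that maps the old keyframe pose onto the optimised one.
    delta = T_kf_new @ torch.linalg.inv(T_kf_old)
    R, t = delta[:3, :3], delta[:3, 3]

    # Gaussian centres move rigidly with their keyframe ...
    new_means = means @ R.T + t
    # ... and covariances are conjugated by the rotation part only.
    new_covs = R @ covs @ R.T
    return new_means, new_covs
```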
Robustness of LSG-SLAM
Results on KITTI sequence 00. LSG-SLAM enables precise camera tracking and high-fidelity reconstruction in challenging large-scale scenarios. We show the trajectory error before and after loop closure, together with sample keyframes and their associated submaps. PSNR is evaluated after structure refinement.
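For reference, PSNR is derived from the mean squared error between a rendered image and the corresponding ground-truth frame. A minimal version, assuming both images are float tensors normalized to [0, 1], is shown below (standard metric, not tied to any particular implementation).

```python
import torch

def psnr(rendered, gt):
    """Peak signal-to-noise ratio in dB for images in [0, 1]."""
    mse = torch.mean((rendered - gt) ** 2)
    # Clamp avoids -inf when the two images are identical.
    return -10.0 * torch.log10(mse.clamp(min=1e-10))
```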
Experimental Results
Camera tracking results on EuRoC (ATE RMSE [m]).
Rendering results on EuRoC.
Camera tracking results on KITTI (ATE RMSE [m]).
Rendering results on KITTI.
Acknowledgement
The website template was borrowed from Michaël Gharbi and Ben Mildenhall.