In this blog, I will give a brief introduction to recent Semantic Gaussian Splatting (GS) SLAM papers. The first was accepted to ECCV 2024, and the remaining two were accepted to ACM MM 2024.
SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM
- The first semantic visual SLAM system based on 3DGS
- An integration of rendering, scene understanding, and object-level geometry
- Compared with neural implicit semantic SLAM methods such as DNS-SLAM and SNI-SLAM
- Insight I: Using explicit spatial and semantic information to identify scene content can be instrumental in optimizing camera tracking
- RGB-D SLAM
- Multi-channel Optimization
Method
- The multi-channel Gaussian representation includes semantic color (3-channel, as in LangSplat)
- Tracking loss:
- Assuming constant velocity: $E_{t+1} = E_t + (E_t - E_{t-1})$
- Photometric loss + depth loss + semantic loss
- Keyframe selection and weighting:
- Geometric-based selection: Randomly select pixels of the current frame along with their associated Gaussians, then project these Gaussians onto the camera views of existing keyframes. Keyframes covered by more projected Gaussians are more redundant, so they receive lower weights and are candidates for removal.
- Semantic-based selection: Remove keyframes whose semantic segmentation has a high mIoU with the current frame (i.e., redundant semantic content)
- Mapping loss: Depth loss plus SSIM loss on the color and semantic images
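The constant-velocity pose initialization and the weighted multi-term tracking loss above can be sketched as follows. This is a minimal sketch with NumPy; the 6-vector pose parameterization, the per-pixel L1 losses, and the loss weights are illustrative assumptions, not the paper's exact implementation (SLAM systems typically compose SE(3) transforms and tune their own coefficients):

```python
import numpy as np


def constant_velocity_init(E_t: np.ndarray, E_t_prev: np.ndarray) -> np.ndarray:
    """Initialize the next camera pose as E_{t+1} = E_t + (E_t - E_{t-1}).

    Poses are represented here as 6-vectors (translation + axis-angle)
    purely for simplicity of the sketch.
    """
    return E_t + (E_t - E_t_prev)


def tracking_loss(rendered: dict, observed: dict,
                  w_color: float = 1.0, w_depth: float = 0.5,
                  w_sem: float = 0.1) -> float:
    """Weighted sum of photometric, depth, and semantic L1 losses.

    `rendered` and `observed` are dicts holding 'color', 'depth', and 'sem'
    arrays for the current frame. Weights are illustrative placeholders.
    """
    loss = 0.0
    for key, w in (("color", w_color), ("depth", w_depth), ("sem", w_sem)):
        loss += w * np.abs(rendered[key] - observed[key]).mean()
    return loss
```

In an actual tracker, `constant_velocity_init` would seed the pose that gradient descent on `tracking_loss` then refines, with the Gaussian map held fixed during tracking.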

SemGauss-SLAM: Dense Semantic Gaussian Splatting SLAM
- 16-channel Semantic feature embedding
- Semantic-informed bundle adjustment
- Feature loss and semantic loss
- Results are slightly better than SGS-SLAM
- Semantic-informed Bundle Adjustment:
- Segmentation + RGB + Depth
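The semantic-informed bundle adjustment combines segmentation, RGB, and depth residuals over co-visible frames while jointly refining poses and the Gaussian map. A minimal sketch of the residual stacking, assuming a hypothetical `render_fn` that renders the current map from a given pose (the function names and weights are illustrative, not SemGauss-SLAM's exact formulation):

```python
import numpy as np


def semantic_ba_residuals(frames: list, render_fn,
                          w_rgb: float = 1.0, w_depth: float = 0.5,
                          w_sem: float = 0.2) -> np.ndarray:
    """Stack RGB, depth, and segmentation residuals across keyframes.

    Each frame is a dict with a 'pose' and observed 'rgb', 'depth', 'sem'
    maps; `render_fn(pose)` returns the corresponding rendered maps from
    the current Gaussian map. A BA solver would minimize the squared norm
    of this residual vector over both poses and Gaussian parameters.
    """
    residuals = []
    for frame in frames:
        rendered = render_fn(frame["pose"])
        residuals.append(w_rgb * (rendered["rgb"] - frame["rgb"]).ravel())
        residuals.append(w_depth * (rendered["depth"] - frame["depth"]).ravel())
        residuals.append(w_sem * (rendered["sem"] - frame["sem"]).ravel())
    return np.concatenate(residuals)
```

Feeding semantic residuals into BA is what distinguishes this step from a purely photometric refinement: consistent segmentation across views constrains poses in regions where color and depth alone are ambiguous.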