RGB-D SLAM Algorithm Based on Semantic Priors and Depth Constraints for Dynamic Indoor Environments

  • Abstract: Most simultaneous localization and mapping (SLAM) systems estimate pose inaccurately in dynamic scenes. To address this problem, this paper proposes a motion consistency detection algorithm that applies weighted epipolar and depth constraints guided by semantic priors, and uses it to build a visual SLAM system for dynamic indoor scenes. The system first performs semantic segmentation on the input image to obtain the set of potentially moving feature points. It then extracts feature points from the regions of the image not labeled as potentially moving to obtain an initial estimate of the inter-frame transformation. Weighted epipolar and depth constraints are then used to re-check potential outliers (e.g., moving feature points), and the outliers are removed to update the static feature point set. Finally, the static feature point set is used to solve accurately for the camera pose, which is passed to the back end as the initial value for pose optimization. The system was tested repeatedly on nine dynamic scene sequences from the TUM (Technical University of Munich) dataset and three image sequences from the Bonn complex dynamic environment dataset. Compared with DS-SLAM, a state-of-the-art dynamic SLAM system, the root mean squared error (RMSE) of the absolute trajectory error (ATE) is reduced by 10.53% to 93.75%, and the RMSE of the translational and rotational relative pose error (RPE) is reduced by up to 73.44% and 68.73%, respectively. The results show that the proposed method significantly reduces pose estimation error in dynamic environments.
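The motion consistency check described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the weighting scheme or the thresholds, so the weight `1 + prior` (where `prior` is a hypothetical per-point probability of lying on a movable object, taken from the semantic segmentation), the threshold values, and all function names are assumptions made for illustration. The sketch assumes the pose convention P2 = R P1 + t and pixel coordinates with known intrinsics K.

```python
import numpy as np

def fundamental_from_pose(K, R, t):
    """F = K^-T [t]x R K^-1 for the convention P2 = R @ P1 + t."""
    tx = np.array([[0.0, -t[2], t[1]],
                   [t[2], 0.0, -t[0]],
                   [-t[1], t[0], 0.0]])
    K_inv = np.linalg.inv(K)
    return K_inv.T @ tx @ R @ K_inv

def epipolar_distances(F, pts1, pts2):
    """Pixel distance from each point in pts2 to the epipolar line F @ x1."""
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])
    x2 = np.hstack([pts2, ones])
    lines = x1 @ F.T                      # row i holds the line (a, b, c) = F @ x1_i
    num = np.abs(np.sum(lines * x2, axis=1))
    den = np.hypot(lines[:, 0], lines[:, 1])
    return num / den

def depth_residuals(K, R, t, pts1, depth1, depth2):
    """|predicted depth - measured depth| in the second frame."""
    ones = np.ones((pts1.shape[0], 1))
    rays = np.hstack([pts1, ones]) @ np.linalg.inv(K).T
    P1 = rays * depth1[:, None]           # back-project using measured depth
    P2 = P1 @ R.T + t                     # move into the second camera frame
    return np.abs(P2[:, 2] - depth2)

def flag_dynamic(K, R, t, pts1, depth1, pts2, depth2, prior,
                 epi_thresh=1.0, depth_thresh=0.05):
    """Return a boolean mask that is True for points judged to be moving.

    ASSUMPTION: the weight 1 + prior and both thresholds are illustrative
    choices, not values taken from the paper.
    """
    weights = 1.0 + prior
    F = fundamental_from_pose(K, R, t)
    d_epi = epipolar_distances(F, pts1, pts2)
    d_z = depth_residuals(K, R, t, pts1, depth1, depth2)
    return (weights * d_epi > epi_thresh) | (weights * d_z > depth_thresh)
```

Points flagged by `flag_dynamic` would be removed before the final pose solve, so that only the static set contributes to the estimate passed to the back end.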

     
