动态环境下的语义SLAM算法

Semantic SLAM Algorithm Based on Dynamic Environment

  • 摘要: 传统的同时定位与地图构建(SLAM)算法存在着易受动态物体影响和无法提取场景语义信息的问题.为了解决上述问题,提出了一种动态环境下的室外3维语义地图的构建方法.首先在语义分割方面,提出一种基于全卷积网络(fully convolutional network,FCN)和超像素的条件随机场(conditional random field,CRF)对图像进行语义分割,并结合语义信息和对极约束剔除动态物体上的特征点.然后,利用视觉里程计估计相机的运动轨迹,并利用单目深度估计算法获取深度数据,通过深度数据获取场景的3维模型.同时,通过贝叶斯渐进式标签迁移算法将二维语义标签逐步转移到3维点云中,在此基础上,提出一种基于高阶CRF的全局3维地图优化算法,该算法通过时空一致的3维超像素建立CRF的高阶项,增加点云与所属3维区域之间的约束关系来实现语义分割中点云所属类别的边界一致性.经实验验证,本算法在动态场景下较主流算法有更好的位姿估计精度,能够有效地提高单帧图像的语义分割精度并获得全局一致的语义地图.

     

    Abstract: Traditional algorithms for simultaneous localization and mapping, referred to as SLAM algorithms, are easily affected by dynamic objects and cannot extract semantic scene information. To solve these problems, we propose an algorithm for building outdoor three-dimensional (3D) semantic maps in dynamic environments. Firstly, with respect to semantic segmentation, we propose a conditional random field (CRF) image semantic segmentation algorithm based on fully convolutional networks and superpixels. This algorithm combines semantic information with epipolar constraints to remove the feature points of dynamic object. Then, the camera trajectory is estimated by the visual odometer, depth data is obtained by the monocular depth estimation algorithm, and a 3D model is obtained based on the depth data. At the same time, two-dimensional semantic labels are gradually mapped to the 3D point cloud by a Bayesian progressive label migration algorithm. On this basis, we propose a global 3D map optimization algorithm based on a high-order CRF. This algorithm sets up the high-order terms of the CRF based on time-space consistency of 3D superpixels, and adds the constraint relationship between the point cloud and the 3D region to achieve boundary consistency of the point-cloud category during semantic segmentation. Experimental results show that this algorithm obtains more accurate pose estimation than the mainstream algorithm in dynamic scenes, and can effectively improve the accuracy of the semantic segmentation of single-frame images and obtain a globally consistent semantic map.

     

/

返回文章
返回