Dynamic Visual SLAM Integrating Deep Learning and Dense Optical Flow

Abstract: Traditional visual simultaneous localization and mapping (SLAM) algorithms work well when objects in the environment are stationary or moving at low speed, but their accuracy and robustness degrade when dynamic disturbances, such as walking pedestrians or moving vehicles, are present in the scene. To address this problem, a dynamic SLAM system based on the Oriented FAST and Rotated BRIEF SLAM3 (ORB-SLAM3) framework is proposed, which integrates the You Only Look At CoefficienTs (YOLACT++) deep learning network into ORB-SLAM3 to detect dynamic targets. A dense optical flow field is extracted and combined with visual geometry to determine the motion attributes of scene objects, and a motion-level transfer strategy is proposed that fuses the instance segmentation network with the dense optical flow field to jointly optimize the efficiency and accuracy of the SLAM system. Test results on the public TUM dataset show that the proposed system performs well in dynamic scenes: compared with ORB-SLAM3, the root mean square error, mean, median, and standard deviation of the trajectory error improve by approximately 60% in low-dynamic scenes and by over 90% in high-dynamic scenes. Real-world experiments in a corridor scene show that the proposed system effectively removes feature points on dynamic targets during feature extraction, thereby preserving system accuracy.
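The abstract describes the pipeline only at a high level, so the following Python/OpenCV sketch is illustrative rather than the paper's implementation: it rejects feature points that fall either on a segmented dynamic instance or in regions whose optical flow is inconsistent with camera ego-motion. The function name `filter_dynamic_keypoints`, the Farneback flow (a stand-in for whatever dense flow method the system actually uses), the median-flow ego-motion proxy, and the `flow_thresh` parameter are all assumptions; the paper obtains its masks from YOLACT++ inside ORB-SLAM3, which is a C++ system.

```python
# Illustrative sketch only (assumptions noted above); not the paper's code.
# Rejects keypoints on dynamic objects by combining an instance-segmentation
# mask with a dense optical flow field.
import cv2
import numpy as np

def filter_dynamic_keypoints(prev_gray, curr_gray, keypoints, dynamic_mask,
                             flow_thresh=2.0):
    """Keep keypoints that are neither on a segmented dynamic instance nor
    in regions whose residual flow magnitude exceeds flow_thresh pixels.

    dynamic_mask: uint8 image, nonzero where the segmentation network
                  (YOLACT++ in the paper) labels a potentially dynamic class.
    """
    # Dense optical flow via Farneback, as a stand-in for the paper's method.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)          # per-pixel flow magnitude

    # Subtracting the median magnitude is a crude proxy for camera
    # ego-motion: only independently moving regions keep a large residual.
    residual = np.abs(mag - np.median(mag))

    h, w = dynamic_mask.shape[:2]
    static_kps = []
    for kp in keypoints:
        x = min(max(int(round(kp.pt[0])), 0), w - 1)
        y = min(max(int(round(kp.pt[1])), 0), h - 1)
        if dynamic_mask[y, x]:                  # on a dynamic instance
            continue
        if residual[y, x] > flow_thresh:        # inconsistent with ego-motion
            continue
        static_kps.append(kp)
    return static_kps
```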

     

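The error figures quoted above follow the TUM RGB-D benchmark's absolute trajectory error (ATE) statistics. Below is a minimal sketch of how those four numbers are computed, assuming the estimated and ground-truth trajectories have already been time-associated and rigidly aligned (the Horn/Umeyama alignment step is omitted here).

```python
# Minimal sketch of TUM-style absolute trajectory error (ATE) statistics.
# Assumes est and gt are time-aligned, pre-registered (N, 3) position arrays.
import numpy as np

def ate_statistics(est, gt):
    err = np.linalg.norm(est - gt, axis=1)   # per-pose translational error
    return {
        "rmse":   float(np.sqrt(np.mean(err ** 2))),
        "mean":   float(np.mean(err)),
        "median": float(np.median(err)),
        "std":    float(np.std(err)),
    }
```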