Visual Localization Method of Autonomous Underwater Vehicle Based on Synthetic Data
JU Ling1,2,3, ZHOU Xingqun2,3, HU Zhiqiang2,3, YANG Yi2,3, LI Liming2,3,4, BAI Shihong1
1. School of Mechanical Engineering, Shenyang Ligong University, Shenyang 110159, China; 2. State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China; 3. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, China; 4. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: Pose datasets for autonomous underwater vehicles (AUVs) are difficult to obtain in underwater scenarios, and existing deep learning-based pose estimation methods therefore cannot be applied directly. This paper proposes an AUV visual localization method based on synthetic data. First, we build a virtual underwater scene in Unity3D and render images with known poses through a virtual camera. Then, we transfer the style of the rendered images to that of the real underwater scene through unpaired image-to-image translation, and combine the translated images with the known poses of the rendered images to obtain a synthetic underwater pose dataset. Finally, we propose a convolutional neural network (CNN) pose estimation method based on local-region keypoint projection: the CNN is trained on the synthetic data to predict the 2D projections of known reference corners, and the resulting 2D-3D point correspondences yield the relative pose through a Perspective-n-Point (PnP) algorithm based on random sample consensus (RANSAC). The effectiveness of the proposed method is examined through quantitative experiments on the rendered and synthetic datasets and qualitative experiments in real underwater scenes. The results show that unpaired image translation effectively eliminates the gap between rendered and real underwater images, and that the proposed local-region keypoint projection method performs more effective 6D pose estimation.
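To make the final step of the pipeline concrete, the following is a minimal, self-contained sketch of PnP-with-RANSAC pose recovery using OpenCV. It is not the authors' implementation: the box-shaped corner layout, the camera intrinsics, and the simulated noisy keypoint predictions are illustrative assumptions standing in for the CNN's actual output.

```python
import numpy as np
import cv2

# 3D reference corners of the target AUV in its body frame (metres).
# The abstract only says "known reference corners"; this box layout is a stand-in.
object_points = np.array([
    [-0.5, -0.2, -0.1], [0.5, -0.2, -0.1], [0.5, 0.2, -0.1], [-0.5, 0.2, -0.1],
    [-0.5, -0.2,  0.1], [0.5, -0.2,  0.1], [0.5, 0.2,  0.1], [-0.5, 0.2,  0.1],
], dtype=np.float64)

# Illustrative pinhole intrinsics for a 640x480 camera.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0,   1.0]])
dist = np.zeros(5)  # assume lens distortion has already been corrected

# Simulate the CNN output: project the corners through a known ground-truth
# pose and add pixel noise, mimicking imperfect 2D keypoint predictions.
rvec_gt = np.array([0.1, -0.2, 0.05])
tvec_gt = np.array([0.3, -0.1, 4.0])
proj, _ = cv2.projectPoints(object_points, rvec_gt, tvec_gt, K, dist)
image_points = proj.reshape(-1, 2) + np.random.normal(0.0, 1.0, (8, 2))

# Recover the relative pose: PnP solved inside a RANSAC loop so that
# outlier keypoint predictions do not corrupt the estimate.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    object_points, image_points, K, dist,
    iterationsCount=100, reprojectionError=8.0)

if ok:
    R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
    print("estimated rotation:\n", R)
    print("estimated translation:", tvec.ravel())
```

RANSAC matters here because a single badly predicted keypoint would otherwise bias the least-squares PnP solution; sampling minimal point subsets and keeping the largest inlier set makes the pose estimate robust to such outliers.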
JU Ling, ZHOU Xingqun, HU Zhiqiang, YANG Yi, LI Liming, BAI Shihong. Visual localization method of autonomous underwater vehicle based on synthetic data[J]. Information and Control, 2023, 52(2): 129-141.