Abstract:
Pose datasets for autonomous underwater vehicles (AUVs) are difficult to obtain in underwater scenarios, and existing deep learning-based pose estimation methods therefore cannot be directly applied in this setting. Thus, this paper proposes an AUV visual localization method based on synthetic data. In this method, we first build a virtual underwater scene in Unity3D and render images with known poses through a virtual camera. Then, we transfer the style of the rendered images to that of the real underwater scene through unpaired image translation. Combining the translated images with the known pose annotations of the rendered images yields a synthetic underwater pose dataset. Finally, we propose a convolutional neural network (CNN) pose estimation method based on local region keypoint projections. The CNN is trained on the synthetic data to predict the 2D projections of known reference corners. The relative position and orientation are then recovered from the resulting 2D-3D point pairs with the Perspective-n-Point (PnP) algorithm combined with random sample consensus (RANSAC). The effectiveness of the proposed method is examined through quantitative experiments on rendered and synthetic datasets, as well as qualitative experiments in real underwater scenes. Our experimental results show that unpaired image translation effectively closes the gap between rendered images and real underwater images, and that the proposed local region keypoint projection method enables more effective 6D pose estimation.
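
To make the final pose-recovery step concrete, the following is a minimal Python sketch (not the authors' implementation) of RANSAC-based PnP on CNN-predicted 2D keypoints and their known 3D reference corners, using OpenCV. The function name, the reprojection-error threshold, and the assumption of an undistorted camera are illustrative choices, not details taken from the paper.

```python
# Minimal sketch of the pose-recovery step described in the abstract:
# given CNN-predicted 2D projections of known 3D reference corners,
# estimate the 6D pose with RANSAC-based PnP via OpenCV.
import cv2
import numpy as np

def estimate_pose(object_points, image_points, camera_matrix, dist_coeffs=None):
    """Recover rotation and translation from 2D-3D correspondences.

    object_points: (N, 3) known 3D reference-corner coordinates (model frame).
    image_points:  (N, 2) CNN-predicted 2D projections in the image.
    camera_matrix: (3, 3) camera intrinsic matrix.
    """
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)  # assumed: undistorted camera
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        object_points.astype(np.float64),
        image_points.astype(np.float64),
        camera_matrix,
        dist_coeffs,
        reprojectionError=3.0,  # inlier threshold in pixels (assumed value)
        flags=cv2.SOLVEPNP_ITERATIVE,
    )
    if not ok:
        raise RuntimeError("PnP failed: no consistent pose found")
    R, _ = cv2.Rodrigues(rvec)  # axis-angle vector -> 3x3 rotation matrix
    return R, tvec, inliers
```

RANSAC here serves to reject keypoint predictions with large reprojection error, so a few badly localized corners do not corrupt the estimated pose.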