FENG Chao, LI Zhibiao, ZHANG Lei, ZHANG Guoyong, CHEN Jin, TANG Fei. Large Language Model Assisted Multi-modal Joint Relocalization Method[J]. INFORMATION AND CONTROL. DOI: 10.13976/j.cnki.xk.2025.1731

Large Language Model Assisted Multi-modal Joint Relocalization Method

  • A large language model enhanced multi-modal fusion localization method is proposed to address relocalization failures of mobile robots caused by seasonal variation and structural scene changes. A LiDAR-visual collaborative perception framework is constructed to remove the dependence of traditional relocalization systems on environmental stability, and a semantically guided, two-stage relocalization mechanism is designed by embedding a multi-modal large language model into the localization decision loop. In the coarse localization stage, DINOv2-based visual global descriptors are fused with universal scene-text semantic fingerprints (e.g., building features) parsed by the large language model to retrieve candidate poses across modalities. In the fine localization stage, a point cloud registration algorithm constrained by planar and linear features suppresses interference from dynamic objects. To evaluate the method, the complex scene changes typical of industrial parks are simulated, a benchmark dataset covering seasonal variation and dynamic spatial changes is constructed, and comparative experiments against traditional algorithms are conducted on both public and self-recorded datasets. The results show that relocalization accuracy remains above 84.5% under normal illumination, confirming the system's stability and robustness in complex, dynamic scenes. Illustrative sketches of the two stages follow.
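The abstract gives no implementation details for the coarse stage, so the following is a minimal sketch under stated assumptions: DINOv2 global descriptors (here random 768-d stand-ins) are precomputed for the query image and every map keyframe, and the multi-modal large language model has already parsed each image into a set of scene-text tags. The fusion weight `alpha`, the Jaccard tag score, and all function names are hypothetical, not taken from the paper.

```python
import numpy as np

def cosine_similarity(query: np.ndarray, database: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query descriptor and every database descriptor."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    return db @ q

def text_fingerprint_score(query_tags: set, db_tags: list) -> np.ndarray:
    """Jaccard overlap between the LLM-parsed scene-text tags of the query
    image and the tags stored with each map keyframe."""
    scores = np.zeros(len(db_tags))
    for i, tags in enumerate(db_tags):
        union = query_tags | tags
        scores[i] = len(query_tags & tags) / len(union) if union else 0.0
    return scores

def retrieve_candidate_poses(query_desc, query_tags,
                             db_descs, db_tags, db_poses,
                             alpha=0.7, top_k=3):
    """Fuse visual and semantic similarity; return the top-k candidate poses."""
    fused = (alpha * cosine_similarity(query_desc, db_descs)
             + (1 - alpha) * text_fingerprint_score(query_tags, db_tags))
    order = np.argsort(fused)[::-1][:top_k]
    return [(db_poses[i], float(fused[i])) for i in order]

# Toy usage: random stand-in descriptors and hand-written tag sets.
rng = np.random.default_rng(0)
db_descs = rng.normal(size=(4, 768))
db_tags = [{"gate 3", "warehouse B"}, {"canteen"}, {"gate 3"}, {"loading dock"}]
db_poses = [np.eye(4) for _ in range(4)]  # one SE(3) pose per keyframe
candidates = retrieve_candidate_poses(db_descs[2] + 0.1 * rng.normal(size=768),
                                      {"gate 3"}, db_descs, db_tags, db_poses)
```

Weighting the visual score more heavily (`alpha = 0.7`) reflects that scene text is sparse in many frames; the paper may combine the two modalities differently.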
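For the fine stage, the abstract only states that registration is constrained by planar and linear features. A generic reading is sketched below: per-point planarity and linearity are computed from the eigenvalues of the local covariance, points that are neither planar nor linear (typical of vegetation, pedestrians, and other dynamic clutter) are discarded, and the surviving points are aligned with standard point-to-plane Gauss-Newton iterations. The 0.6 thresholds and every function here are assumptions, not the paper's algorithm.

```python
import numpy as np

def so3_exp(w: np.ndarray) -> np.ndarray:
    """Rodrigues' formula: rotation matrix from an axis-angle vector."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    k = w / th
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def geometric_features(cloud: np.ndarray, k: int = 10):
    """Per-point linearity and planarity from the eigenvalues of the local
    covariance (l1 >= l2 >= l3), plus the local surface normal."""
    n = len(cloud)
    lin, pla = np.zeros(n), np.zeros(n)
    normals = np.zeros((n, 3))
    for i in range(n):
        d = np.linalg.norm(cloud - cloud[i], axis=1)  # brute-force kNN;
        nbrs = cloud[np.argsort(d)[:k]]               # use a KD-tree in practice
        w, v = np.linalg.eigh(np.cov(nbrs.T))         # ascending eigenvalues
        l3, l2, l1 = w
        lin[i] = (l1 - l2) / max(l1, 1e-9)
        pla[i] = (l2 - l3) / max(l1, 1e-9)
        normals[i] = v[:, 0]                          # smallest-eigenvalue axis
    return lin, pla, normals

def register_point_to_plane(src, tgt, tgt_normals, iters=15):
    """Gauss-Newton point-to-plane alignment of src onto tgt."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        cur = src @ R.T + t
        # nearest-neighbour correspondences (brute force for clarity)
        idx = np.argmin(((cur[:, None] - tgt[None]) ** 2).sum(-1), axis=1)
        q, nrm = tgt[idx], tgt_normals[idx]
        r = np.einsum('ij,ij->i', nrm, cur - q)       # signed plane distance
        J = np.hstack([np.cross(cur, nrm), nrm])      # rows: [p x n, n]
        dx = np.linalg.lstsq(J, -r, rcond=None)[0]
        dR = so3_exp(dx[:3])
        R, t = dR @ R, dR @ t + dx[3:]
    return R, t

# Keep only strongly planar or linear points before registering, which
# discards the diffuse clutter typical of dynamic objects (thresholds
# and variable names are hypothetical):
#   lin, pla, normals = geometric_features(map_scan)
#   stable = (pla > 0.6) | (lin > 0.6)
#   R, t = register_point_to_plane(query_scan, map_scan[stable], normals[stable])
```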
