Target-Driven Navigation in Low-Light Environments Based on Enhanced Perception Transformer
Graphical Abstract
Abstract
To address the perceptual degradation, low data efficiency, and limited real-time performance that mobile robots encounter during Target-Driven Visual Navigation (TDN) in unknown, low-light environments, this paper proposes an integrated navigation framework based on an Enhanced Perception Transformer (EPT). First, an efficient low-light image enhancement technique, comprising an Enhancement Factor Extraction (EFE) network and a Recurrent Image Enhancement (RIE) process, improves the quality of the raw visual inputs. The EPT encoder then deeply fuses the enhanced visual information with the target state, represented in relative polar coordinates, and generates a goal-oriented scene representation by means of goal tokens and a multi-head self-attention mechanism. Based on this representation, a Soft Actor-Critic (SAC) algorithm performs navigation decision-making. To ensure real-time capability, the framework integrates performance optimization strategies, including input downsampling and Just-In-Time (JIT) compilation. Extensive simulations in Gazebo demonstrate that the optimized EPT-SAC framework achieves high frame rates and low latency, meeting the real-time requirements of mobile robot navigation, and that it outperforms conventional baselines in both navigation success rate and learning efficiency, achieving average success rates of 61.2% and 82.0% in laboratory and warehouse environments, respectively. The framework effectively enhances the recognition of obstacles and target locations in low-light target-driven visual navigation tasks.
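The pipeline summarized above can be sketched in minimal numpy: a recurrent brightening curve stands in for the EFE/RIE enhancement stage, and a goal token (projected from relative polar coordinates) is fused with image-patch tokens via multi-head self-attention. All function names, the exact enhancement curve, and the random projections below are illustrative assumptions, not the paper's implementation; in the real framework the weights are learned and the fused representation feeds the SAC policy.

```python
# Hedged sketch of the EPT perception pipeline (assumptions throughout:
# the enhancement curve, token projections, and dimensions are illustrative).
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def recurrent_enhance(image, factor, steps=4):
    # Recurrent Image Enhancement (assumed form): repeatedly apply a
    # pixel-wise brightening curve x <- x + a*x*(1-x), where `factor`
    # plays the role of the EFE network's output.
    x = image.copy()
    for _ in range(steps):
        x = np.clip(x + factor * x * (1.0 - x), 0.0, 1.0)
    return x

def multi_head_self_attention(tokens, num_heads=4):
    # tokens: (n, d); random projections stand in for learned weights.
    n, d = tokens.shape
    hd = d // num_heads
    Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    out = np.zeros_like(tokens)
    for h in range(num_heads):
        s = slice(h * hd, (h + 1) * hd)
        att = softmax(q[:, s] @ k[:, s].T / np.sqrt(hd))
        out[:, s] = att @ v[:, s]
    return out

d = 32
image = rng.uniform(0.0, 0.2, size=(8, 8))           # dark raw input
enhanced = recurrent_enhance(image, factor=0.8)      # brightened image

patch_tokens = enhanced.reshape(4, 16) @ rng.standard_normal((16, d))  # 4 patches -> d dims
goal = np.array([1.5, 0.3])                          # target state: (distance, bearing)
goal_token = goal @ rng.standard_normal((2, d))      # project polar goal to token space

tokens = np.vstack([goal_token, patch_tokens])       # prepend goal token
fused = multi_head_self_attention(tokens)
scene_repr = fused[0]                                # goal-oriented scene representation
```

The goal token attends over all patch tokens, so its output row serves as the goal-conditioned scene summary that a downstream SAC actor-critic would consume.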