Abstract:
To address the slow convergence, susceptibility to local optima, and limited adaptability of traditional multi-UAV path planning algorithms in complex environments, we propose a multi-UAV 3D path planning strategy that combines grey wolf optimization (GWO) with reinforcement learning. To improve the overall path planning performance of the traditional GWO algorithm, we design a dynamic nonlinear convergence factor that strengthens early-stage exploration and late-stage accuracy, introduce a Lévy flight strategy that raises the probability of escaping local optima, and refine the position update strategy to balance global search against local exploitation. To further improve adaptability, we integrate a deep Q-network (DQN) into the improved GWO algorithm and optimize the DQN state space and reward function. We evaluate the proposed DQN-GWO algorithm in comparative simulation experiments across different scenarios. The results show that, compared with traditional optimization algorithms, DQN-GWO maintains comparable runtime while achieving an average 4%–6% improvement in path fitness, a 2%–8% reduction in path length, and a 30%–60% decrease in maximum steering angle. These gains improve the quality and robustness of the planned paths and make DQN-GWO an effective intelligent optimization approach for collaborative multi-UAV path planning in complex environments.
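As a rough illustration of two of the GWO modifications named above, the following Python sketch pairs a nonlinear convergence factor with a Lévy flight perturbation. The abstract gives no formulas, so every specific here is an assumption rather than the paper's method: the cosine decay schedule, the Mantegna sampling of Lévy steps with index 1.5, the 0.01 step scale, the greedy acceptance rule, and the names `convergence_factor`, `levy_step`, and `improved_gwo`. The DQN component and the UAV path cost are omitted; a sphere function stands in as the fitness.

```python
import math
import random

LEVY_BETA = 1.5  # assumed Lévy stability index (not given in the abstract)
# Scale of the numerator Gaussian in Mantegna's algorithm.
_SIGMA_U = (
    math.gamma(1 + LEVY_BETA) * math.sin(math.pi * LEVY_BETA / 2)
    / (math.gamma((1 + LEVY_BETA) / 2) * LEVY_BETA * 2 ** ((LEVY_BETA - 1) / 2))
) ** (1 / LEVY_BETA)


def convergence_factor(t, t_max, a_max=2.0):
    """Assumed nonlinear schedule: cosine decay from a_max (t=0) to 0 (t=t_max)."""
    return a_max * 0.5 * (1.0 + math.cos(math.pi * t / t_max))


def levy_step(rng):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    u = rng.gauss(0.0, _SIGMA_U)
    v = rng.gauss(0.0, 1.0)
    return u / max(abs(v), 1e-12) ** (1 / LEVY_BETA)


def improved_gwo(fitness, dim, bounds, n_wolves=20, t_max=200, seed=0):
    """Minimize `fitness` over a box; returns the best position found."""
    rng = random.Random(seed)
    lo, hi = bounds
    clip = lambda x: min(hi, max(lo, x))
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best = min(wolves, key=fitness)
    for t in range(t_max):
        wolves.sort(key=fitness)
        leaders = [list(w) for w in wolves[:3]]  # alpha, beta, delta
        a = convergence_factor(t, t_max)
        for i, wolf in enumerate(wolves):
            # Standard GWO update: average of the moves toward the three leaders.
            candidate = []
            for d in range(dim):
                total = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    total += leader[d] - A * abs(C * leader[d] - wolf[d])
                candidate.append(clip(total / 3))
            # Lévy perturbation relative to alpha, kept only if it improves fitness.
            jumped = [
                clip(c + 0.01 * levy_step(rng) * (c - leaders[0][d]))
                for d, c in enumerate(candidate)
            ]
            wolves[i] = min(candidate, jumped, key=fitness)
        best = min(wolves + [best], key=fitness)  # simple elitism
    return best


def sphere(x):
    """Stand-in fitness; a real planner would score path length, threats, and turns."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    solution = improved_gwo(sphere, dim=3, bounds=(-10.0, 10.0), seed=1)
    print(solution, sphere(solution))
```

The cosine schedule keeps the factor near its maximum in early iterations and near zero in late ones, which is one way to obtain the early exploration and late precision the abstract describes; other nonlinear schedules would serve the same purpose.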