Multi-UAV Path Planning Algorithm Based on Reinforcement Learning and Grey Wolf Optimization in Complex Environments

常绪成; 朱锋; 张心慧; 党帅龙; 王敬宇; 胡繁

doi:10.13976/j.cnki.xk.2025.3482

Abstract

Traditional multi-UAV path planning algorithms converge slowly, are prone to local optima, and adapt poorly to complex environments. To address these problems, we propose a multi-UAV 3D path planning strategy based on the grey wolf optimization (GWO) algorithm and reinforcement learning. To improve the overall path planning performance of the traditional GWO algorithm in complex environments, we make three changes: we design a dynamic nonlinear convergence factor that strengthens exploration in the early stage and accuracy in the later stage, introduce a Lévy flight strategy that raises the probability of escaping local optima, and refine the position update strategy to balance global search against local exploitation. To further improve adaptability in complex environments, we deeply integrate a deep Q-network (DQN) with the improved GWO algorithm and redesign the DQN state space and reward function. We run comparative simulation experiments on the proposed DQN-GWO algorithm in different scenarios. The results show that, compared with traditional optimization algorithms and at a comparable runtime, DQN-GWO improves the fitness of the converged path by 4% to 6% on average, shortens the path length by 2% to 8%, and reduces the maximum steering angle by 30% to 60%. These gains significantly improve the quality and robustness of planned paths and offer an effective intelligent optimization approach to cooperative path planning for multiple UAVs in complex scenarios.
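The abstract names three changes to GWO but gives no formulas, so the following is only a minimal sketch of how such components might fit together, not the authors' method. The quadratic decay in `convergence_factor`, the Mantegna-style `levy_step`, the `levy_scale` parameter, and the greedy acceptance rule are all assumptions made for illustration; the DQN component and the UAV cost model are omitted entirely.

```python
import math
import random


def convergence_factor(t, t_max, power=2.0):
    """Hypothetical nonlinear schedule: decays from 2 to 0, slowly at first."""
    return 2.0 * (1.0 - (t / t_max) ** power)


def levy_step(beta=1.5, rng=random):
    """One heavy-tailed step drawn with Mantegna's algorithm (assumed variant)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12  # avoid division by zero
    return u / abs(v) ** (1 / beta)


def gwo_update(wolf, leaders, a, rng=random):
    """Standard GWO move: average of the pulls toward alpha, beta and delta."""
    new = []
    for d in range(len(wolf)):
        total = 0.0
        for leader in leaders:
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            total += leader[d] - A * abs(C * leader[d] - wolf[d])
        new.append(total / len(leaders))
    return new


def optimize(fitness, dim, bounds, n_wolves=20, t_max=100, levy_scale=0.01, seed=0):
    """Minimize `fitness` over a box; returns the best position found."""
    rng = random.Random(seed)
    lo, hi = bounds
    pack = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    for t in range(t_max):
        pack.sort(key=fitness)
        leaders = [list(w) for w in pack[:3]]  # alpha, beta, delta
        a = convergence_factor(t, t_max)
        for i, wolf in enumerate(pack):
            cand = gwo_update(wolf, leaders, a, rng)
            # Assumed Lévy perturbation relative to the alpha wolf, then clamp.
            cand = [min(hi, max(lo, x + levy_scale * levy_step(rng=rng)
                                * (x - leaders[0][d])))
                    for d, x in enumerate(cand)]
            if fitness(cand) < fitness(wolf):  # greedy acceptance (assumption)
                pack[i] = cand
    return min(pack, key=fitness)
```

A call such as `optimize(lambda p: sum(x * x for x in p), dim=3, bounds=(-10.0, 10.0))` exercises the sketch on a simple sphere function. A real planner would replace that function with a path cost covering length, obstacle clearance, and turning angle.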
