Citation: XU Jia, HU Chunhe. Autonomous Collision Avoidance Method of UAV Based on Distributed Multi-experience Pool[J]. INFORMATION AND CONTROL, 2023, 52(4): 432-443. DOI: 10.13976/j.cnki.xk.2023.2188
In this study, we propose DMEP-DDPG, a collision avoidance method that combines distributed multi-experience pools with the deep deterministic policy gradient algorithm, to meet the demand for efficient autonomous collision avoidance when multiple unmanned aerial vehicles (multi-UAVs) cooperate on a task. The method builds on data-driven reinforcement learning and lets each UAV avoid collisions autonomously in a multi-UAV environment using only its own sensor data. We first design a system reward mechanism based on a bootstrap reward function to address the sparse-reward problem of long-horizon reinforcement learning tasks. Because the low sample efficiency of a single experience pool makes policy convergence difficult, we then construct a deterministic policy gradient framework that is updated from distributed multiple experience pools. Finally, we test the collision avoidance performance of DMEP-DDPG in multi-UAV collaborative missions and compare its performance metrics with those of other learning-based collision avoidance strategies. The experimental results verify the feasibility and effectiveness of the DMEP-DDPG method.
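The abstract does not give the internals of the reward mechanism or of the multi-pool update, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes one bounded experience pool per UAV, a training batch drawn uniformly across the non-empty pools, and a dense reward built from progress toward a goal. The names `MultiExperiencePool` and `shaped_reward`, and all weights, are hypothetical and do not come from the paper.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (state, action, reward, next_state, done); the layout is an assumption.
Transition = Tuple[tuple, tuple, float, tuple, bool]


def shaped_reward(prev_goal_dist: float, curr_goal_dist: float,
                  collided: bool, reached: bool,
                  progress_weight: float = 1.0,
                  collision_penalty: float = 10.0,
                  goal_bonus: float = 10.0) -> float:
    """Hypothetical dense reward: terminal events dominate, otherwise
    the agent is rewarded for reducing its distance to the goal."""
    if collided:
        return -collision_penalty
    if reached:
        return goal_bonus
    return progress_weight * (prev_goal_dist - curr_goal_dist)


class MultiExperiencePool:
    """Hypothetical container: one bounded replay pool per UAV."""

    def __init__(self, num_agents: int, capacity_per_pool: int,
                 seed: int = 0) -> None:
        # deque(maxlen=...) silently drops the oldest transition when full.
        self.pools: List[Deque[Transition]] = [
            deque(maxlen=capacity_per_pool) for _ in range(num_agents)
        ]
        self.rng = random.Random(seed)

    def add(self, agent_id: int, transition: Transition) -> None:
        self.pools[agent_id].append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw a batch by picking a non-empty pool uniformly at random,
        then a transition uniformly from that pool (with replacement)."""
        non_empty = [pool for pool in self.pools if pool]
        if not non_empty:
            return []
        batch: List[Transition] = []
        for _ in range(batch_size):
            pool = non_empty[self.rng.randrange(len(non_empty))]
            batch.append(pool[self.rng.randrange(len(pool))])
        return batch


# Usage: three UAVs each store one transition, then a shared batch is drawn.
pools = MultiExperiencePool(num_agents=3, capacity_per_pool=2)
for uav in range(3):
    r = shaped_reward(10.0, 9.0 - uav, collided=False, reached=False)
    pools.add(uav, ((float(uav),), (0.0,), r, (float(uav) + 1.0,), False))
batch = pools.sample(batch_size=8)
```

In a DDPG-style learner, a batch like this would feed the critic and actor updates. Sampling per pool rather than from one merged buffer keeps any single UAV's experience from crowding out the others, which is one plausible reading of the sample-efficiency motivation, not a confirmed description of the authors' design.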