Abstract:
In this study, we propose a distributed multi-experience-pool deep deterministic policy gradient (DMEP-DDPG) collision avoidance method to meet the demand for efficient autonomous collision avoidance in cooperative tasks of multiple unmanned aerial vehicles (multi-UAVs). The proposed method is based on data-driven reinforcement learning and enables each UAV to perform autonomous collision avoidance in a multi-UAV environment using only its own sensor data. To this end, we first design a system payoff mechanism based on a bootstrap reward function to address the sparse-reward problem of long-horizon reinforcement learning tasks. Because the low sample efficiency of a single experience pool makes policy convergence difficult, we then construct a novel distributed framework in which the deterministic policy gradient is updated from multiple experience pools. Finally, we evaluate the collision avoidance performance of DMEP-DDPG in multi-UAV collaborative missions and compare its performance metrics with those of other learning-based collision avoidance strategies. Our experimental results verify the feasibility and effectiveness of the DMEP-DDPG method.
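The abstract only outlines the distributed multi-experience-pool idea; as a rough illustration of the general mechanism, the sketch below shows how a DDPG-style learner might draw each mini-batch from several per-UAV experience pools rather than one shared replay buffer. This is our own minimal assumption of the concept, not the authors' implementation, and all names (e.g., MultiPoolReplay) are hypothetical.

```python
import random
from collections import deque

class MultiPoolReplay:
    """Hypothetical multi-experience-pool replay: one pool per UAV/worker.

    Illustrative sketch only; the paper's actual update scheme may differ.
    """

    def __init__(self, num_pools, capacity_per_pool):
        self.pools = [deque(maxlen=capacity_per_pool) for _ in range(num_pools)]

    def add(self, pool_id, transition):
        # transition = (state, action, reward, next_state, done)
        self.pools[pool_id].append(transition)

    def sample(self, batch_size):
        # Draw a roughly equal share from each non-empty pool so that
        # experience from every UAV contributes to each gradient update.
        nonempty = [p for p in self.pools if p]
        per_pool = max(1, batch_size // len(nonempty))
        batch = []
        for pool in nonempty:
            batch.extend(random.sample(pool, min(per_pool, len(pool))))
        random.shuffle(batch)
        return batch[:batch_size]

if __name__ == "__main__":
    replay = MultiPoolReplay(num_pools=4, capacity_per_pool=10_000)
    # Toy transitions standing in for (state, action, reward, next_state, done).
    for uav_id in range(4):
        for t in range(100):
            replay.add(uav_id, (t, 0.0, -1.0, t + 1, False))
    batch = replay.sample(64)
    print(len(batch))  # -> 64, mixing experience from all four pools
```

Under this assumed design, mixing transitions from every pool into each mini-batch is what counteracts the low sample efficiency of a single experience pool that the abstract identifies.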