Abstract:
Small objects in unmanned aerial vehicle (UAV) aerial imagery are prone to missed and false detections. To improve both the accuracy and the real-time performance of small-object detection, this paper proposes a full-link collaborative optimization detection algorithm based on YOLOv11s. The algorithm builds a collaborative module for multi-scale feature extraction and cross-scale feature fusion: the Multi-scale Edge Enhancement Module (MEESA) is embedded to strengthen the edge details and local features of small objects, and the Focused Small Target Feature Strengthening Pyramid (FSFP) module performs lightweight cross-scale feature interaction to compensate for the feature information lost between shallow and deep network layers. The feature interaction and prediction mechanisms of the network are also optimized. The conventional SPPF module is replaced with the AIFI-RepBN (adaptive interactive feature interaction-reparameterized batch normalization) module to improve the stability of feature interaction and the inference efficiency, and the Detect-LQE (detection-localization quality estimation) mechanism is introduced to correct the detection head's independent prediction of localization and classification, so that bounding-box localization and classification confidence scores are optimized jointly. In addition, an Inner-EIoU (inner-enhanced intersection over union) loss function is designed to refine bounding-box localization for small objects and accelerate model convergence. Experimental results on the VisDrone2019 dataset show that, compared with the original YOLOv11s baseline, the optimized algorithm improves mAP@0.5 and mAP@0.5:0.95 by 2.5% and 1.6%, respectively, while keeping a lightweight parameter size of 11×10⁶. The proposed algorithm effectively mitigates missed and false detections of small objects in UAV aerial images while balancing detection precision and inference speed, and it can satisfy the real-time detection requirements of practical application scenarios such as UAV-based traffic management and emergency rescue.
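
The abstract does not give the exact form of the Inner-EIoU loss, so the following is only a minimal sketch that assumes the formulations commonly found in the literature: the EIoU penalty terms (center distance, width difference, height difference, each normalized by the smallest enclosing box) combined with an IoU computed on auxiliary "inner" boxes that share each box's center and are scaled by a ratio. The function names, the `(x1, y1, x2, y2)` box layout, and the default `ratio=0.75` are illustrative assumptions, not values taken from the paper.

```python
# Illustrative sketch only: an Inner-EIoU-style loss for one pair of boxes,
# assuming the commonly published Inner-IoU and EIoU definitions.
# Boxes are (x1, y1, x2, y2) tuples; `ratio` is a hypothetical default.

def iou(a, b, eps=1e-9):
    """Plain intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + eps)


def scale_box(box, ratio):
    """Auxiliary box with the same center, width and height scaled by `ratio`."""
    cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    half_w = (box[2] - box[0]) * ratio / 2.0
    half_h = (box[3] - box[1]) * ratio / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def inner_eiou_loss(pred, gt, ratio=0.75, eps=1e-9):
    """1 - IoU(inner boxes) + EIoU center, width, and height penalties."""
    inner_iou = iou(scale_box(pred, ratio), scale_box(gt, ratio), eps)

    # Width and height of the smallest box enclosing the original boxes.
    enc_w = max(pred[2], gt[2]) - min(pred[0], gt[0])
    enc_h = max(pred[3], gt[3]) - min(pred[1], gt[1])

    # Squared center distance, normalized by the enclosing-box diagonal.
    dx = (pred[0] + pred[2]) / 2.0 - (gt[0] + gt[2]) / 2.0
    dy = (pred[1] + pred[3]) / 2.0 - (gt[1] + gt[3]) / 2.0
    center_term = (dx * dx + dy * dy) / (enc_w ** 2 + enc_h ** 2 + eps)

    # Squared width and height differences, normalized per axis.
    dw = (pred[2] - pred[0]) - (gt[2] - gt[0])
    dh = (pred[3] - pred[1]) - (gt[3] - gt[1])
    width_term = dw * dw / (enc_w ** 2 + eps)
    height_term = dh * dh / (enc_h ** 2 + eps)

    return 1.0 - inner_iou + center_term + width_term + height_term


if __name__ == "__main__":
    gt_box = (0.0, 0.0, 10.0, 10.0)
    for shift in (0.0, 2.0, 4.0):
        pred_box = (shift, 0.0, 10.0 + shift, 10.0)
        print(shift, inner_eiou_loss(pred_box, gt_box))
```

Under these assumptions, a ratio below 1 shrinks the auxiliary boxes, so their overlap falls off faster as the prediction drifts from the target. This is the usual motivation given for using inner boxes on high-overlap samples. With `ratio=1.0` the function reduces to the plain EIoU-style loss.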