Lightweight Object Detection Method Based on Context-Aware Fusion in Ball Sports Scenes

Abstract: In scenes with complex backgrounds, occlusion, scale variation, and fast motion make object detection difficult. To address these problems, this paper proposes an efficient Transformer-based object detection algorithm for complex ball sports scenes. First, a lightweight feature selection (LFS) module replaces blocks in the backbone network, balancing computational cost against detection accuracy. Second, a context-enhanced feature fusion structure, CEF-BiFPN, is designed. It builds on the strengths of the BiFPN architecture and introduces global-local spatial attention in place of the conventional convolutions used for channel transformation, enabling efficient information aggregation and scale alignment. Finally, the GIoU loss used by the baseline model is replaced with the WIoU v3 loss to improve detection accuracy and convergence speed. Experimental results show that, compared with the baseline RT-DETR, the improved model raises mean average precision (mAP) by 2.5% on the Basketball Detect dataset and by 2.1% on SportsMOT++, a modified and expanded version of the SportsMOT dataset, while reducing the parameter count by 33%, with marked improvements in model size and computational cost as well.
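To make the loss substitution concrete, the following is a minimal sketch of the two bounding-box losses named in the abstract, for a single pair of axis-aligned boxes. It is not the paper's implementation. The function names, the `(x1, y1, x2, y2)` box format, and the `mean_iou_loss` argument (standing in for the running mean of the IoU loss that a training loop would maintain) are assumptions made for illustration. The WIoU v3 form and the defaults `alpha=1.9`, `delta=3.0` follow the formulation as commonly stated for Wise-IoU; the abstract does not give the settings actually used.

```python
import math


def _iou_terms(a, b):
    """Return (IoU, union area, enclosing width, enclosing height).

    Boxes are (x1, y1, x2, y2) with positive area (an assumption of this sketch).
    """
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    enc_w = max(a[2], b[2]) - min(a[0], b[0])  # smallest enclosing box width
    enc_h = max(a[3], b[3]) - min(a[1], b[1])  # smallest enclosing box height
    return inter / union, union, enc_w, enc_h


def giou_loss(pred, target):
    """GIoU loss: 1 - (IoU - (|C| - |A u B|) / |C|), C the enclosing box."""
    iou, union, enc_w, enc_h = _iou_terms(pred, target)
    enclose = enc_w * enc_h
    return 1.0 - (iou - (enclose - union) / enclose)


def wiou_v3_loss(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """WIoU v3 sketch: focusing coefficient r times the WIoU v1 loss.

    mean_iou_loss is a hypothetical stand-in for the running mean of the
    IoU loss; in training, the enclosing-box term and beta carry no gradient.
    """
    iou, _, enc_w, enc_h = _iou_terms(pred, target)
    l_iou = 1.0 - iou
    # Squared distance between box centers.
    dx = (pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2
    r_wiou = math.exp((dx * dx + dy * dy) / (enc_w ** 2 + enc_h ** 2))
    beta = l_iou / mean_iou_loss  # outlier degree of this box
    r = beta / (delta * alpha ** (beta - delta))  # non-monotonic focusing
    return r * r_wiou * l_iou


if __name__ == "__main__":
    pred, target = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print("GIoU loss:", giou_loss(pred, target))
    print("WIoU v3 loss:", wiou_v3_loss(pred, target, mean_iou_loss=0.5))
```

In this sketch the focusing coefficient `r` equals 1 when `beta` equals `delta` and shrinks for both very well-fitted and very poorly fitted boxes, which is the behaviour usually credited for the accuracy and convergence gains the abstract attributes to WIoU v3.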

     
