Lightweight Spatial Location Attention Module and Its Application to Image Classification Networks

Abstract: A lightweight spatial location attention module (SLAM) is proposed to address a shortcoming of previous attention methods: they overlook the importance of spatial location information for cross-dimensional interactions. Through a three-branch structure, the module computes location-aware attention weights for the input feature map along the horizontal, vertical, and channel directions, then aggregates features along these three directions to obtain an attention-weighted feature map, adaptively adjusting the spatial and positional attention weights of the feature map. The ResNet18, ResNet50, and MobileNetV2 networks are improved with this module, and extensive experiments are conducted on image classification tasks. The results show that SLAM considerably improves model performance and outperforms other attention methods. On the ImageNet-1K and Stanford-Cars classification tasks, the Top-1 accuracy of the SLAM-enhanced ResNet18, ResNet50, and MobileNetV2 networks improves by up to 2.62% and 2.4%, respectively. In a scrap steel rating task, the YOLOv5s and YOLOv8s networks enhanced with SLAM outperform the same networks improved with the convolutional block attention module (CBAM) and coordinate attention (CA) on all four metrics: recall, F1 score, mAP@0.5:0.95, and mAP@0.5.
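The abstract only outlines the three-branch design, so the following is a minimal PyTorch sketch of what such a module might look like, not the authors' implementation. Everything here is an assumption: the module name SLAMSketch is hypothetical, and the use of average pooling per direction, 1x1 convolution bottlenecks with a reduction ratio, sigmoid gating, and multiplicative aggregation of the three directional weights are all design choices filled in for illustration.

```python
import torch
import torch.nn as nn


class SLAMSketch(nn.Module):
    """Hypothetical three-branch spatial location attention.

    Branch 1 pools over width  -> weights indexed by (channel, height).
    Branch 2 pools over height -> weights indexed by (channel, width).
    Branch 3 pools globally    -> weights indexed by channel only.
    All layer choices are assumptions, not the paper's code.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        mid = max(channels // reduction, 8)
        # One 1x1-conv bottleneck per branch (assumed design).
        self.conv_h = nn.Sequential(
            nn.Conv2d(channels, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1))
        self.conv_w = nn.Sequential(
            nn.Conv2d(channels, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1))
        self.conv_c = nn.Sequential(
            nn.Conv2d(channels, mid, 1), nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Horizontal branch: average over W, keep per-row positions. (N, C, H, 1)
        a_h = torch.sigmoid(self.conv_h(x.mean(dim=3, keepdim=True)))
        # Vertical branch: average over H, keep per-column positions. (N, C, 1, W)
        a_w = torch.sigmoid(self.conv_w(x.mean(dim=2, keepdim=True)))
        # Channel branch: global average pool. (N, C, 1, 1)
        a_c = torch.sigmoid(self.conv_c(x.mean(dim=(2, 3), keepdim=True)))
        # Aggregate the three directional weights and rescale the input.
        return x * a_h * a_w * a_c


# Usage sketch: rescale a feature map from a 64-channel stage.
x = torch.randn(2, 64, 56, 56)
print(SLAMSketch(64)(x).shape)  # torch.Size([2, 64, 56, 56])
```

In a backbone such as ResNet18, a module like this would typically be appended after each residual block, with `channels` set to that block's output width; the paper's actual insertion points are not stated in the abstract.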

     
