Citation: FU Shuangjie, CHEN Wei, YIN Zhong. Semantic Segmentation Algorithm Combining Self-attention and Feature Adaptive Fusion[J]. INFORMATION AND CONTROL, 2022, 51(6): 680-687, 698. DOI: 10.13976/j.cnki.xk.2022.1584

Semantic Segmentation Algorithm Combining Self-attention and Feature Adaptive Fusion

  • Abstract: Semantic segmentation of scene images must handle targets at multiple scales, and feature extraction networks often fail to capture global context information. To address these problems, we design a dual-path segmentation algorithm that embeds an improved self-attention mechanism and adaptively fuses multi-scale features. In the spatial path, a simple two-branch downsampling module performs 4× downsampling to extract high-resolution edge and detail information, so that the network segments object boundaries more precisely. In the semantic path, a context capture module and an adaptive feature fusion module supply the decoding stage with rich multi-scale, high-level semantic context information, and a class balancing strategy further improves the segmentation results. Experiments show that the model achieves a mean intersection over union (MIOU) of 59.4% on the Camvid dataset and 60.1% on the Aeroscapes dataset, indicating good segmentation performance.
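For readers unfamiliar with the MIOU metric reported in the abstract, the following is a minimal sketch of how it is commonly computed from a confusion matrix. It is not the authors' evaluation code: the function name `mean_iou`, the flat-list input format, and the choice to skip classes that appear in neither the prediction nor the ground truth are all assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union from flat lists of class indices.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # confusion[t][p] counts pixels with true class t predicted as class p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                           # correctly labelled pixels
        fn = sum(confusion[c]) - tp                    # class c pixels that were missed
        fp = sum(row[c] for row in confusion) - tp     # pixels wrongly labelled c
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both inputs
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 6 pixels, 3 classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"MIOU = {score:.3f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. Published benchmarks may differ in details such as ignored labels or whether the confusion matrix is accumulated over the whole dataset before averaging.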

     
