Weakly Supervised Semantic Segmentation Network Based on Extended Patch Pairs
-
Abstract
In weakly supervised semantic segmentation, class activation maps (CAMs) often correlate poorly with object seeds and cover target objects only partially. To address these defects, we propose a weakly supervised semantic segmentation network based on extended patch pairs. First, we introduce the concept of extended patch pairs and show, using information theory, that the total self-information of CAMs obtained from extended patch pairs exceeds that of standard CAMs, which yields a higher correlation with object seeds. Second, we introduce a higher-lower feature self-attention combination module that enhances low-level features and CAMs through self-attention and combines them to refine the CAMs pixel by pixel. Finally, we design a triple network architecture that takes the original image and its extended patch pairs as inputs. By narrowing the gap between the CAM of the original image and that of the extended patch pair, the network achieves higher segmentation accuracy. On the Pascal VOC 2012 validation and test sets, the network achieves mean intersection over union (mIoU) scores of 72.1% and 73.0%, respectively, outperforming current mainstream image-level weakly supervised semantic segmentation methods.
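For readers unfamiliar with the reported metric, the following is a minimal sketch of how mIoU can be computed from a predicted label map and a ground-truth label map. It is an illustration only, not the evaluation code used in this work. The function name `mean_iou` and its parameters are hypothetical. The sketch assumes integer class labels in the range [0, num_classes) and the Pascal VOC convention of marking ignored pixels with the label 255.

```python
import numpy as np


def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or ground truth.

    Hypothetical helper for illustration; labels are assumed to be integers
    in [0, num_classes), with `ignore_index` marking pixels to skip.
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    # Drop pixels that the ground truth marks as "ignore".
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)

    intersection = np.diag(conf)
    # Union = predicted pixels + ground-truth pixels - correctly labeled pixels.
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection

    # Average only over classes that actually appear, to avoid dividing by zero.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


# Toy 2x2 example with two classes; the bottom-right pixel is ignored.
pred = [[0, 1], [1, 1]]
target = [[0, 1], [0, 255]]
score = mean_iou(pred, target, num_classes=2)
```

In this toy example each class has one correctly labeled pixel and a union of two pixels, so the score is 0.5. Benchmark evaluation on Pascal VOC 2012 uses 21 classes (20 object classes plus background) and typically accumulates the confusion matrix over the whole dataset before averaging, rather than averaging per-image scores as a single call to this sketch would.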