A Self-Supervised Pre-training Method for Heart Rate Signal Measurement Based on MAE and Mamba
-
Graphical Abstract
-
Abstract
To address the high noise and costly label acquisition that affect long remote photoplethysmography (rPPG) feature sequences, this paper proposes a self-supervised pre-training method for heart rate signal measurement based on Masked AutoEncoders (MAE) and Mamba. First, to suppress noise, average pooling is applied to the video to generate a Spatio-Temporal Map (STMap). Second, to reduce the dependence on labels, the self-supervised MAE method masks and reconstructs the STMap, extracting the self-similar prior information inherent in the signals. During feature extraction, Mamba's strength in handling long sequences is exploited: it selectively remembers or ignores input content, filtering out short-term disturbances while retaining long-term periodic information, which improves the model's robustness to interference. Compared with the Transformer-based method, it also has fewer parameters and faster inference. In terms of model design, because each row of the STMap is the time series of average-pooled values for one facial region, a Dual-Path Attention Module (DPAM) is proposed to strengthen feature extraction across channels and facial regions. Results on two public datasets show that, compared with the Transformer-based self-supervised method, the Mean Absolute Error (MeanAE) of the proposed method decreases by 17.65% on the UBFC dataset and by 17.50% on the PURE dataset, while the number of parameters is reduced by 43%.
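The first two steps of the abstract, building an STMap by average pooling and masking it for MAE-style reconstruction, can be illustrated with a minimal NumPy sketch. Several choices here are assumptions for illustration and are not stated in the abstract: the function names `build_stmap` and `random_mask_tokens`, the 5×5 grid of facial regions, the patch length of 8 frames, the 75% mask ratio, and the use of per-region temporal segments as tokens. Face detection and alignment, the Mamba encoder, and the DPAM are omitted.

```python
import numpy as np


def build_stmap(video, grid=(5, 5)):
    """Average-pool each frame over a grid of regions (hypothetical helper).

    video: array of shape (T, H, W, C), assumed to be an aligned face crop.
    Returns an STMap of shape (R, T, C) with R = grid[0] * grid[1], so each
    row is the time series of one region's mean pixel value.
    """
    t, h, w, c = video.shape
    gh, gw = grid
    # Crop so the frame divides evenly into the grid (assumption).
    hh, ww = (h // gh) * gh, (w // gw) * gw
    blocks = video[:, :hh, :ww, :].reshape(t, gh, hh // gh, gw, ww // gw, c)
    # Average pooling over the pixels inside each region.
    pooled = blocks.mean(axis=(2, 4))                  # (T, gh, gw, C)
    return pooled.reshape(t, gh * gw, c).transpose(1, 0, 2)


def random_mask_tokens(stmap, patch_len=8, mask_ratio=0.75, rng=None):
    """Split each STMap row into temporal patches and hide a random subset.

    Returns (visible_tokens, mask, all_tokens); mask[i] is True where
    token i is hidden and would have to be reconstructed by a decoder.
    """
    rng = np.random.default_rng() if rng is None else rng
    r, t, c = stmap.shape
    n = t // patch_len                                 # patches per region
    tokens = stmap[:, : n * patch_len, :].reshape(r * n, patch_len * c)
    num_masked = int(mask_ratio * tokens.shape[0])
    mask = np.zeros(tokens.shape[0], dtype=bool)
    mask[rng.permutation(tokens.shape[0])[:num_masked]] = True
    return tokens[~mask], mask, tokens


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.random((64, 100, 100, 3))              # synthetic stand-in for a face video
    stmap = build_stmap(video)
    visible, mask, tokens = random_mask_tokens(stmap, rng=rng)
    print(stmap.shape, visible.shape, int(mask.sum()))
```

In a full pipeline the encoder would see only the visible tokens, and the reconstruction loss would be computed on the masked ones; how the paper actually partitions the STMap into patches is not specified in the abstract.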
-