Poster
D2SP: Dynamic Dual-Stage Purification Framework for Dual Noise Mitigation in Vision-based Affective Recognition.
Haoran Wang · Xinji Mai · Zeng Tao · Xuan Tong · Junxiong Lin · Yan Wang · Jiawen Yu · Shaoqi Yan · Ziheng Zhou · Wenqiang Zhang
Recent advances in Dynamic Facial Expression Recognition (DFER) focus mainly on better capturing the spatial and temporal features of facial expressions. However, DFER datasets contain a substantial number of noisy samples, and few works address how to handle this noise. We identify two types of noise: low-quality data caused by factors such as occlusion, dim lighting, and blur, and mislabeled data caused by annotator bias. To address both, we propose the Dynamic Dual-Stage Purification (D2SP) framework, which dynamically purifies DFER datasets of these two types of noise so that only high-quality, correctly labeled data is used in training. To mitigate low-quality samples, the Coarse-Grained Pruning (CGP) stage computes sample weights and prunes low-weight samples. After CGP, the Fine-Grained Correction (FGC) stage evaluates prediction stability to correct mislabeled data. D2SP is designed as a general, plug-and-play framework that integrates with existing DFER methods. Extensive experiments on prevalent DFER datasets with multiple benchmark methods show that D2SP significantly improves their performance.
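The two stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how sample weights are computed or how prediction stability is measured, so the weights are taken as given, pruning keeps a fixed top fraction (`keep_ratio`), and a sample counts as "stable" when a fraction `min_agreement` of its recent predictions agree. The function names and both parameters are hypothetical.

```python
from collections import Counter


def coarse_grained_prune(weights, keep_ratio=0.5):
    """Return sorted indices of the highest-weight samples.

    Assumption: pruning keeps a fixed top fraction of samples; the
    abstract only says that low-weight samples are pruned.
    """
    n_keep = max(1, round(len(weights) * keep_ratio))
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:n_keep])


def fine_grained_correct(labels, pred_history, min_agreement=1.0):
    """Relabel samples whose recent predictions are stable but disagree
    with the given label.

    Assumption: stability is the share of recent predictions that agree
    on one class; the abstract does not define the stability measure.
    """
    corrected = list(labels)
    for i, preds in enumerate(pred_history):
        top_class, count = Counter(preds).most_common(1)[0]
        stable = count / len(preds) >= min_agreement
        if stable and top_class != labels[i]:
            corrected[i] = top_class
    return corrected


# Toy data: six clips with per-sample weights, labels, and the classes
# predicted for each clip over the last three epochs.
weights = [0.9, 0.1, 0.8, 0.7, 0.05, 0.2]
labels = [0, 1, 2, 1, 0, 2]
pred_history = [[0, 0, 0], [1, 2, 1], [1, 1, 1], [1, 1, 1], [2, 0, 1], [2, 2, 0]]

# Stage 1: drop low-weight (presumed low-quality) samples.
kept = coarse_grained_prune(weights, keep_ratio=0.5)

# Stage 2: correct labels only among the samples that survived pruning.
kept_labels = [labels[i] for i in kept]
kept_history = [pred_history[i] for i in kept]
new_labels = fine_grained_correct(kept_labels, kept_history)
```

In this toy run, samples 0, 2, and 3 survive pruning, and sample 2 is relabeled from class 2 to class 1 because every recent prediction agrees on class 1. Running correction after pruning mirrors the order described above: labels are only revised for samples already judged to be of usable quality.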