

Poster

SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model

Zhenglin Huang · Jinwei Hu · Yiwei He · Xiangtai Li · Xiaowei Huang · Bei Peng · Xingyu Zhao · Baoyuan Wu · Guangliang Cheng

Sun 15 Jun 2 p.m. PDT — 4 p.m. PDT

Abstract: The rapid advancement of generative models in creating highly realistic images poses substantial risks for misinformation dissemination. For instance, a synthetic image shared on social media can mislead large audiences and erode trust in digital content, with severe repercussions. Despite some progress, academia has not yet created a large and diverse deepfake detection dataset for social media, nor has it devised an effective solution to this problem. In this paper, we introduce the $\textbf{S}$ocial media $\textbf{I}$mage $\textbf{D}$etection data$\textbf{Set}$ (SID-Set), which offers three key advantages: (1) $\textbf{extensive volume}$, featuring 300K AI-generated/tampered and authentic images with comprehensive annotations, (2) $\textbf{broad diversity}$, encompassing fully synthetic and tampered images across various classes, and (3) $\textbf{elevated realism}$, with images that are predominantly indistinguishable from genuine ones through visual inspection alone. Furthermore, leveraging the capabilities of large multimodal models, we propose a new image deepfake detection, localization, and explanation framework named SIDA ($\textbf{S}$ocial media $\textbf{I}$mage $\textbf{D}$etection, localization, and explanation $\textbf{A}$ssistant). SIDA not only discerns the authenticity of images, but also delineates tampered regions through mask prediction and provides textual explanations of the model's judgment criteria. Extensive experiments on SID-Set and other benchmarks demonstrate that SIDA outperforms state-of-the-art deepfake detection models across diverse settings. The code, model, and dataset will be released.
