Abstract
Deepfake medical images pose significant risks to clinical diagnosis and treatment planning owing to their high realism. Although many deepfake detection methods exist, their high computational cost limits practical clinical deployment. This paper proposes a lightweight, efficient deepfake detection framework that combines self-supervised contrastive learning with an attention-enhanced convolutional neural network. The method uses a modified MobileNetV2 architecture integrated with Efficient Channel Attention (ECA) modules to enrich feature representations at minimal computational overhead. We employ a two-stage training strategy: self-supervised pre-training on unlabeled CT scans to learn robust features, followed by supervised fine-tuning for the final classification task. Trained from scratch, the approach achieves 98.39% accuracy. When ImageNet pre-trained weights are used to initialize the self-supervised pre-training stage, performance improves markedly, reaching 99.87% accuracy and 100% specificity on the test set. These results surpass the evaluated baselines and demonstrate the benefit of combining general-purpose ImageNet pre-training with domain-specific self-supervised learning. The lightweight ECA-enhanced MobileNetV2 makes the framework a practical solution for resource-constrained clinical environments.
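To make the ECA mechanism mentioned in the abstract concrete, the following is a minimal NumPy sketch of Efficient Channel Attention applied to a single feature map. This is not the authors' implementation: the 1-D cross-channel kernel is fixed here for illustration (in ECA-Net it is a learned convolution), and the adaptive kernel-size rule follows the original ECA-Net formulation.

```python
import numpy as np

def eca_kernel_size(channels, gamma=2, b=1):
    # Adaptive kernel size from ECA-Net: k = |log2(C)/gamma + b/gamma|,
    # rounded up to the nearest odd integer.
    t = int(abs((np.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1

def eca_attention(feature_map, gamma=2, b=1):
    """Apply Efficient Channel Attention to a (C, H, W) feature map."""
    c, h, w = feature_map.shape
    k = eca_kernel_size(c, gamma, b)
    # 1. Global average pooling over spatial dims -> one descriptor per channel.
    y = feature_map.mean(axis=(1, 2))                    # shape (C,)
    # 2. 1-D convolution across channels, capturing local cross-channel
    #    interaction without any dimensionality reduction.
    pad = k // 2
    y_padded = np.pad(y, pad, mode="edge")
    kernel = np.ones(k) / k                              # fixed for illustration; learned in practice
    attn = np.convolve(y_padded, kernel, mode="valid")   # shape (C,)
    # 3. Sigmoid gate, then rescale each input channel by its attention weight.
    attn = 1.0 / (1.0 + np.exp(-attn))
    return feature_map * attn[:, None, None]
```

In the framework described above, such a module would be inserted after convolutional blocks of MobileNetV2; because the attention involves only pooling and a small 1-D convolution, the added cost per block is negligible, which is what preserves the model's lightweight profile.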
