MetaCloak-JPEG: JPEG-Robust Adversarial Perturbation for Preventing Unauthorized DreamBooth-Based Deepfake Generation
2026-04-20 • Computer Vision and Pattern Recognition
Computer Vision and Pattern Recognition
AI summaryⓘ
The authors address a problem where attackers can create deepfake images using only a few public photos, which current protection methods struggle to stop because they don't consider how social media compresses images using JPEG. They introduce MetaCloak-JPEG, a new technique that models JPEG compression in a way that helps protect images better against these attacks. By simulating how JPEG affects images during training, their method keeps protective changes effective even after compression, outperforming previous defenses. This makes it harder for attackers to use small numbers of photos to make harmful fake images.
DeepfakeDiffusion ModelDreamBoothJPEG CompressionAdversarial PerturbationDifferentiable JPEGStraight-Through EstimatorMeta-learningPSNRImage Protection
Authors
Tanjim Rahaman Fardin, S M Zunaid Alam, Mahadi Hasan Fahim, Md Faysal Mahfuz
Abstract
The rapid progress of subject-driven text-to-image synthesis, and in particular DreamBooth, has enabled a consent-free deepfake pipeline: an adversary needs only 4-8 publicly available face images to fine-tune a personalized diffusion model and produce photorealistic harmful content. Current adversarial face-protection systems -- PhotoGuard, Anti-DreamBooth, and MetaCloak -- perturb user images to disrupt surrogate fine-tuning, but all share a structural blindness: none backpropagates gradients through the JPEG compression pipeline that every major social-media platform applies before adversary access. Because JPEG quantization relies on round(), whose derivative is zero almost everywhere, adversarial energy concentrates in high-frequency DCT bands that JPEG discards, eliminating 60-80% of the protective signal. We introduce MetaCloak-JPEG, which closes this gap by inserting a Differentiable JPEG (DiffJPEG) layer built on the Straight-Through Estimator (STE): the forward pass applies standard JPEG compression, while the backward pass replaces round() with the identity. DiffJPEG is embedded in a JPEG-aware EOT distribution (~70% of augmentations include DiffJPEG) and a curriculum quality-factor schedule (QF: 95 to 50) inside a bilevel meta-learning loop. Under an l-inf perturbation budget of eps=8/255, MetaCloak-JPEG attains 32.7 dB PSNR, a 91.3% JPEG survival rate, and outperforms PhotoGuard on all 9 evaluated JPEG quality factors (9/9 wins, mean denoising-loss gain +0.125) within a 4.1 GB training-memory budget.