Visual Quality Improved Watermarking based on Dual-Reference Loss for Deepfake Attribution
Caldelli R.
2025-01-01
Abstract
The rapid advancement of image generation models like Stable Diffusion raises concerns about potential misuse, making robust watermarking techniques essential for the authentication and attribution of synthetic content, particularly in combating deepfakes. However, simultaneously ensuring high-quality image generation and accurate watermark extraction remains challenging. Through an analysis of existing methods, we identify a critical limitation: their loss functions often adopt a single reference (either the input image or the clean-generated image) for optimizing image fidelity, leading to suboptimal performance. In this paper, we conduct an in-depth study of the image-quality loss term in diffusion-based watermarking. By analyzing the distinct impacts of using the input image versus the clean-generated image as references during optimization, we reveal that jointly considering both references significantly improves robustness and visual quality. Extensive experiments demonstrate that our dual-reference approach achieves superior performance in both watermark extraction accuracy and generation fidelity compared to single-reference baselines. We advocate for this paradigm to advance reliable watermarking in generative models.
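To make the dual-reference idea in the abstract concrete, the following is a minimal sketch of an image-quality loss term that measures the watermarked output against both references at once. The abstract does not give the paper's actual formulation, so everything here is an assumption made for illustration: the names `mse` and `dual_reference_loss`, the use of mean squared error as the fidelity measure, the convex weighting by a hypothetical parameter `alpha`, and the representation of images as flat lists of pixel values.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("images must be non-empty and have the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def dual_reference_loss(
    watermarked: Sequence[float],
    input_image: Sequence[float],
    clean_generated: Sequence[float],
    alpha: float = 0.5,  # hypothetical weight, not taken from the paper
) -> float:
    """Illustrative image-quality term using two references.

    alpha = 1.0 reduces to a single-reference loss against the input image;
    alpha = 0.0 reduces to a single-reference loss against the
    clean-generated (unwatermarked) output.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    loss_input = mse(watermarked, input_image)
    loss_clean = mse(watermarked, clean_generated)
    return alpha * loss_input + (1.0 - alpha) * loss_clean


# Toy two-pixel example: the watermarked output matches the clean-generated
# image exactly but differs from the input image in one pixel.
watermarked = [0.0, 1.0]
input_image = [0.0, 0.0]
clean_generated = [0.0, 1.0]
loss = dual_reference_loss(watermarked, input_image, clean_generated, alpha=0.5)
```

In this sketch the two single-reference baselines mentioned in the abstract correspond to the endpoints `alpha = 1.0` and `alpha = 0.0`; in a full training objective this term would be combined with a watermark-extraction loss, which is omitted here.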

