LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection
Luo, Yunpeng, Du, Junlong, Yan, Ke, Ding, Shouhong
The evolution of Diffusion Models has dramatically improved image generation quality, making it increasingly difficult to differentiate between real and generated images. This development, while impressive, also raises significant privacy and security concerns. In response, we propose a novel Latent REconstruction error guided feature REfinement method (LaRE^2) for detecting diffusion-generated images. We introduce the Latent Reconstruction Error (LaRE), the first reconstruction-error-based feature computed in the latent space for generated-image detection. LaRE surpasses existing methods in feature extraction efficiency while preserving the crucial cues needed to differentiate real from fake. To exploit LaRE, we propose an Error-Guided feature REfinement module (EGRE), which refines the image feature under the guidance of LaRE to enhance its discriminativeness. EGRE uses an align-then-refine mechanism that refines the image feature for generated-image detection from both spatial and channel perspectives. Extensive experiments on the large-scale GenImage benchmark demonstrate the superiority of LaRE^2, which surpasses the best SoTA method by up to 11.9%/12.1% average ACC/AP across 8 different image generators. LaRE also reduces feature extraction cost, delivering an 8x speed-up over existing methods.
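To make the idea concrete, below is a minimal sketch of a latent reconstruction error, assuming a Stable-Diffusion-style VAE, UNet, and scheduler from the Hugging Face diffusers library. The function name compute_lare and the use of a one-step noise-prediction error as the reconstruction cue are illustrative assumptions, not the authors' implementation.

    # Sketch only: LaRE-style latent reconstruction error (hypothetical compute_lare).
    # Models can be loaded, e.g., from the "vae"/"unet"/"scheduler" subfolders of a
    # Stable Diffusion checkpoint via diffusers' from_pretrained.
    import torch
    from diffusers import AutoencoderKL, UNet2DConditionModel, DDIMScheduler

    @torch.no_grad()
    def compute_lare(image, vae, unet, scheduler, text_emb, t=500):
        """image: (B, 3, H, W) in [-1, 1]; returns a (B, 1, h, w) latent error map."""
        # Encode to the latent space and apply the VAE scaling factor.
        latent = vae.encode(image).latent_dist.mean * vae.config.scaling_factor
        # Perturb the latent at diffusion timestep t.
        noise = torch.randn_like(latent)
        timesteps = torch.full((latent.shape[0],), t, device=latent.device, dtype=torch.long)
        noisy = scheduler.add_noise(latent, noise, timesteps)
        # One denoising step: predict the injected noise.
        pred_noise = unet(noisy, timesteps, encoder_hidden_states=text_emb).sample
        # The gap between injected and predicted noise serves as the reconstruction
        # error; generated images tend to be reconstructed more faithfully than real ones.
        return (pred_noise - noise).pow(2).mean(dim=1, keepdim=True)

Because only a single denoising step is needed per image rather than a full sampling trajectory, a feature of this form is far cheaper to extract than multi-step reconstruction baselines.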
Parameter-Efficient Fine-Tuning for Pre-Trained Vision Models: A Survey
Xin, Yi, Luo, Siqi, Zhou, Haodi, Du, Junlong, Liu, Xiaohong, Fan, Yue, Li, Qing, Du, Yuntao
Large-scale pre-trained vision models (PVMs) have shown great potential for adaptability across various downstream vision tasks. However, with state-of-the-art PVMs growing to billions or even trillions of parameters, the standard full fine-tuning paradigm is becoming unsustainable due to high computational and storage demands. In response, researchers are exploring parameter-efficient fine-tuning (PEFT), which seeks to exceed the performance of full fine-tuning with minimal parameter modifications. This survey provides a comprehensive overview of visual PEFT and its future directions, offering a systematic review of the latest advancements.
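As a concrete example of the PEFT idea the survey covers, the following is a minimal LoRA-style adapter sketch in PyTorch; LoRALinear and its default hyperparameters are hypothetical choices for illustration, not code from the survey. The pre-trained weights stay frozen and only the low-rank factors are trained, cutting trainable parameters from in_features x out_features to r x (in_features + out_features).

    # Sketch only: a LoRA-style low-rank adapter wrapping a frozen linear layer.
    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Computes base(x) + (alpha / r) * x @ A^T @ B^T with frozen base weights."""
        def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False  # pre-trained knowledge is preserved
            # Low-rank factors: A is small-random, B is zero, so training starts
            # from the unmodified pre-trained model.
            self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
            self.scale = alpha / r

        def forward(self, x):
            return self.base(x) + self.scale * ((x @ self.lora_a.T) @ self.lora_b.T)

    # Usage: wrap, e.g., the projection layers of a pre-trained ViT and fine-tune
    # only the parameters with requires_grad=True.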