Liang, Chumeng
Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization
Fan, Jiajun, Shen, Shuaike, Cheng, Chaoran, Chen, Yuxin, Liang, Chumeng, Liu, Ge
Recent advancements in reinforcement learning (RL) have achieved great success in fine-tuning diffusion-based generative models. However, fine-tuning continuous flow-based generative models to align with arbitrary user-defined reward functions remains challenging, particularly due to issues such as policy collapse from overoptimization and the prohibitively high computational cost of likelihoods in continuous-time flows. In this paper, we propose an easy-to-use and theoretically sound RL fine-tuning method, which we term Online Reward-Weighted Conditional Flow Matching with Wasserstein-2 Regularization (ORW-CFM-W2). Our method integrates RL into the flow matching framework to fine-tune generative models with arbitrary reward functions, without relying on gradients of rewards or filtered datasets. By introducing an online reward-weighting mechanism, our approach guides the model to prioritize high-reward regions of the data manifold. To prevent policy collapse and maintain diversity, we incorporate Wasserstein-2 (W2) distance regularization into our method and derive a tractable upper bound for it in flow matching, effectively balancing exploration and exploitation in policy optimization. We provide theoretical analyses of the convergence properties and induced data distributions of our method, establishing connections with traditional RL algorithms featuring Kullback-Leibler (KL) regularization and offering a more comprehensive understanding of the underlying mechanisms and learning behavior of our approach. Extensive experiments on tasks including target image generation, image compression, and text-image alignment demonstrate the effectiveness of our method, which achieves optimal policy convergence while allowing controllable trade-offs between reward maximization and diversity preservation.
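To make the training objective concrete, the sketch below shows one plausible form of a reward-weighted conditional flow matching loss with a velocity-space penalty standing in for the Wasserstein-2 regularizer. It is an illustrative reading of the abstract, not the paper's exact objective: the softmax reward weighting, the hyperparameter names (beta, lam), and the specific W2 surrogate are all assumptions.

import torch

def orw_cfm_w2_loss(v_theta, v_ref, x1, reward, beta=1.0, lam=0.1):
    """Hedged sketch of a reward-weighted conditional flow matching loss
    with a velocity-space Wasserstein-2-style penalty.

    v_theta, v_ref: callables (x_t, t) -> velocity; v_ref is the frozen
    pre-trained model. x1: a batch of samples drawn online from the current
    model. reward: per-sample rewards r(x1). beta, lam: illustrative
    temperature / regularization weights (names are assumptions).
    """
    b = x1.shape[0]
    t = torch.rand(b, device=x1.device)                       # t ~ U(0, 1)
    x0 = torch.randn_like(x1)                                  # noise endpoint
    t_ = t.view(-1, *([1] * (x1.dim() - 1)))
    xt = (1 - t_) * x0 + t_ * x1                               # linear interpolant
    u_target = x1 - x0                                         # conditional target velocity

    w = torch.softmax(reward / beta, dim=0) * b                # online reward weights (one choice)
    v_pred = v_theta(xt, t)
    cfm = (w.view(-1, *([1] * (x1.dim() - 1))) *
           (v_pred - u_target) ** 2).mean()                    # reward-weighted CFM term

    with torch.no_grad():
        v_old = v_ref(xt, t)                                   # reference velocity field
    w2_penalty = ((v_pred - v_old) ** 2).mean()                # tractable surrogate for W2 regularization

    return cfm + lam * w2_penalty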
AMM: Adaptive Modularized Reinforcement Model for Multi-city Traffic Signal Control
Huang, Zherui, Liu, Yicheng, Liang, Chumeng, Zheng, Guanjie
Traffic signal control (TSC) is an important and widely studied research direction. Recently, reinforcement learning (RL) methods have been used to solve TSC problems and achieve superior performance over conventional TSC methods. However, applying RL methods to the real world is challenging due to the huge cost of experiments in real-world traffic environments. One possible solution is TSC domain adaptation, which adapts trained models to target environments and reduces the number of interactions and the training cost. However, existing TSC domain adaptation methods still face two major issues: the lack of consideration for differences across cities and the low utilization of multi-city data. To solve the aforementioned issues, we propose an approach named Adaptive Modularized Model (AMM). By modularizing TSC problems and network models, we overcome the challenge of possible changes in environmental observations. We also aggregate multi-city experience through meta-learning. We conduct extensive experiments on different cities and show that AMM can achieve excellent performance with limited interactions in target environments and outperform existing methods. We also demonstrate the feasibility and generalizability of our method.
Real-World Benchmarks Make Membership Inference Attacks Fail on Diffusion Models
Liang, Chumeng, You, Jiaxuan
Membership inference attacks (MIAs) on diffusion models have emerged as potential evidence of unauthorized data usage in training pre-trained diffusion models. These attacks aim to detect the presence of specific images in the training datasets of diffusion models. Our study delves into the evaluation of state-of-the-art MIAs on diffusion models and reveals critical flaws and overly optimistic performance estimates in existing MIA evaluation. We introduce CopyMark, a more realistic MIA benchmark that distinguishes itself through support for pre-trained diffusion models, unbiased datasets, and fair evaluation pipelines. Through extensive experiments, we demonstrate that the effectiveness of current MIA methods significantly degrades under these more practical conditions. Based on our results, we caution that MIAs, in their current state, are not a reliable approach for identifying unauthorized data usage in pre-trained diffusion models. To the best of our knowledge, we are the first to discover the performance overestimation of MIAs on diffusion models and present a unified benchmark for more realistic evaluation. Our code is available on GitHub: \url{https://github.com/caradryanl/CopyMark}.
CGI-DM: Digital Copyright Authentication for Diffusion Models via Contrasting Gradient Inversion
Wu, Xiaoyu, Hua, Yang, Liang, Chumeng, Zhang, Jiaru, Wang, Hao, Song, Tao, Guan, Haibing
Diffusion Models (DMs) have evolved into advanced image generation tools, especially for few-shot generation where a pretrained model is fine-tuned on a small set of images to capture a specific style or object. Despite their success, concerns exist about potential copyright violations stemming from the use of unauthorized data in this process. In response, we present Contrasting Gradient Inversion for Diffusion Models (CGI-DM), a novel method featuring vivid visual representations for digital copyright authentication. Our approach involves removing partial information of an image and recovering missing details by exploiting conceptual differences between the pretrained and fine-tuned models. We formulate the differences as KL divergence between latent variables of the two models when given the same input image, which can be maximized through Monte Carlo sampling and Projected Gradient Descent (PGD). The similarity between original and recovered images serves as a strong indicator of potential infringements. Extensive experiments on the WikiArt and Dreambooth datasets demonstrate the high accuracy of CGI-DM in digital copyright authentication, surpassing alternative validation techniques. Code implementation is available at https://github.com/Nicholas0228/Revelio.
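As a hedged illustration of the recovery step described above, the following sketch runs projected gradient descent on a partially degraded image so that the fine-tuned and pretrained models disagree as much as possible, approximating the KL objective by a squared difference of predicted noises. The simplification of the KL term, the Monte Carlo timestep sampling, and all hyperparameter names are assumptions for illustration, not the paper's exact procedure.

import torch

def cgi_dm_recover(x_masked, eps_pretrained, eps_finetuned, timesteps,
                   steps=50, alpha=0.01, eps_budget=0.1):
    """Hedged PGD sketch in the spirit of CGI-DM: perturb a partially degraded
    image so that the pretrained and fine-tuned diffusion models disagree as
    much as possible, approximating the KL objective by a squared difference
    of predicted noises (an assumption, not the paper's exact derivation).

    eps_pretrained / eps_finetuned: callables (x, t) -> predicted noise.
    timesteps: tensor of candidate diffusion timesteps.
    Hyperparameter names (steps, alpha, eps_budget) are illustrative.
    """
    x = x_masked.clone()
    x_orig = x_masked.clone()
    for _ in range(steps):
        x = x.detach().requires_grad_(True)
        t = timesteps[int(torch.randint(len(timesteps), (1,)))]   # Monte Carlo over timesteps
        gap = ((eps_finetuned(x, t) - eps_pretrained(x, t)) ** 2).mean()
        grad, = torch.autograd.grad(gap, x)
        with torch.no_grad():
            x = x + alpha * grad.sign()                            # ascend the model gap
            x = x_orig + (x - x_orig).clamp(-eps_budget, eps_budget)  # project to the budget
            x = x.clamp(0, 1)
    return x.detach()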
Toward effective protection against diffusion-based mimicry through score distillation
Xue, Haotian, Liang, Chumeng, Wu, Xiaoyu, Chen, Yongxin
While generative diffusion models excel in producing high-quality images, they can also be misused to mimic authorized images, posing a significant threat to AI systems. Efforts have been made to add calibrated perturbations to protect images from diffusion-based mimicry pipelines. However, most existing methods are either not effective enough or impractical for individual users to run due to their high computation and memory requirements. In this work, we present novel findings on attacking latent diffusion models (LDM) and propose new plug-and-play strategies for more effective protection. In particular, we explore the bottleneck in attacking an LDM, discovering that the encoder module rather than the denoiser module is the vulnerable point. Based on this insight, we present a strategy using Score Distillation Sampling (SDS) to double the speed of protection and reduce memory usage by half without compromising its strength. Additionally, we provide a robust protection strategy by counterintuitively minimizing the semantic loss, which can assist in generating more natural perturbations. Finally, we conduct extensive experiments to substantiate our findings and comprehensively evaluate our newly proposed strategies. We hope our insights and protective measures can contribute to better defense against malicious diffusion-based mimicry, advancing the development of secure AI systems. The code is available at https://github.com/xavihart/Diff-Protect
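The sketch below illustrates, under stated assumptions, how an SDS-style protection step can avoid backpropagating through the denoiser: the noise-prediction residual is treated as the gradient on the latent, and only the (vulnerable) encoder is differentiated. The simplified noising step, the update direction, and all names are illustrative; the paper's actual strategy (including the counterintuitive minimization of the semantic loss) may differ in detail.

import torch

def sds_perturbation_step(x, encoder, unet, alpha=2/255, eps_budget=8/255, x_orig=None):
    """Hedged sketch of an SDS-style protection step: instead of
    backpropagating through the U-Net, use the noise-prediction residual as
    the gradient on the latent (the usual Score Distillation Sampling trick),
    then push the gradient only through the encoder.
    Function and parameter names are illustrative, not the paper's API.
    """
    x = x.detach().requires_grad_(True)
    z = encoder(x)                                              # latent of the protected image
    t = torch.randint(1, 1000, (1,), device=x.device)           # random diffusion timestep
    noise = torch.randn_like(z)
    z_t = z + 0.1 * noise                                       # simplified noising (assumption)
    with torch.no_grad():
        residual = unet(z_t, t) - noise                         # SDS: no U-Net backprop
    sds_loss = (residual * z).sum()                             # chain rule through the encoder only
    grad, = torch.autograd.grad(sds_loss, x)
    with torch.no_grad():
        x_adv = x + alpha * grad.sign()                         # ascend to perturb; the "semantic loss"
                                                                # variant would descend instead
        if x_orig is not None:
            x_adv = x_orig + (x_adv - x_orig).clamp(-eps_budget, eps_budget)
    return x_adv.detach()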
Understanding and Improving Adversarial Attacks on Latent Diffusion Model
Zheng, Boyang, Liang, Chumeng, Wu, Xiaoyu, Liu, Yan
Diffusion models (Sohl-Dickstein et al., 2015; Song & Ermon, 2019; Ho et al., 2020; Song et al., 2020) have long held the promise of producing fine-grained content that resembles real data. Their application in few-shot generation, i.e., generating data from few-shot reference data, has pushed state-of-the-art performance forward by a significant margin and sparked a craze for AI-generated art (Meng et al., 2021; Gal et al., 2022; Ruiz et al., 2023; Roich et al., 2022; Zhang & Agrawala, 2023). While the opportunities presented by the latent diffusion model (LDM) are immense, its power is a double-edged sword: malicious individuals leverage LDM-driven few-shot generation to copy artworks without authorization (Fan et al., 2023) and to create fake not-suitable-for-work photos with personal figures (Wang et al., 2023b), threatening personal data and intellectual property. Recognizing this need, adversarial attacks on LDM were born as countermeasures (Salman et al., 2023; Liang et al., 2023; Shan et al., 2023; Van Le et al., 2023). These attacks add human-invisible perturbations to a real image, turning it into an adversarial example that is unusable in LDM-driven few-shot generation, and applications built on them (Liang & Wu, 2023; Shan et al., 2023) serve as tools to protect personal images from being used as reference data. However, existing adversarial attacks on LDM suffer from moderate performance and excessive computational cost, especially in GPU memory. In this paper, we propose an effective adversarial attack on LDM that shows superior performance against state-of-the-art few-shot generation pipelines of LDM, for example, LoRA. We implement the attack with memory efficiency by introducing several mechanisms and decrease its memory cost to less than 6GB, which allows individual users to run the attack on a majority of consumer GPUs. The adversarial budget of the attack is 4/255.
FDTI: Fine-grained Deep Traffic Inference with Roadnet-enriched Graph
Liu, Zhanyu, Liang, Chumeng, Zheng, Guanjie, Wei, Hua
This paper proposes the fine-grained traffic prediction task (e.g., an interval of 1 minute between data points), which is essential to traffic-related downstream applications. Under this setting, traffic flow is highly influenced by traffic signals and the correlation between traffic nodes is dynamic. As a result, the traffic data is non-smooth between nodes, and previous methods, which focus on smooth traffic data, are hard to apply. To address this problem, we propose Fine-grained Deep Traffic Inference (FDTI). Specifically, we construct a fine-grained traffic graph based on traffic signals to model inter-road relations. Then, a physically interpretable dynamic mobility convolution module is proposed to capture vehicle movement dynamics controlled by the traffic signals. Furthermore, traffic flow conservation is introduced to accurately infer future volume. Extensive experiments demonstrate that our method achieves state-of-the-art performance and learns traffic dynamics with good properties. To the best of our knowledge, we are the first to conduct city-level fine-grained traffic prediction.
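As a minimal illustration of the flow conservation idea, and assuming a simple routing parameterization that is not necessarily FDTI's, the next-step volume on each road can be written as the current volume minus predicted outflow plus inflow routed from upstream roads:

import torch

def conserve_volume(volume_t, out_ratio, turn_prob):
    """Hedged sketch of traffic flow conservation: each road keeps the
    vehicles that do not leave and receives the vehicles routed to it from
    upstream roads. All names (out_ratio, turn_prob) are illustrative
    assumptions, not FDTI's exact formulation.

    volume_t:  (N,) current vehicle counts per road.
    out_ratio: (N,) fraction of each road's vehicles that leave this step
               (e.g., predicted from signal phases by a learned module).
    turn_prob: (N, N) row-stochastic matrix; turn_prob[i, j] is the fraction
               of road i's departing vehicles that enter road j.
    """
    outflow = volume_t * out_ratio                 # vehicles leaving each road
    inflow = turn_prob.t() @ outflow               # vehicles arriving from upstream roads
    return torch.clamp(volume_t - outflow + inflow, min=0.0)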
Adversarial Example Does Good: Preventing Painting Imitation from Diffusion Models via Adversarial Examples
Liang, Chumeng, Wu, Xiaoyu, Hua, Yang, Zhang, Jiaru, Xue, Yiming, Song, Tao, Xue, Zhengui, Ma, Ruhui, Guan, Haibing
Recently, Diffusion Models (DMs) have boosted a wave in AI for Art yet raised new copyright concerns, where infringers benefit from using unauthorized paintings to train DMs to generate novel paintings in a similar style. To address these emerging copyright violations, in this paper, we are the first to explore and propose utilizing adversarial examples for DMs to protect human-created artworks. Specifically, we first build a theoretical framework to define and evaluate adversarial examples for DMs. Then, based on this framework, we design a novel algorithm, named AdvDM, which exploits a Monte Carlo estimation of adversarial examples for DMs by optimizing over different latent variables sampled from the reverse process of DMs. Extensive experiments show that the generated adversarial examples can effectively hinder DMs from extracting their features. Therefore, our method can be a powerful tool for human artists to protect their copyright against infringers equipped with DM-based AI-for-Art applications. The code of our method is available on GitHub: https://github.com/mist-project/mist.git.
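For intuition, here is a hedged sketch of the Monte Carlo attack described above: at each step, a diffusion timestep and noise sample are drawn, and the input image is updated by gradient ascent on the denoising (training) loss so that the model can no longer extract the image's features. The use of forward-process noising, the assumed 4D latent shape, and the hyperparameter names are illustrative assumptions.

import torch

def advdm_attack(x, encoder, unet, noise_schedule, steps=100, alpha=1/255, eps_budget=8/255):
    """Hedged sketch in the spirit of AdvDM: repeatedly sample a diffusion
    timestep and noise (Monte Carlo) and ascend the latent diffusion training
    loss with respect to the input image.

    encoder: callable image -> latent z of shape (B, C, H, W) (assumption).
    unet: callable (z_t, t) -> predicted noise.
    noise_schedule: (T,) tensor of cumulative alpha-bar values (assumption).
    """
    x_orig = x.clone()
    for _ in range(steps):
        x = x.detach().requires_grad_(True)
        z = encoder(x)
        t = torch.randint(0, len(noise_schedule), (1,), device=x.device)
        a_bar = noise_schedule[t].view(1, 1, 1, 1)
        noise = torch.randn_like(z)
        z_t = a_bar.sqrt() * z + (1 - a_bar).sqrt() * noise    # forward-diffusion sample
        loss = ((unet(z_t, t) - noise) ** 2).mean()            # denoising (training) loss
        grad, = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x = x + alpha * grad.sign()                        # maximize the loss
            x = x_orig + (x - x_orig).clamp(-eps_budget, eps_budget)
            x = x.clamp(0, 1)
    return x.detach()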
CBLab: Supporting the Training of Large-scale Traffic Control Policies with Scalable Traffic Simulation
Liang, Chumeng, Huang, Zherui, Liu, Yicheng, Liu, Zhanyu, Zheng, Guanjie, Shi, Hanyuan, Wu, Kan, Du, Yuhao, Li, Fuliang, Li, Zhenhui
Traffic simulation provides interactive data for the optimization of traffic control policies. However, existing traffic simulators are limited by their lack of scalability and shortage of input data, which prevents them from generating interactive data for scenarios on real large-scale city road networks. In this paper, we present \textbf{C}ity \textbf{B}rain \textbf{Lab}, a toolkit for scalable traffic simulation. CBLab consists of three components: CBEngine, CBData, and CBScenario. CBEngine is a highly efficient simulator supporting large-scale traffic simulation. CBData includes a traffic dataset with road network data of 100 cities around the world. We also develop a pipeline to conduct a one-click transformation from raw road networks to the input data of our traffic simulation. Combining CBEngine and CBData allows researchers to run scalable traffic simulations on the road networks of real large-scale cities. Building on these, CBScenario implements an interactive environment and a benchmark for two scenarios of traffic control policies, with which traffic control policies adaptable to large-scale urban traffic can be trained and tuned. To the best of our knowledge, CBLab is the first infrastructure supporting traffic control policy optimization in large-scale urban scenarios. CBLab has supported the City Brain Challenge @ KDD CUP 2021. The project is available on GitHub:~\url{https://github.com/CityBrainLab/CityBrainLab.git}.
Mist: Towards Improved Adversarial Examples for Diffusion Models
Liang, Chumeng, Wu, Xiaoyu
Diffusion Models (DMs) have enabled great success in artificial-intelligence-generated content, especially in artwork creation, yet raise new concerns about intellectual property and copyright. For example, infringers can profit by imitating non-authorized human-created paintings with DMs. Recent research suggests that various adversarial examples for diffusion models can be effective tools against these copyright infringements. However, current adversarial examples show weak transferability across different painting-imitation methods and weak robustness under straightforward adversarial defenses, for example, noise purification. We surprisingly find that the transferability of adversarial examples can be significantly enhanced by exploiting a fused and modified adversarial loss term under consistent parameters. In this work, we comprehensively evaluate the cross-method transferability of adversarial examples. Experimental observations show that our method generates more transferable adversarial examples with even stronger robustness against simple adversarial defenses.
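To illustrate what a fused adversarial loss might look like, the sketch below combines a textural term (pulling the image's latent toward a target pattern's latent) with the diffusion denoising (semantic) loss; an attacker would then ascend this objective with PGD-style updates. The specific terms, weights, and names are assumptions, not the exact Mist formulation.

import torch

def mist_fused_loss(x, x_target, encoder, unet, noise_schedule,
                    w_textural=1.0, w_semantic=1.0):
    """Hedged sketch of a fused adversarial objective: a textural term that
    pulls the image's latent toward a target pattern's latent, combined with
    the diffusion denoising (semantic) loss. Weights and names are
    illustrative assumptions.
    """
    z = encoder(x)
    with torch.no_grad():
        z_target = encoder(x_target)                            # latent of the target pattern
    textural = -((z - z_target) ** 2).mean()                    # maximizing this pulls z toward z_target

    t = torch.randint(0, len(noise_schedule), (1,), device=x.device)
    a_bar = noise_schedule[t].view(1, 1, 1, 1)
    noise = torch.randn_like(z)
    z_t = a_bar.sqrt() * z + (1 - a_bar).sqrt() * noise
    semantic = ((unet(z_t, t) - noise) ** 2).mean()             # diffusion training loss

    return w_textural * textural + w_semantic * semantic        # ascend this fused objective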