Jin, Haibo
FedDAG: Federated Domain Adversarial Generation Towards Generalizable Medical Image Analysis
Che, Haoxuan, Wu, Yifei, Jin, Haibo, Xia, Yong, Chen, Hao
--Federated domain generalization aims to train a global model from multiple source domains and ensure its generalization ability to unseen target domains. Due to the target domain being with unknown domain shifts, attempting to approximate these gaps by source domains may be the key to improving model generalization capability. Existing works mainly focus on sharing and recombining local domain-specific attributes to increase data diversity and simulate potential domain shifts. However, these methods may be insufficient since only the local attribute recombination can be hard to touch the out-of-distribution of global data. In this paper, we propose a simple-yet-efficient framework named Federated Domain Adversarial Generation (FedDAG). It aims to simulate the domain shift and improve the model generalization by adversarially generating novel domains different from local and global source domains. Specifically, it generates novel-style images by maximizing the instance-level feature discrepancy between original and generated images and trains a generalizable task model by minimizing their feature discrepancy. Further, we observed that FedDAG could cause different performance improvements for local models. It may be due to inherent data isolation and heterogeneity among clients, exacerbating the imbalance in their generalization contributions to the global model. Ignoring this imbalance can lead the global model's generalization ability to be sub-optimal, further limiting the novel domain generation procedure. Thus, to mitigate this imbalance, FedDAG hierarchically aggregates local models at the within-client and across-client levels by using the sharpness concept to evaluate client model generalization contributions. Extensive experiments across four medical benchmarks demonstrate FedDAG's ability to enhance generalization in federated medical scenarios. ITH the continuous advancement of medical research and clinical practice, the medical field generates substantial data [1], [2]. This work was supported by the National Natural Science Foundation of China (No. 62202403), Hong Kong Innovation and Technology Fund (Project No. MHP/002/22) and Shenzhen Science and Technology Innovation Committee Fund (Project No. SGDX20210823103201011). H. Che and H. Jin are with the Department of Computer Science and Engineering, at the Hong Kong University of Science and Technology University, Hong Kong SAR, China. Xia are with the National Engineering Laboratory for Integrated Aero-Space-Ground-Ocean Big Data Application Technology, School of Computer Science and Engineering, Northwestern Polytechnical University, Xi'an 710072, China.
Revolve: Optimizing AI Systems by Tracking Response Evolution in Textual Optimization
Zhang, Peiyan, Jin, Haibo, Hu, Leyang, Li, Xinnuo, Kang, Liying, Luo, Man, Song, Yangqiu, Wang, Haohan
Recent advancements in large language models (LLMs) have significantly enhanced the ability of LLM-based systems to perform complex tasks through natural language processing and tool interaction. However, optimizing these LLM-based systems for specific tasks remains challenging, often requiring manual interventions like prompt engineering and hyperparameter tuning. Existing automatic optimization methods, such as textual feedback-based techniques (e.g., TextGrad), tend to focus on immediate feedback, analogous to using immediate derivatives in traditional numerical gradient descent. However, relying solely on such feedback can be limited when the adjustments made in response to this feedback are either too small or fluctuate irregularly, potentially slowing down or even stalling the optimization process. To overcome these challenges, more adaptive methods are needed, especially in situations where the system's response is evolving slowly or unpredictably. In this paper, we introduce REVOLVE, an optimization method that tracks how "R"esponses "EVOLVE" across iterations in LLM systems. By focusing on the evolution of responses over time, REVOLVE enables more stable and effective optimization by making thoughtful, progressive adjustments at each step. Experimental results demonstrate that REVOLVE outperforms competitive baselines, achieving a 7.8% improvement in prompt optimization, a 20.72% gain in solution refinement, and a 29.17% increase in code optimization. Additionally, REVOLVE converges in fewer iterations, resulting in significant computational savings. These advantages highlight its adaptability and efficiency, positioning REVOLVE as a valuable tool for optimizing LLM-based systems and accelerating the development of next-generation AI technologies. Code is available at: https://github.com/Peiyance/REVOLVE.
Large Language Model with Region-guided Referring and Grounding for CT Report Generation
Chen, Zhixuan, Bie, Yequan, Jin, Haibo, Chen, Hao
Computed tomography (CT) report generation is crucial to assist radiologists in interpreting CT volumes, which can be time-consuming and labor-intensive. Existing methods primarily only consider the global features of the entire volume, making it struggle to focus on specific regions and potentially missing abnormalities. To address this issue, we propose Reg2RG, the first region-guided referring and grounding framework for CT report generation, which enhances diagnostic performance by focusing on anatomical regions within the volume. Specifically, we utilize masks from a universal segmentation module to capture local features for each referring region. A local feature decoupling (LFD) strategy is proposed to preserve the local high-resolution details with little computational overhead. Then the local features are integrated with global features to capture inter-regional relationships within a cohesive context. Moreover, we propose a novel region-report alignment (RRA) training strategy. It leverages the recognition of referring regions to guide the generation of region-specific reports, enhancing the model's referring and grounding capabilities while also improving the report's interpretability. A large language model (LLM) is further employed as the language decoder to generate reports from integrated visual features, facilitating region-level comprehension. Extensive experiments on two large-scale chest CT-report datasets demonstrate the superiority of our method, which outperforms several state-of-the-art methods in terms of both natural language generation and clinical efficacy metrics while preserving promising interpretability. The code will be made publicly available.
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models
Jin, Haibo, Hu, Leyang, Li, Xinuo, Zhang, Peiyan, Chen, Chonghan, Zhuang, Jun, Wang, Haohan
The rapid evolution of artificial intelligence (AI) through developments in Large Language Models (LLMs) and Vision-Language Models (VLMs) has brought significant advancements across various technological domains. While these models enhance capabilities in natural language processing and visual interactive tasks, their growing adoption raises critical concerns regarding security and ethical alignment. This survey provides an extensive review of the emerging field of jailbreaking--deliberately circumventing the ethical and operational boundaries of LLMs and VLMs--and the consequent development of defense mechanisms. Our study categorizes jailbreaks into seven distinct types and elaborates on defense strategies that address these vulnerabilities. Through this comprehensive examination, we identify research gaps and propose directions for future studies to enhance the security frameworks of LLMs and VLMs. Our findings underscore the necessity for a unified perspective that integrates both jailbreak strategies and defensive solutions to foster a robust, secure, and reliable environment for the next generation of language models. More details can be found on our website: \url{https://chonghan-chen.com/llm-jailbreak-zoo-survey/}.
Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters
Jin, Haibo, Zhou, Andy, Menke, Joe D., Wang, Haohan
Large Language Models (LLMs) are typically harmless but remain vulnerable to carefully crafted prompts known as ``jailbreaks'', which can bypass protective measures and induce harmful behavior. Recent advancements in LLMs have incorporated moderation guardrails that can filter outputs, which trigger processing errors for certain malicious questions. Existing red-teaming benchmarks often neglect to include questions that trigger moderation guardrails, making it difficult to evaluate jailbreak effectiveness. To address this issue, we introduce JAMBench, a harmful behavior benchmark designed to trigger and evaluate moderation guardrails. JAMBench involves 160 manually crafted instructions covering four major risk categories at multiple severity levels. Furthermore, we propose a jailbreak method, JAM (Jailbreak Against Moderation), designed to attack moderation guardrails using jailbreak prefixes to bypass input-level filters and a fine-tuned shadow model functionally equivalent to the guardrail model to generate cipher characters to bypass output-level filters. Our extensive experiments on four LLMs demonstrate that JAM achieves higher jailbreak success ($\sim$ $\times$ 19.88) and lower filtered-out rates ($\sim$ $\times$ 1/6) than baselines.
Design as Desired: Utilizing Visual Question Answering for Multimodal Pre-training
Su, Tongkun, Li, Jun, Zhang, Xi, Jin, Haibo, Chen, Hao, Wang, Qiong, Lv, Faqin, Zhao, Baoliang, Hu, Yin
Multimodal pre-training demonstrates its potential in the medical domain, which learns medical visual representations from paired medical reports. However, many pre-training tasks require extra annotations from clinicians, and most of them fail to explicitly guide the model to learn the desired features of different pathologies. To the best of our knowledge, we are the first to utilize Visual Question Answering (VQA) for multimodal pre-training to guide the framework focusing on targeted pathological features. In this work, we leverage descriptions in medical reports to design multi-granular question-answer pairs associated with different diseases, which assist the framework in pre-training without requiring extra annotations from experts. We also propose a novel pre-training framework with a quasi-textual feature transformer, a module designed to transform visual features into a quasi-textual space closer to the textual domain via a contrastive learning strategy. This narrows the vision-language gap and facilitates modality alignment. Our framework is applied to four downstream tasks: report generation, classification, segmentation, and detection across five datasets. Extensive experiments demonstrate the superiority of our framework compared to other state-of-the-art methods. Our code will be released upon acceptance.
GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models
Jin, Haibo, Chen, Ruoxi, Zhou, Andy, Chen, Jinyin, Zhang, Yang, Wang, Haohan
The discovery of "jailbreaks" to bypass safety filters of Large Language Models (LLMs) and harmful responses have encouraged the community to implement safety measures. One major safety measure is to proactively test the LLMs with jailbreaks prior to the release. Therefore, such testing will require a method that can generate jailbreaks massively and efficiently. In this paper, we follow a novel yet intuitive strategy to generate jailbreaks in the style of the human generation. We propose a role-playing system that assigns four different roles to the user LLMs to collaborate on new jailbreaks. Furthermore, we collect existing jailbreaks and split them into different independent characteristics using clustering frequency and semantic patterns sentence by sentence. We organize these characteristics into a knowledge graph, making them more accessible and easier to retrieve. Our system of different roles will leverage this knowledge graph to generate new jailbreaks, which have proved effective in inducing LLMs to generate unethical or guideline-violating responses. In addition, we also pioneer a setting in our system that will automatically follow the government-issued guidelines to generate jailbreaks to test whether LLMs follow the guidelines accordingly. We refer to our system as GUARD (Guideline Upholding through Adaptive Role-play Diagnostics). We have empirically validated the effectiveness of GUARD on three cutting-edge open-sourced LLMs (Vicuna-13B, LongChat-7B, and Llama-2-7B), as well as a widely-utilized commercial LLM (ChatGPT). Moreover, our work extends to the realm of vision language models (MiniGPT-v2 and Gemini Vision Pro), showcasing GUARD's versatility and contributing valuable insights for the development of safer, more reliable LLM-based applications across diverse modalities.
TraffNet: Learning Causality of Traffic Generation for What-if Prediction
Xu, Ming, Ai, Qiang, Li, Ruimin, Ma, Yunyi, Qi, Geqi, Meng, Xiangfu, Jin, Haibo
Real-time what-if traffic prediction is crucial for decision making in intelligent traffic management and control. Although current deep learning methods demonstrate significant advantages in traffic prediction, they are powerless in what-if traffic prediction due to their nature of correlation-based. Here, we present a simple deep learning framework called TraffNet that learns the mechanisms of traffic generation for what-if prediction from vehicle trajectory data. First, we use a heterogeneous graph to represent the road network, allowing the model to incorporate causal features of traffic flows, such as Origin-Destination (OD) demands and routes. Next, we propose a method for learning segment representations, which involves modeling the process of assigning OD demands onto the road network. The learned segment representations effectively encapsulate the intricate causes of traffic generation, facilitating downstream what-if traffic prediction. Finally, we conduct experiments on synthetic datasets to evaluate the effectiveness of TraffNet. The code and datasets of TraffNet is available at https://github.com/mayunyi-1999/TraffNet_code.git.
PromptMRG: Diagnosis-Driven Prompts for Medical Report Generation
Jin, Haibo, Che, Haoxuan, Lin, Yi, Chen, Hao
Automatic medical report generation (MRG) is of great research value as it has the potential to relieve radiologists from the heavy burden of report writing. Despite recent advancements, accurate MRG remains challenging due to the need for precise clinical understanding and the identification of clinical findings. Moreover, the imbalanced distribution of diseases makes the challenge even more pronounced, as rare diseases are underrepresented in training data, making their diagnostic performance unreliable. To address these challenges, we propose diagnosis-driven prompts for medical report generation (PromptMRG), a novel framework that aims to improve the diagnostic accuracy of MRG with the guidance of diagnosis-aware prompts. Specifically, PromptMRG is based on encoder-decoder architecture with an extra disease classification branch. When generating reports, the diagnostic results from the classification branch are converted into token prompts to explicitly guide the generation process. To further improve the diagnostic accuracy, we design cross-modal feature enhancement, which retrieves similar reports from the database to assist the diagnosis of a query image by leveraging the knowledge from a pre-trained CLIP. Moreover, the disease imbalanced issue is addressed by applying an adaptive logit-adjusted loss to the classification branch based on the individual learning status of each disease, which overcomes the barrier of text decoder's inability to manipulate disease distributions. Experiments on two MRG benchmarks show the effectiveness of the proposed method, where it obtains state-of-the-art clinical efficacy performance on both datasets.
CertPri: Certifiable Prioritization for Deep Neural Networks via Movement Cost in Feature Space
Zheng, Haibin, Chen, Jinyin, Jin, Haibo
Deep neural networks (DNNs) have demonstrated their outperformance in various software systems, but also exhibit misbehavior and even result in irreversible disasters. Therefore, it is crucial to identify the misbehavior of DNN-based software and improve DNNs' quality. Test input prioritization is one of the most appealing ways to guarantee DNNs' quality, which prioritizes test inputs so that more bug-revealing inputs can be identified earlier with limited time and manual labeling efforts. However, the existing prioritization methods are still limited from three aspects: certifiability, effectiveness, and generalizability. To overcome the challenges, we propose CertPri, a test input prioritization technique designed based on a movement cost perspective of test inputs in DNNs' feature space. CertPri differs from previous works in three key aspects: (1) certifiable: it provides a formal robustness guarantee for the movement cost; (2) effective: it leverages formally guaranteed movement costs to identify malicious bug-revealing inputs; and (3) generic: it can be applied to various tasks, data, models, and scenarios. Extensive evaluations across 2 tasks (i.e., classification and regression), 6 data forms, 4 model structures, and 2 scenarios (i.e., white-box and black-box) demonstrate CertPri's superior performance. For instance, it significantly improves 53.97% prioritization effectiveness on average compared with baselines. Its robustness and generalizability are 1.41~2.00 times and 1.33~3.39 times that of baselines on average, respectively.