Liu, Hanqing
Global Challenge for Safe and Secure LLMs Track 1
Jia, Xiaojun, Huang, Yihao, Liu, Yang, Tan, Peng Yan, Yau, Weng Kuan, Mak, Mun-Thye, Sim, Xin Ming, Ng, Wee Siong, Ng, See Kiong, Liu, Hanqing, Zhou, Lifeng, Yan, Huanqian, Sun, Xiaobing, Liu, Wei, Wang, Long, Qian, Yiming, Liu, Yong, Yang, Junxiao, Zhang, Zhexin, Lei, Leqi, Chen, Renmiao, Lu, Yida, Cui, Shiyao, Wang, Zizhou, Li, Shaohua, Wang, Yan, Goh, Rick Siow Mong, Zhen, Liangli, Zhang, Yingjie, Zhao, Zhe
This paper introduces the Global Challenge for Safe and Secure Large Language Models (LLMs), a pioneering initiative organized by AI Singapore (AISG) and the CyberSG R&D Programme Office (CRPO) to foster the development of advanced defense mechanisms against automated jailbreaking attacks. With the increasing integration of LLMs in critical sectors such as healthcare, finance, and public administration, ensuring these models are resilient to adversarial attacks is vital for preventing misuse and upholding ethical standards. This competition focused on two distinct tracks designed to evaluate and enhance the robustness of LLM security frameworks. Track 1 tasked participants with developing automated methods to probe LLM vulnerabilities by eliciting undesirable responses, effectively testing the limits of existing safety protocols within LLMs. Participants were challenged to devise techniques that could bypass content safeguards across a diverse array of scenarios, from offensive language to misinformation and illegal activities. Through this process, Track 1 aimed to deepen the understanding of LLM vulnerabilities and provide insights for creating more resilient models.
Boosting Jailbreak Transferability for Large Language Models
Liu, Hanqing, Zhou, Lifeng, Yan, Huanqian
Large language models have drawn significant attention to the challenge of safe alignment, especially regarding jailbreak attacks that circumvent security measures to produce harmful content. To address the limitations of existing methods like GCG, which perform well in single-model attacks but lack transferability, we propose several enhancements, including a scenario induction template, optimized suffix selection, and the integration of re-suffix attack mechanism to reduce inconsistent outputs. Our approach has shown superior performance in extensive experiments across various benchmarks, achieving nearly 100% success rates in both attack execution and transferability. Notably, our method has won the first place in the AISG-hosted Global Challenge for Safe and Secure LLMs.
Enriched Physics-informed Neural Networks for Dynamic Poisson-Nernst-Planck Systems
Huang, Xujia, Wang, Fajie, Zhang, Benrong, Liu, Hanqing
This paper proposes a meshless deep learning algorithm, enriched physics-informed neural networks (EPINNs), to solve dynamic Poisson-Nernst-Planck (PNP) equations with strong coupling and nonlinear characteristics. The EPINNs takes the traditional physics-informed neural networks as the foundation framework, and adds the adaptive loss weight to balance the loss functions, which automatically assigns the weights of losses by updating the parameters in each iteration based on the maximum likelihood estimate. The resampling strategy is employed in the EPINNs to accelerate the convergence of loss function. Meanwhile, the GPU parallel computing technique is adopted to accelerate the solving process. Four examples are provided to demonstrate the validity and effectiveness of the proposed method. Numerical results indicate that the new method has better applicability than traditional numerical methods in solving such coupled nonlinear systems. More importantly, the EPINNs is more accurate, stable, and fast than the traditional physics-informed neural networks. This work provides a simple and high-performance numerical tool for addressing PNPs with arbitrary boundary shapes and boundary conditions.