AITopics | Wu, Zihui

Collaborating Authors

Wu, Zihui

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

InverseBench: Benchmarking Plug-and-Play Diffusion Priors for Inverse Problems in Physical Sciences

Zheng, Hongkai, Chu, Wenda, Zhang, Bingliang, Wu, Zihui, Wang, Austin, Feng, Berthy T., Zou, Caifeng, Sun, Yu, Kovachki, Nikola, Ross, Zachary E., Bouman, Katherine L., Yue, Yisong

arXiv.org Artificial IntelligenceMar-13-2025

Plug-and-play diffusion priors (PnPDP) have emerged as a promising research direction for solving inverse problems. However, current studies primarily focus on natural image restoration, leaving the performance of these algorithms in scientific inverse problems largely unexplored. To address this gap, we introduce \textsc{InverseBench}, a framework that evaluates diffusion models across five distinct scientific inverse problems. These problems present unique structural challenges that differ from existing benchmarks, arising from critical scientific applications such as optical tomography, medical imaging, black hole imaging, seismology, and fluid dynamics. With \textsc{InverseBench}, we benchmark 14 inverse problem algorithms that use plug-and-play diffusion priors against strong, domain-specific baselines, offering valuable new insights into the strengths and weaknesses of existing algorithms. To facilitate further research and development, we open-source the codebase, along with datasets and pre-trained models, at https://devzhk.github.io/InverseBench/.

artificial intelligence, inverse problem, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.11043

Country:

North America > United States (0.14)
Oceania > Australia (0.14)
North America > Canada (0.14)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.93)
(3 more...)

Add feedback

HumorReject: Decoupling LLM Safety from Refusal Prefix via A Little Humor

Wu, Zihui, Gao, Haichang, Luo, Jiacheng, Liu, Zhaoxiang

arXiv.org Artificial IntelligenceFeb-7-2025

Large Language Models (LLMs) commonly rely on explicit refusal prefixes for safety, making them vulnerable to prefix injection attacks. We introduce HumorReject, a novel data-driven approach that reimagines LLM safety by decoupling it from refusal prefixes through humor as an indirect refusal strategy. Rather than explicitly rejecting harmful instructions, HumorReject responds with contextually appropriate humor that naturally defuses potentially dangerous requests. Our approach effectively addresses common "over-defense" issues while demonstrating superior robustness against various attack vectors. Our findings suggest that improvements in training data design can be as important as the alignment algorithm itself in achieving effective LLM safety.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.13677

Country:

Asia > China (0.14)
Asia > Thailand (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.68)
Government (0.46)
Leisure & Entertainment (0.46)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

ContextMRI: Enhancing Compressed Sensing MRI through Metadata Conditioning

Chung, Hyungjin, Lee, Dohun, Wu, Zihui, Kim, Byung-Hoon, Bouman, Katherine L., Ye, Jong Chul

arXiv.org Artificial IntelligenceJan-8-2025

Compressed sensing MRI seeks to accelerate MRI acquisition processes by sampling fewer k-space measurements and then reconstructing the missing data algorithmically. The success of these approaches often relies on strong priors or learned statistical models. While recent diffusion model-based priors have shown great potential, previous methods typically ignore clinically available metadata (e.g. patient demographics, imaging parameters, slice-specific information). In practice, metadata contains meaningful cues about the anatomy and acquisition protocol, suggesting it could further constrain the reconstruction problem. In this work, we propose ContextMRI, a text-conditioned diffusion model for MRI that integrates granular metadata into the reconstruction process. We train a pixel-space diffusion model directly on minimally processed, complex-valued MRI images. During inference, metadata is converted into a structured text prompt and fed to the model via CLIP text embeddings. By conditioning the prior on metadata, we unlock more accurate reconstructions and show consistent gains across multiple datasets, acceleration factors, and undersampling patterns. Our experiments demonstrate that increasing the fidelity of metadata, ranging from slice location and contrast to patient age, sex, and pathology, systematically boosts reconstruction performance. This work highlights the untapped potential of leveraging clinical context for inverse problems and opens a new direction for metadata-driven MRI reconstruction.

artificial intelligence, diffusion model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.04284

Country: North America > Canada > Quebec (0.14)

Genre: Research Report (0.40)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

GlitchMiner: Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization

Wu, Zihui, Gao, Haichang, Wang, Ping, Zhang, Shudong, Liu, Zhaoxiang, Lian, Shiguo

arXiv.org Artificial IntelligenceNov-9-2024

Glitch tokens in Large Language Models (LLMs) can trigger unpredictable behaviors, threatening model reliability and safety. Existing detection methods rely on predefined patterns, limiting their adaptability across diverse LLM architectures. We propose GlitchMiner, a gradient-based discrete optimization framework that efficiently identifies glitch tokens by introducing entropy as a measure of prediction uncertainty and employing a local search strategy to explore the token space. Experiments across multiple LLM architectures demonstrate that GlitchMiner outperforms existing methods in detection accuracy and adaptability, achieving over 10% average efficiency improvement. This method enhances vulnerability assessment in LLMs, contributing to the development of more robust and reliable applications.

large language model, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2410.15052

Genre: Research Report > New Finding (0.93)

Industry: Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Principled Probabilistic Imaging using Diffusion Models as Plug-and-Play Priors

Wu, Zihui, Sun, Yu, Chen, Yifan, Zhang, Bingliang, Yue, Yisong, Bouman, Katherine L.

arXiv.org Machine LearningMay-29-2024

Diffusion models (DMs) have recently shown outstanding capability in modeling complex image distributions, making them expressive image priors for solving Bayesian inverse problems. However, most existing DM-based methods rely on approximations in the generative process to be generic to different inverse problems, leading to inaccurate sample distributions that deviate from the target posterior defined within the Bayesian framework. To harness the generative power of DMs while avoiding such approximations, we propose a Markov chain Monte Carlo algorithm that performs posterior sampling for general inverse problems by reducing it to sampling the posterior of a Gaussian denoising problem. Crucially, we leverage a general DM formulation as a unified interface that allows for rigorously solving the denoising problem with a range of state-of-the-art DMs. We demonstrate the effectiveness of the proposed method on six inverse problems (three linear and three nonlinear), including a real-world black hole imaging problem. Experimental results indicate that our proposed method offers more accurate reconstructions and posterior estimation compared to existing DM-based imaging inverse methods.

artificial intelligence, inverse problem, machine learning, (19 more...)

arXiv.org Machine Learning

2405.18782

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Lower Difficulty and Better Robustness: A Bregman Divergence Perspective for Adversarial Training

Wu, Zihui, Gao, Haichang, Zhou, Bingqian, Guo, Xiaoyan, Zhang, Shudong

arXiv.org Artificial IntelligenceJan-5-2024

In this paper, we investigate on improving the adversarial robustness obtained in adversarial training (AT) via reducing the difficulty of optimization. To better study this problem, we build a novel Bregman divergence perspective for AT, in which AT can be viewed as the sliding process of the training data points on the negative entropy curve. Based on this perspective, we analyze the learning objectives of two typical AT methods, i.e., PGD-AT and TRADES, and we find that the optimization process of TRADES is easier than PGD-AT for that TRADES separates PGD-AT. In addition, we discuss the function of entropy in TRADES, and we find that models with high entropy can be better robustness learners. Inspired by the above findings, we propose two methods, i.e., FAIT and MER, which can both not only reduce the difficulty of optimization under the 10-step PGD adversaries, but also provide better robustness. Our work suggests that reducing the difficulty of optimization under the 10-step PGD adversaries is a promising approach for enhancing the adversarial robustness in AT.

artificial intelligence, machine learning, robustness, (16 more...)

arXiv.org Artificial Intelligence

2208.12511

Country:

Europe (0.67)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Provable Probabilistic Imaging using Score-Based Generative Priors

Sun, Yu, Wu, Zihui, Chen, Yifan, Feng, Berthy T., Bouman, Katherine L.

arXiv.org Artificial IntelligenceDec-29-2023

Estimating high-quality images while also quantifying their uncertainty are two desired features in an image reconstruction algorithm for solving ill-posed inverse problems. In this paper, we propose plug-and-play Monte Carlo (PMC) as a principled framework for characterizing the space of possible solutions to a general inverse problem. PMC is able to incorporate expressive score-based generative priors for high-quality image reconstruction while also performing uncertainty quantification via posterior sampling. In particular, we introduce two PMC algorithms which can be viewed as the sampling analogues of the traditional plug-and-play priors (PnP) and regularization by denoising (RED) algorithms. We also establish a theoretical analysis for characterizing the convergence of the PMC algorithms. Our analysis provides non-asymptotic stationarity guarantees for both algorithms, even in the presence of non-log-concave likelihoods and imperfect score networks. We demonstrate the performance of the PMC algorithms on multiple representative inverse problems with both linear and nonlinear forward models. Experimental results show that PMC significantly improves reconstruction quality and enables high-fidelity uncertainty quantification.

algorithm, artificial intelligence, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2310.10835

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Demystifying Oversmoothing in Attention-Based Graph Neural Networks

Wu, Xinyi, Ajorlou, Amir, Wu, Zihui, Jadbabaie, Ali

arXiv.org Machine LearningOct-5-2023

Oversmoothing in Graph Neural Networks (GNNs) refers to the phenomenon where increasing network depth leads to homogeneous node representations. While previous work has established that Graph Convolutional Networks (GCNs) exponentially lose expressive power, it remains controversial whether the graph attention mechanism can mitigate oversmoothing. In this work, we provide a definitive answer to this question through a rigorous mathematical analysis, by viewing attention-based GNNs as nonlinear time-varying dynamical systems and incorporating tools and techniques from the theory of products of inhomogeneous matrices and the joint spectral radius. We establish that, contrary to popular belief, the graph attention mechanism cannot prevent oversmoothing and loses expressive power exponentially. The proposed framework extends the existing results on oversmoothing for symmetric GCNs to a significantly broader class of GNN models, including random walk GCNs, Graph Attention Networks (GATs) and (graph) transformers.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

2305.16102

Country: North America > United States (0.47)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

AdvFunMatch: When Consistent Teaching Meets Adversarial Robustness

Wu, Zihui, Gao, Haichang, Zhou, Bingqian, Wang, Ping

arXiv.org Artificial IntelligenceMay-24-2023

\emph{Consistent teaching} is an effective paradigm for implementing knowledge distillation (KD), where both student and teacher models receive identical inputs, and KD is treated as a function matching task (FunMatch). However, one limitation of FunMatch is that it does not account for the transfer of adversarial robustness, a model's resistance to adversarial attacks. To tackle this problem, we propose a simple but effective strategy called Adversarial Function Matching (AdvFunMatch), which aims to match distributions for all data points within the $\ell_p$-norm ball of the training data, in accordance with consistent teaching. Formulated as a min-max optimization problem, AdvFunMatch identifies the worst-case instances that maximizes the KL-divergence between teacher and student model outputs, which we refer to as "mismatched examples," and then matches the outputs on these mismatched examples. Our experimental results show that AdvFunMatch effectively produces student models with both high clean accuracy and robustness. Furthermore, we reveal that strong data augmentations (\emph{e.g.}, AutoAugment) are beneficial in AdvFunMatch, whereas prior works have found them less effective in adversarial training. Code is available at \url{https://gitee.com/zihui998/adv-fun-match}.

artificial intelligence, deep learning, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2305.147

Country:

North America > United States > California (0.93)
Asia (0.68)

Genre: Research Report > New Finding (0.48)

Industry:

Education (1.00)
Information Technology > Security & Privacy (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Scalable Plug-and-Play ADMM with Convergence Guarantees

Sun, Yu, Wu, Zihui, Wohlberg, Brendt, Kamilov, Ulugbek S.

arXiv.org Machine LearningJun-5-2020

Plug-and-play priors (PnP) is a broadly applicable methodology for solving inverse problems by exploiting statistical priors specified as denoisers. Recent work has reported the state-of-the-art performance of PnP algorithms using pre-trained deep neural nets as denoisers in a number of imaging applications. However, current PnP algorithms are impractical in large-scale settings due to their heavy computational and memory requirements. This work addresses this issue by proposing an incremental variant of the widely used PnP-ADMM algorithm, making it scalable to large-scale datasets. We theoretically analyze the convergence of the algorithm under a set of explicit assumptions, extending recent theoretical results in the area. Additionally, we show the effectiveness of our algorithm with nonsmooth data-fidelity terms and deep neural net priors, its fast convergence compared to existing PnP algorithms, and its scalability in terms of speed and memory.

algorithm, deep learning, neural network, (21 more...)

arXiv.org Machine Learning

2006.03224

Country:

North America > United States > California (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Europe > United Kingdom > Scotland (0.14)

Genre: Research Report (0.63)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback