AITopics | Chen, Chun

Chen, Chun

Towards Label-Only Membership Inference Attack against Pre-trained Large Language Models

He, Yu, Li, Boheng, Liu, Liu, Ba, Zhongjie, Dong, Wei, Li, Yiming, Qin, Zhan, Ren, Kui, Chen, Chun

arXiv.org Artificial Intelligence, Feb-26-2025

Membership Inference Attacks (MIAs) aim to predict whether a data sample belongs to the model's training set or not. Although prior research has extensively explored MIAs in Large Language Models (LLMs), existing attacks typically require access to the complete output logits (i.e., logits-based attacks), which are usually not available in practice. In this paper, we study the vulnerability of pre-trained LLMs to MIAs in the label-only setting, where the adversary can only access generated tokens (text). We first reveal that existing label-only MIAs have minor effects in attacking pre-trained LLMs, although they are highly effective in inferring fine-tuning datasets used for personalized LLMs. We find that their failure stems from two main reasons: better generalization and overly coarse perturbation. Specifically, because the pre-training corpora are extensive and each sample is exposed only a few times, LLMs exhibit minimal robustness differences between members and non-members. This makes token-level perturbations too coarse to capture such differences. To alleviate these problems, we propose PETAL: a label-only membership inference attack based on PEr-Token semAntic simiLarity. Specifically, PETAL leverages token-level semantic similarity to approximate output probabilities and subsequently calculate the perplexity. It finally exposes membership based on the common assumption that members are 'better' memorized and have smaller perplexity. We conduct extensive experiments on the WikiMIA benchmark and the more challenging MIMIR benchmark. Empirically, our PETAL performs better than the extensions of existing label-only attacks against personalized LLMs and even on par with other advanced logit-based attacks across all metrics on five prevalent open-source LLMs.

large language model, machine learning, natural language, (17 more...)

arXiv:2502.18943

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
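
The PETAL abstract above rests on the assumption that training members have smaller perplexity. The sketch below shows only that final decision step, assuming per-token probability estimates are already available. How PETAL derives those estimates from semantic similarity is not described in the abstract, so `token_probs`, `threshold`, and both function names are hypothetical.

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the mean negative log-probability per token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    eps = 1e-12  # guard against log(0)
    mean_nll = -sum(math.log(max(p, eps)) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)

def predict_member(token_probs, threshold):
    """Flag a sample as a likely training member when its perplexity is low.

    The threshold is an assumed, externally calibrated value.
    """
    return perplexity(token_probs) < threshold
```

In practice the threshold would have to be calibrated on samples with known membership status; the abstract does not say how this is done.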

Uncertainty-Aware Graph Structure Learning

Han, Shen, Zhou, Zhiyao, Chen, Jiawei, Hao, Zhezheng, Zhou, Sheng, Wang, Gang, Feng, Yan, Chen, Chun, Wang, Can

arXiv.org Artificial Intelligence, Feb-18-2025

Graph Neural Networks (GNNs) have become a prominent approach for learning from graph-structured data. However, their effectiveness can be significantly compromised when the graph structure is suboptimal. To address this issue, Graph Structure Learning (GSL) has emerged as a promising technique that refines node connections adaptively. Nevertheless, we identify two key limitations in existing GSL methods: 1) Most methods primarily focus on node similarity to construct relationships, while overlooking the quality of node information. Blindly connecting low-quality nodes and aggregating their ambiguous information can degrade the performance of other nodes. 2) The constructed graph structures are often constrained to be symmetric, which may limit the model's flexibility and effectiveness. To overcome these limitations, we propose an Uncertainty-aware Graph Structure Learning (UnGSL) strategy. UnGSL estimates the uncertainty of node information and utilizes it to adjust the strength of directional connections, where the influence of nodes with high uncertainty is adaptively reduced. Importantly, UnGSL serves as a plug-in module that can be seamlessly integrated into existing GSL methods with minimal additional computational cost. In our experiments, we integrate UnGSL into six representative GSL methods, demonstrating consistent performance improvements. The code is available at https://github.com/UnHans/UnGSL.

artificial intelligence, machine learning, node, (16 more...)

arXiv:2502.12618

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.84)
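
The UnGSL abstract above describes reducing the influence of high-uncertainty nodes on directional connections. A minimal sketch of that idea follows, under assumptions the abstract does not confirm: uncertainty is measured as the normalized entropy of each node's predicted class distribution, and every edge is scaled by the confidence of its source node. The names `entropy` and `reweight_edges` and the scaling rule are illustrative, not the authors' implementation.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def reweight_edges(adj, node_probs):
    """Scale adj[i][j] (message from node j to node i) by the confidence of j.

    Confidence is assumed to be 1 - normalized entropy, so a node with a
    uniform prediction contributes nothing and a one-hot node is unchanged.
    """
    n_classes = len(node_probs[0])
    max_ent = math.log(n_classes) if n_classes > 1 else 1.0
    conf = [1.0 - entropy(p) / max_ent for p in node_probs]
    n = len(adj)
    return [[adj[i][j] * conf[j] for j in range(n)] for i in range(n)]
```

Because the scaling depends only on the source node, a symmetric input adjacency generally becomes asymmetric, which matches the directional connections mentioned in the abstract.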

Membership Inference Attacks Against Vision-Language Models

Hu, Yuke, Li, Zheng, Liu, Zhihao, Zhang, Yang, Qin, Zhan, Ren, Kui, Chen, Chun

arXiv.org Artificial Intelligence, Feb-7-2025

Vision-Language Models (VLMs), built on pre-trained vision encoders and large language models (LLMs), have shown exceptional multi-modal understanding and dialog capabilities, positioning them as catalysts for the next technological revolution. However, while most VLM research focuses on enhancing multi-modal interaction, the risks of data misuse and leakage have been largely unexplored. This prompts the need for a comprehensive investigation of such risks in VLMs. In this paper, we conduct the first analysis of misuse and leakage detection in VLMs through the lens of membership inference attack (MIA). Specifically, we focus on the instruction tuning data of VLMs, which is more likely to contain sensitive or unauthorized information. To address the limitations of existing MIA methods, we introduce a novel approach that infers membership based on a set of samples and their sensitivity to temperature, a unique parameter in VLMs. Based on this, we propose four membership inference methods, each tailored to different levels of background knowledge, ultimately arriving at the most challenging scenario. Our comprehensive evaluations show that these methods can accurately determine membership status, e.g., achieving an AUC greater than 0.8 targeting a small set consisting of only 5 samples on LLaVA.

large language model, machine learning, natural language, (18 more...)

arXiv:2501.18624

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Mitigating Privacy Risks in LLM Embeddings from Embedding Inversion

Liu, Tiantian, Yao, Hongwei, Wu, Tong, Qin, Zhan, Lin, Feng, Ren, Kui, Chen, Chun

arXiv.org Artificial Intelligence, Nov-6-2024

Embeddings have become a cornerstone in the functionality of large language models (LLMs) due to their ability to transform text data into rich, dense numerical representations that capture semantic and syntactic properties. These embedding vector databases serve as the long-term memory of LLMs, enabling efficient handling of a wide range of natural language processing tasks. However, the surge in popularity of embedding vector databases in LLMs has been accompanied by significant concerns about privacy leakage. Embedding vector databases are particularly vulnerable to embedding inversion attacks, where adversaries can exploit the embeddings to reverse-engineer and extract sensitive information from the original text data. Existing defense mechanisms have shown limitations, often struggling to balance security with the performance of downstream tasks. To address these challenges, we introduce Eguard, a novel defense mechanism designed to mitigate embedding inversion attacks. Eguard employs a transformer-based projection network and text mutual information optimization to safeguard embeddings while preserving the utility of LLMs. Our approach significantly reduces privacy risks, protecting over 95% of tokens from inversion while maintaining high performance across downstream tasks consistent with original embeddings.

inversion attack, large language model, machine learning, (18 more...)

arXiv:2411.05034

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation

Yang, Weiqin, Chen, Jiawei, Xin, Xin, Zhou, Sheng, Hu, Binbin, Feng, Yan, Chen, Chun, Wang, Can

arXiv.org Artificial Intelligence, Oct-31-2024

Softmax Loss (SL) is widely applied in recommender systems (RS) and has demonstrated effectiveness. This work analyzes SL from a pairwise perspective, revealing two significant limitations: 1) the relationship between SL and conventional ranking metrics like DCG is not sufficiently tight; 2) SL is highly sensitive to false negative instances. Our analysis indicates that these limitations are primarily due to the use of the exponential function. To address these issues, this work extends SL to a new family of loss functions, termed Pairwise Softmax Loss (PSL), which replaces the exponential function in SL with other appropriate activation functions. While the revision is minimal, we highlight three merits of PSL: 1) it serves as a tighter surrogate for DCG with suitable activation functions; 2) it better balances data contributions; and 3) it acts as a specific BPR loss enhanced by Distributionally Robust Optimization (DRO).

artificial intelligence, machine learning, psl, (19 more...)

arXiv:2411.00163

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
(2 more...)
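
The PSL abstract above says the exponential function in Softmax Loss is replaced with other activation functions. The sketch below illustrates that substitution on score differences d_j = s_j - s_pos between each negative item and the positive item. The exact loss form, the temperature handling, and the bounded activation are assumptions made for illustration; the abstract does not specify which activations the paper uses.

```python
import math

def pairwise_loss(pos_score, neg_scores, activation, tau=1.0):
    """log of the summed activations of the negative-minus-positive differences.

    With activation=math.exp this is a sampled softmax loss written in
    pairwise form; other activations give PSL-style variants (assumed form).
    """
    total = sum(activation(s - pos_score) ** (1.0 / tau) for s in neg_scores)
    return math.log(total)

def bounded_activation(d):
    """Hypothetical bounded surrogate with values in (0, 1)."""
    return 0.5 * (math.tanh(d) + 1.0)

# A negative scored far above the positive (for example, a false negative)
# dominates the exponential form but saturates under the bounded activation.
loss_exp = pairwise_loss(0.0, [10.0], math.exp)
loss_bounded = pairwise_loss(0.0, [10.0], bounded_activation)
```

The comparison at the end is meant to illustrate the abstract's point that the exponential makes Softmax Loss highly sensitive to false negative instances.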

Conditional Image Synthesis with Diffusion Models: A Survey

Zhan, Zheyuan, Chen, Defang, Mei, Jian-Ping, Zhao, Zhenghe, Chen, Jiawei, Chen, Chun, Lyu, Siwei, Wang, Can

arXiv.org Artificial Intelligence, Oct-3-2024

Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective approach to conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity of conditioning mechanisms make it difficult for researchers to keep up with rapid developments and understand the core concepts of this topic. In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process. We specifically highlight the underlying principles, advantages, and potential challenges of various conditioning approaches in the training, re-purposing, and specialization stages to construct a desired denoising network. We also summarize six mainstream conditioning mechanisms in the essential sampling process. All discussions are centered around popular applications. Finally, we pinpoint some critical yet still open problems to be solved in the future and suggest some possible solutions. Our reviewed works are itemized at https://github.com/zju-pi/Awesome-Conditional-Diffusion-Models.

diffusion model, large language model, machine learning, (22 more...)

arXiv:2409.19365

Country:

Asia (0.67)
North America > United States > New York > Erie County > Buffalo (0.14)

Genre: Overview (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Towards Dynamic Graph Neural Networks with Provably High-Order Expressive Power

Wang, Zhe, Zhao, Tianjian, Zhang, Zhen, Chen, Jiawei, Zhou, Sheng, Feng, Yan, Chen, Chun, Wang, Can

arXiv.org Artificial Intelligence, Oct-2-2024

Dynamic Graph Neural Networks (DyGNNs) have garnered increasing research attention for learning representations on evolving graphs. Despite their effectiveness, the limited expressive power of existing DyGNNs hinders them from capturing important evolving patterns of dynamic graphs. Although some works attempt to enhance expressive capability with heuristic features, there remains a lack of DyGNN frameworks with provable and quantifiable high-order expressive power. To address this research gap, we first propose the k-dimensional Dynamic WL tests (k-DWL) as reference algorithms to quantify the expressive power of DyGNNs. We demonstrate that the expressive power of existing DyGNNs is upper bounded by the 1-DWL test. To enhance the expressive power, we propose Dynamic Graph Neural Network with High-order expressive power (HopeDGN), which updates the representation of a central node pair by aggregating the interaction history with neighboring node pairs. Our theoretical results demonstrate that HopeDGN can achieve expressive power equivalent to the 2-DWL test. We then present a Transformer-based implementation for the local variant of HopeDGN. Experimental results show that HopeDGN achieves performance improvements of up to 3.12%, demonstrating its effectiveness.

artificial intelligence, expressive power, machine learning, (17 more...)

arXiv:2410.01367

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Motif-driven Subgraph Structure Learning for Graph Classification

Zhou, Zhiyao, Zhou, Sheng, Mao, Bochao, Chen, Jiawei, Sun, Qingyun, Feng, Yan, Chen, Chun, Wang, Can

arXiv.org Artificial Intelligence, Jun-13-2024

To mitigate the suboptimal nature of graph structure, Graph Structure Learning (GSL) has emerged as a promising approach to improve graph structure and boost performance in downstream tasks. Despite the proposal of numerous GSL methods, progress in this field has mostly concentrated on node-level tasks, while graph-level tasks (e.g., graph classification) remain largely unexplored. Notably, applying node-level GSL to graph classification is non-trivial due to the lack of fine-grained guidance for intricate structure learning. Inspired by the vital role of subgraphs in graph classification, in this paper we explore the potential of subgraph structure learning for graph classification by tackling the challenges of key subgraph selection and structure optimization. We propose a novel Motif-driven Subgraph Structure Learning method for Graph Classification (MOSGSL). Specifically, MOSGSL incorporates a subgraph structure learning module which can adaptively select important subgraphs. A motif-driven structure guidance module is further introduced to capture key subgraph-level structural patterns (motifs) and facilitate personalized structure learning. Extensive experiments demonstrate a significant and consistent improvement over baselines, as well as its flexibility and generalizability for various backbones and learning procedures.

artificial intelligence, machine learning, subgraph, (14 more...)

arXiv:2406.08897

Country:

North America > United States (0.28)
Asia > China > Zhejiang Province (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (1.00)

How Do Recommendation Models Amplify Popularity Bias? An Analysis from the Spectral Perspective

Lin, Siyi, Gao, Chongming, Chen, Jiawei, Zhou, Sheng, Hu, Binbin, Feng, Yan, Chen, Chun, Wang, Can

arXiv.org Artificial Intelligence, Jun-13-2024

Recommendation Systems (RS) are often plagued by popularity bias. When training a recommendation model on a typically long-tailed dataset, the model tends to not only inherit this bias but often exacerbate it, resulting in over-representation of popular items in the recommendation lists. This study conducts comprehensive empirical and theoretical analyses to expose the root causes of this phenomenon, yielding two core insights: 1) Item popularity is memorized in the principal spectrum of the score matrix predicted by the recommendation model; 2) The dimension collapse phenomenon amplifies the relative prominence of the principal spectrum, thereby intensifying the popularity bias. Building on these insights, we propose a novel debiasing strategy that leverages a spectral norm regularizer to penalize the magnitude of the principal singular value. We have developed an efficient algorithm to expedite the calculation of the spectral norm by exploiting the spectral property of the score matrix. Extensive experiments across seven real-world datasets and three testing paradigms have been conducted to validate the superiority of the proposed method.

artificial intelligence, machine learning, popularity bias, (16 more...)

arXiv:2404.12008

Country:

Asia > China (0.29)
Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
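
The popularity-bias abstract above proposes a spectral norm regularizer that penalizes the principal singular value of the predicted score matrix. The sketch below estimates that value with generic power iteration and adds it to a loss as a penalty. It is not the paper's efficient algorithm, which the abstract says exploits spectral properties of the score matrix; `spectral_norm`, `penalized_loss`, and `weight` are hypothetical names.

```python
import math

def spectral_norm(matrix, iters=100):
    """Estimate the largest singular value by power iteration on M^T M."""
    rows, cols = len(matrix), len(matrix[0])
    # Uniform start vector; assumed not orthogonal to the top singular vector.
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0  # start vector lies in the null space (e.g. zero matrix)
        v = [x / norm_w for x in w]
    # For a unit right singular vector v, ||M v|| is the singular value.
    u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))

def penalized_loss(base_loss, score_matrix, weight):
    """Add an assumed linear penalty on the principal singular value."""
    return base_loss + weight * spectral_norm(score_matrix)
```

Power iteration costs one pass over the matrix per step, which is fine for a small illustration but would be expensive on a full user-item score matrix.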

Knowledge Translation: A New Pathway for Model Compression

Sun, Wujie, Chen, Defang, Chen, Jiawei, Feng, Yan, Chen, Chun, Wang, Can

arXiv.org Artificial Intelligence, Jan-11-2024

Deep learning has witnessed significant advancements in recent years at the cost of increasing training, inference, and model storage overhead. While existing model compression methods strive to reduce the number of model parameters while maintaining high accuracy, they inevitably necessitate the re-training of the compressed model or impose architectural constraints. To overcome these limitations, this paper presents a novel framework, termed Knowledge Translation (KT), wherein a "translation" model is trained to receive the parameters of a larger model and generate compressed parameters. The concept of KT draws inspiration from language translation, which effectively employs neural networks to convert different languages, maintaining identical meaning. Accordingly, we explore the potential of neural networks to convert models of disparate sizes, while preserving their functionality. We propose a comprehensive framework for KT, introduce data augmentation strategies to enhance model performance despite restricted training data, and successfully demonstrate the feasibility of KT on the MNIST dataset. Code is available at https://github.com/zju-SWJ/KT.

artificial intelligence, machine learning, survey article, (17 more...)

arXiv:2401.05772

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
