AITopics | Zeng, Shenglai

Collaborating Authors

Zeng, Shenglai

Empowering GraphRAG with Knowledge Filtering and Integration

Guo, Kai, Shomer, Harry, Zeng, Shenglai, Han, Haoyu, Wang, Yu, Tang, Jiliang

arXiv.org Artificial Intelligence, Mar-17-2025

In recent years, large language models (LLMs) have revolutionized the field of natural language processing. However, they often suffer from knowledge gaps and hallucinations. Graph retrieval-augmented generation (GraphRAG) enhances LLM reasoning by integrating structured knowledge from external graphs. However, we identify two key challenges that plague GraphRAG: (1) retrieving noisy and irrelevant information can degrade performance, and (2) excessive reliance on external knowledge suppresses the model's intrinsic reasoning. To address these issues, we propose GraphRAG-FI (Filtering and Integration), consisting of GraphRAG-Filtering and GraphRAG-Integration. GraphRAG-Filtering employs a two-stage filtering mechanism to refine retrieved information. GraphRAG-Integration employs a logits-based selection strategy to balance external knowledge from GraphRAG with the LLM's intrinsic reasoning, reducing over-reliance on retrievals. Experiments on knowledge graph QA tasks demonstrate that GraphRAG-FI significantly improves reasoning performance across multiple backbone models, establishing a more reliable and effective GraphRAG framework.
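
As a rough illustration of what a logits-based selection could look like, the sketch below compares the softmax confidence of an LLM-only answer with that of a retrieval-augmented answer and keeps the more confident one. The function names, the shared candidate list, and the "keep the higher maximum probability" rule are assumptions made for illustration; the abstract does not specify the actual GraphRAG-FI procedure.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_answer(candidates, llm_logits, rag_logits):
    """Pick between the LLM-only and retrieval-augmented answers.

    Hypothetical rule: whichever source assigns the higher probability
    to its own top candidate wins. Ties go to the LLM-only answer.
    """
    llm_probs = softmax(llm_logits)
    rag_probs = softmax(rag_logits)
    llm_best = max(range(len(candidates)), key=llm_probs.__getitem__)
    rag_best = max(range(len(candidates)), key=rag_probs.__getitem__)
    if llm_probs[llm_best] >= rag_probs[rag_best]:
        return candidates[llm_best], "llm"
    return candidates[rag_best], "rag"


# Toy usage: the LLM is confident, the retrieved context is not.
answer, source = select_answer(
    ["Paris", "Lyon", "Nice"],
    llm_logits=[4.0, 0.0, 0.0],
    rag_logits=[1.0, 1.2, 1.0],
)
```

In this toy call the LLM-only distribution is sharply peaked while the retrieval-augmented one is nearly flat, so the intrinsic answer is kept.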

information, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.13804

Country: North America > United States > Colorado > Weld County > Greeley (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Towards Context-Robust LLMs: A Gated Representation Fine-tuning Approach

Zeng, Shenglai, He, Pengfei, Guo, Kai, Zheng, Tianqi, Lu, Hanqing, Xing, Yue, Liu, Hui

arXiv.org Artificial Intelligence, Feb-22-2025

Large Language Models (LLMs) enhanced with external contexts, such as through retrieval-augmented generation (RAG), often face challenges in handling imperfect evidence. They tend to over-rely on external knowledge, making them vulnerable to misleading and unhelpful contexts. To address this, we propose the concept of context-robust LLMs, which can effectively balance internal knowledge with external context, similar to human cognitive processes. Specifically, context-robust LLMs should rely on external context only when lacking internal knowledge, identify contradictions between internal and external knowledge, and disregard unhelpful contexts. To achieve this goal, we introduce Grft, a lightweight and plug-and-play gated representation fine-tuning approach. Grft consists of two key components: a gating mechanism to detect and filter problematic inputs, and low-rank representation adapters to adjust hidden representations. By training a lightweight intervention function with only 0.0004% of model size on fewer than 200 examples, Grft can effectively adapt LLMs towards context-robust behaviors.
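
To make the two components concrete, here is a minimal sketch of a gated low-rank update applied to a single hidden vector. It assumes a sigmoid gate computed from the hidden state and a down-projection followed by an up-projection; these choices, and all names below, are illustrative assumptions rather than the Grft formulation, which the abstract does not spell out.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_low_rank_update(hidden, gate_w, gate_b, down, up):
    """Return hidden + gate * up(down(hidden)).

    hidden: vector of size d
    gate_w, gate_b: hypothetical gate parameters (vector of size d, scalar)
    down: r x d matrix, up: d x r matrix, with r much smaller than d
    """
    gate = sigmoid(sum(w * x for w, x in zip(gate_w, hidden)) + gate_b)
    low_rank = matvec(down, hidden)   # project to rank-r space
    delta = matvec(up, low_rank)      # project back to size d
    return [x + gate * d for x, d in zip(hidden, delta)]


# Toy usage with d = 3 and r = 1.
h = [1.0, 2.0, 3.0]
down = [[1.0, 0.0, 0.0]]
up = [[1.0], [0.0], [0.0]]
adjusted = gated_low_rank_update(h, [0.0, 0.0, 0.0], 0.0, down, up)
```

When the gate is close to zero the hidden vector passes through almost unchanged, which is one way a gate could leave unproblematic inputs alone.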

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.141

Country:

Asia (1.00)
North America > United States (0.28)
Europe > France (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Unveiling Mode Connectivity in Graph Neural Networks

Li, Bingheng, Chen, Zhikai, Han, Haoyu, Zeng, Shenglai, Liu, Jingzhe, Tang, Jiliang

arXiv.org Artificial Intelligence, Feb-18-2025

A fundamental challenge in understanding graph neural networks (GNNs) lies in characterizing their optimization dynamics and loss landscape geometry, critical for improving interpretability and robustness. While mode connectivity, a lens for analyzing geometric properties of loss landscapes, has proven insightful for other deep learning architectures, its implications for GNNs remain unexplored. This work presents the first investigation of mode connectivity in GNNs. We uncover that GNNs exhibit distinct non-linear mode connectivity, diverging from patterns observed in fully-connected networks or CNNs. Crucially, we demonstrate that graph structure, rather than model architecture, dominates this behavior, with graph properties like homophily correlating with mode connectivity patterns. We further establish a link between mode connectivity and generalization, proposing a generalization bound based on loss barriers and revealing its utility as a diagnostic tool. Our findings further bridge theoretical insights with practical implications: they rationalize domain alignment strategies in graph learning and provide a foundation for refining GNN training paradigms.
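
For readers unfamiliar with loss barriers, the sketch below computes one common variant along a straight line between two parameter vectors: the largest amount by which the loss on the path exceeds the linear interpolation of the endpoint losses. This definition, the toy loss functions, and the function names are assumptions for illustration; the paper may define its barrier differently, and it reports non-linear connectivity for GNNs, which a straight path does not capture.

```python
def interpolate(theta_a, theta_b, t):
    """Point on the straight line from theta_a (t=0) to theta_b (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(theta_a, theta_b)]


def loss_barrier(loss, theta_a, theta_b, steps=21):
    """Largest excess of the path loss over the interpolated endpoint losses."""
    loss_a, loss_b = loss(theta_a), loss(theta_b)
    barrier = 0.0
    for i in range(steps):
        t = i / (steps - 1)
        path_loss = loss(interpolate(theta_a, theta_b, t))
        baseline = (1.0 - t) * loss_a + t * loss_b
        barrier = max(barrier, path_loss - baseline)
    return barrier


def double_well(theta):
    """Toy 1-D loss with minima at x = -1 and x = 1 and a bump at x = 0."""
    x = theta[0]
    return (x * x - 1.0) ** 2


def bowl(theta):
    """Toy convex loss; any two points are linearly connected."""
    return sum(x * x for x in theta)


barrier_between_wells = loss_barrier(double_well, [-1.0], [1.0])
barrier_in_bowl = loss_barrier(bowl, [-1.0], [1.0])
```

The two minima of the double-well loss are separated by a barrier, while the convex bowl has none; a barrier near zero is what linear mode connectivity would mean under this definition.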

artificial intelligence, machine learning, mode connectivity, (14 more...)

arXiv.org Artificial Intelligence

2502.12608

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Unveiling Privacy Risks in LLM Agent Memory

Wang, Bo, He, Weiyi, He, Pengfei, Zeng, Shenglai, Xiang, Zhen, Xing, Yue, Tang, Jiliang

arXiv.org Artificial Intelligence, Feb-17-2025

Large Language Model (LLM) agents have become increasingly prevalent across various real-world applications. They enhance decision-making by storing private user-agent interactions in the memory module for demonstrations, introducing new privacy risks for LLM agents. In this work, we systematically investigate the vulnerability of LLM agents to our proposed Memory EXTRaction Attack (MEXTRA) under a black-box setting. To extract private information from memory, we propose an effective attacking prompt design and an automated prompt generation method based on different levels of knowledge about the LLM agent. Experiments on two representative agents demonstrate the effectiveness of MEXTRA. Moreover, we explore key factors influencing memory leakage from both the agent's and the attacker's perspectives. Our findings highlight the urgent need for effective memory safeguards in LLM agent design and deployment.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2502.13172

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Towards Knowledge Checking in Retrieval-augmented Generation: A Representation Perspective

Zeng, Shenglai, Zhang, Jiankun, Li, Bingheng, Lin, Yuping, Zheng, Tianqi, Everaert, Dante, Lu, Hanqing, Liu, Hui, Liu, Hui, Xing, Yue, Cheng, Monica Xiao, Tang, Jiliang

arXiv.org Artificial Intelligence, Nov-21-2024

Retrieval-Augmented Generation (RAG) systems have shown promise in enhancing the performance of Large Language Models (LLMs). However, these systems face challenges in effectively integrating external knowledge with the LLM's internal knowledge, often leading to issues with misleading or unhelpful information. This work aims to provide a systematic study on knowledge checking in RAG systems. We conduct a comprehensive analysis of LLM representation behaviors and demonstrate the significance of using representations in knowledge checking. Motivated by the findings, we further develop representation-based classifiers for knowledge filtering. We show substantial improvements in RAG performance, even when dealing with noisy knowledge databases. Our study provides new insights into leveraging LLM representations for enhancing the reliability and effectiveness of RAG systems.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2411.14572

Country:

North America > Mexico (0.28)
North America > United States (0.28)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.48)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.96)

The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG)

Zeng, Shenglai, Zhang, Jiankun, He, Pengfei, Xing, Yue, Liu, Yiding, Xu, Han, Ren, Jie, Wang, Shuaiqiang, Yin, Dawei, Chang, Yi, Tang, Jiliang

arXiv.org Artificial Intelligence, Feb-23-2024

[...] 2023; Shi et al., 2023) is an advanced natural language processing technique that enhances text generation by integrating information retrieved from a large corpus of documents. These techniques enable RAG to produce accurate and contextually relevant outputs with augmented external knowledge and have been widely used in various scenarios such as domain-specific chatbots (Siriwardhana et al., 2023) and email/code completion (Parvez et al., 2021). RAG systems typically work in two phases, as shown in Fig 1 - retrieval and generation. [...] On the other hand, the retrieval process in RAG could also influence the behavior of the LLMs for text-generation, and this could possibly cause the LLMs to output private information from its training/fine-tuning dataset. Notably, there are existing works (Carlini et al., 2021; Kandpal et al., 2022; Lee et al., 2021; Carlini et al., 2022; Zeng et al., 2023) observing that LLMs can remember and leak private information from their pre-training and fine-tuning data. However, how the integration of external retrieval [...]

information, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2402.16893

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Copyright Protection in Generative AI: A Technical Perspective

Ren, Jie, Xu, Han, He, Pengfei, Cui, Yingqian, Zeng, Shenglai, Zhang, Jiankun, Wen, Hongzhi, Ding, Jiayuan, Liu, Hui, Chang, Yi, Tang, Jiliang

arXiv.org Artificial Intelligence, Feb-3-2024

Generative AI has witnessed rapid advancement in recent years, expanding its capabilities to create synthesized content such as text, images, audio, and code. The high fidelity and authenticity of content generated by these Deep Generative Models (DGMs) have sparked significant copyright concerns. There have been various legal debates on how to effectively safeguard copyrights in DGMs. This work delves into this issue by providing a comprehensive overview of copyright protection from a technical perspective. We examine it from two distinct viewpoints: the copyrights pertaining to the source data held by the data owners and those of the generative models maintained by the model builders. For data copyright, we delve into methods by which data owners can protect their content and by which DGMs can be utilized without infringing upon these rights. For model copyright, our discussion extends to strategies for preventing model theft and identifying outputs generated by specific models. Finally, we highlight the limitations of existing techniques and identify areas that remain unexplored. Furthermore, we discuss prospective directions for the future of copyright protection, underscoring its importance for the sustainable and ethical development of Generative AI.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2402.02333

Country:

North America > United States (0.14)
Europe > Netherlands (0.14)
Europe > Germany (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking

Li, Juanhui, Shomer, Harry, Mao, Haitao, Zeng, Shenglai, Ma, Yao, Shah, Neil, Tang, Jiliang, Yin, Dawei

arXiv.org Artificial Intelligence, Nov-18-2023

Link prediction attempts to predict whether an unseen edge exists based on only a portion of edges of a graph. A flurry of methods have been introduced in recent years that attempt to make use of graph neural networks (GNNs) for this task. Furthermore, new and diverse datasets have also been created to better evaluate the effectiveness of these new models. However, multiple pitfalls currently exist that hinder our ability to properly evaluate these new methods. These pitfalls mainly include: (1) Lower than actual performance on multiple baselines, (2) A lack of a unified data split and evaluation metric on some datasets, and (3) An unrealistic evaluation setting that uses easy negative samples. To overcome these challenges, we first conduct a fair comparison across prominent methods and datasets, utilizing the same dataset and hyperparameter search settings. We then create a more practical evaluation setting based on a Heuristic Related Sampling Technique (HeaRT), which samples hard negative samples via multiple heuristics. The new evaluation setting helps promote new challenges and opportunities in link prediction by aligning the evaluation with real-world situations. Our implementation and data are available at https://github.com/Juanhui28/HeaRT
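
The idea of sampling hard negatives through a heuristic can be sketched as follows. For a given source node, the example ranks all non-neighbors by their number of common neighbors and keeps the top k as negative edges. Using a single common-neighbors heuristic, the function names, and the toy graph are simplifying assumptions; HeaRT combines multiple heuristics, and its actual procedure is in the linked repository.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def hard_negatives(edges, source, k):
    """Return the k non-edges (source, w) with the most common neighbors.

    Ties are broken by node id so the output is deterministic.
    """
    adjacency = build_adjacency(edges)
    nodes = set(adjacency)
    candidates = nodes - adjacency[source] - {source}
    ranked = sorted(
        candidates,
        key=lambda w: (-len(adjacency[source] & adjacency[w]), w),
    )
    return [(source, w) for w in ranked[:k]]


# Toy graph: node 3 shares two neighbors with node 0 but is not linked to it.
toy_edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
negatives = hard_negatives(toy_edges, source=0, k=2)
```

A negative such as (0, 3) is harder to reject than a randomly drawn pair because the endpoints look structurally similar to a true edge.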

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2306.10453

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.68)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Exploring Memorization in Fine-tuned Language Models

Zeng, Shenglai, Li, Yaxin, Ren, Jie, Liu, Yiding, Xu, Han, He, Pengfei, Xing, Yue, Wang, Shuaiqiang, Tang, Jiliang, Yin, Dawei

arXiv.org Artificial Intelligence, Oct-10-2023

LLMs have shown great capabilities in various tasks but also exhibit memorization of training data, raising serious privacy and copyright concerns. While prior work has studied memorization during pre-training, the exploration of memorization during fine-tuning is rather limited. Compared with pre-training, fine-tuning typically involves sensitive data and diverse objectives, and thus may bring unique memorization behaviors and distinct privacy risks. In this work, we conduct the first comprehensive analysis to explore LMs' memorization during fine-tuning across tasks. Our studies with open-sourced and our own fine-tuned LMs across various tasks indicate that fine-tuned memorization presents a strong disparity among tasks. We provide an understanding of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution. Our investigation of its memorization behavior also suggests multi-task fine-tuning as a potential strategy to mitigate fine-tuned memorization.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2310.06714

Country:

North America > United States (1.00)
Africa > Middle East > Libya (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > North America Government > United States Government (0.93)
Law > Criminal Law (0.93)
Health & Medicine > Therapeutic Area (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

On the Generalization of Training-based ChatGPT Detection Methods

Xu, Han, Ren, Jie, He, Pengfei, Zeng, Shenglai, Cui, Yingqian, Liu, Amy, Liu, Hui, Tang, Jiliang

arXiv.org Artificial Intelligence, Oct-3-2023

ChatGPT is one of the most popular language models, achieving strong performance on various natural language tasks. Consequently, there is also an urgent need to distinguish texts generated by ChatGPT from human-written ones. One of the extensively studied methods trains classification models to distinguish the two. However, existing studies also demonstrate that the trained models may suffer from distribution shifts at test time, i.e., they are ineffective at detecting generated texts from unseen language tasks or topics. In this work, we aim to comprehensively investigate these methods' generalization behaviors under distribution shift caused by a wide range of factors, including prompts, text lengths, topics, and language tasks. To achieve this goal, we first collect a new dataset with human and ChatGPT texts, and then we conduct extensive studies on the collected dataset. Our studies unveil insightful findings which provide guidance for developing future methodologies or data collection strategies for ChatGPT detection.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2310.01307

Country: North America > United States > Michigan > Ingham County (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Leisure & Entertainment (1.00)
Banking & Finance (1.00)
Media > Film (0.94)
Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
