Xiong, Feiyu
MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System
Zhao, Jihao, Ji, Zhiyuan, Fan, Zhaoxin, Wang, Hanyu, Niu, Simin, Tang, Bo, Xiong, Feiyu, Li, Zhiyu
Retrieval-Augmented Generation (RAG), while serving as a viable complement to large language models (LLMs), often overlooks the crucial aspect of text chunking within its pipeline. This paper first introduces a dual-metric evaluation method, comprising Boundary Clarity and Chunk Stickiness, to enable the direct quantification of chunking quality. Leveraging this assessment method, we highlight the inherent limitations of traditional and semantic chunking in handling complex contextual nuances, thereby substantiating the necessity of integrating LLMs into the chunking process. To address the inherent trade-off between computational efficiency and chunking precision in LLM-based approaches, we devise the granularity-aware Mixture-of-Chunkers (MoC) framework, which consists of a three-stage processing mechanism. Notably, our objective is to guide the chunker towards generating a structured list of chunking regular expressions, which are subsequently employed to extract chunks from the original text. Extensive experiments demonstrate that both our proposed metrics and the MoC framework effectively address the challenges of the chunking task, revealing the chunking kernel while enhancing the performance of the RAG system.
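To make the regex-driven extraction step concrete, here is a minimal sketch rather than the authors' implementation: it assumes an MoC-style chunker has already emitted a list of boundary patterns for a document, and it simply cuts the text at the positions those patterns match.

```python
import re

def extract_chunks(text: str, chunk_patterns: list[str]) -> list[str]:
    """Split `text` into chunks using a list of regular expressions.

    Each pattern is assumed to match the opening span of a chunk (as a
    learned chunker might emit); chunk boundaries are the starting
    offsets of all matches, taken in document order.
    """
    starts = set()
    for pattern in chunk_patterns:
        for match in re.finditer(pattern, text, flags=re.MULTILINE):
            starts.add(match.start())
    boundaries = sorted(starts | {0, len(text)})
    chunks = [text[a:b].strip() for a, b in zip(boundaries, boundaries[1:])]
    return [c for c in chunks if c]

if __name__ == "__main__":
    doc = "1. Introduction\nRAG pipelines rely on chunking.\n2. Method\nWe propose a mixture of chunkers."
    # Hypothetical pattern a chunker might emit for a numbered report.
    patterns = [r"^\d+\.\s+[A-Z][^\n]*$"]
    for chunk in extract_chunks(doc, patterns):
        print(repr(chunk))
```

In the MoC framework itself the patterns come from a granularity-aware mixture of LLM chunkers; here they are hard-coded purely for illustration.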
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models
Liang, Xun, Wang, Hanyu, Lai, Huayi, Niu, Simin, Song, Shichao, Yang, Jiawei, Zhao, Jihao, Xiong, Feiyu, Tang, Bo, Li, Zhiyu
Large Language Models (LLMs) have achieved remarkable success across various natural language processing tasks, yet their high computational cost during inference remains a major bottleneck. This paper introduces Sparse Expert Activation Pruning (SEAP), a training-free pruning method that selectively retains task-relevant parameters to reduce inference overhead. Inspired by the clustering patterns of hidden states and activations in LLMs, SEAP identifies task-specific expert activation patterns and prunes the model while preserving task performance and enhancing computational efficiency. Experimental results demonstrate that SEAP significantly reduces computational overhead while maintaining competitive accuracy. Notably, at 50% pruning, SEAP surpasses both WandA and FLAP by over 20%, and at 20% pruning, it incurs only a 2.2% performance drop compared to the dense model. These findings highlight SEAP's scalability and effectiveness, making it a promising approach for optimizing large-scale LLMs.
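The activation-guided pruning idea can be illustrated with a rough sketch; the importance score (mean absolute activation on task data) and the thresholding rule below are assumptions for illustration, not the paper's exact criterion.

```python
import numpy as np

def task_activation_mask(activations: np.ndarray, sparsity: float) -> np.ndarray:
    """Build a binary keep-mask over hidden units from task activations.

    `activations` has shape (num_tokens, hidden_dim): hidden states
    collected while running task-specific prompts through one layer.
    Units with the lowest mean absolute activation are pruned.
    """
    scores = np.abs(activations).mean(axis=0)      # importance per hidden unit
    k = int(round(sparsity * scores.size))         # number of units to drop
    threshold = np.partition(scores, k)[k] if k > 0 else -np.inf
    return scores >= threshold                     # True = keep this unit

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.normal(size=(1024, 64)) * rng.uniform(0.1, 2.0, size=64)
    mask = task_activation_mask(acts, sparsity=0.5)
    print(f"kept {mask.sum()} of {mask.size} units")
```

Because the mask is derived purely from forward-pass statistics, no retraining is needed, which is what makes the approach training-free.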
SurveyX: Academic Survey Automation via Large Language Models
Liang, Xun, Yang, Jiawei, Wang, Yezhaohui, Tang, Chen, Zheng, Zifan, Song, Shichao, Lin, Zehao, Yang, Yebin, Niu, Simin, Wang, Hanyu, Tang, Bo, Xiong, Feiyu, Mao, Keming, Li, Zhiyu
Large Language Models (LLMs) have demonstrated exceptional comprehension capabilities and a vast knowledge base, suggesting that LLMs can serve as efficient tools for automated survey generation. However, recent research on automated survey generation remains constrained by critical limitations, such as a finite context window, a lack of in-depth content discussion, and the absence of systematic evaluation frameworks. Inspired by human writing processes, we propose SurveyX, an efficient and organized system for automated survey generation that decomposes the survey composition process into two phases: the Preparation and Generation phases. By innovatively introducing online reference retrieval, a pre-processing method called AttributeTree, and a re-polishing process, SurveyX significantly enhances the efficacy of survey composition. Experimental evaluation results show that SurveyX outperforms existing automated survey generation systems in content quality (0.259 improvement) and citation quality (1.76 enhancement), approaching human expert performance across multiple evaluation dimensions. Examples of surveys generated by SurveyX are available at www.surveyx.cn.
HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation
Liu, Hao, Wang, Zhengren, Chen, Xi, Li, Zhiyu, Xiong, Feiyu, Yu, Qinhan, Zhang, Wentao
Retrieval-Augmented Generation (RAG) systems often struggle with imperfect retrieval, as traditional retrievers focus on lexical or semantic similarity rather than logical relevance. To address this, we propose HopRAG, a novel RAG framework that augments retrieval with logical reasoning through graph-structured knowledge exploration. During indexing, HopRAG constructs a passage graph, with text chunks as vertices and logical connections established via LLM-generated pseudo-queries as edges. During retrieval, it employs a retrieve-reason-prune mechanism: starting with lexically or semantically similar passages, the system explores multi-hop neighbors guided by pseudo-queries and LLM reasoning to identify truly relevant ones. Extensive experiments demonstrate HopRAG's superiority, achieving 76.78% higher answer accuracy and 65.07% improved retrieval F1 score compared to conventional methods. The repository is available at https://github.com/LIU-Hao-2002/HopRAG.
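The retrieve-reason-prune mechanism can be sketched as a bounded traversal of the passage graph; the `relevant` callable below stands in for the LLM reasoning step, and the data structures are illustrative rather than HopRAG's actual interfaces.

```python
from collections import deque

def retrieve_reason_prune(seeds, graph, relevant, max_hops=2, budget=8):
    """Multi-hop exploration over a passage graph.

    graph    : {passage_id: [(neighbor_id, pseudo_query), ...]}
    seeds    : passage ids returned by a lexical/semantic retriever
    relevant : callable(pseudo_query) -> bool, standing in for the LLM
               reasoning that decides whether a hop is worth taking
    Returns the set of passages kept after pruning.
    """
    kept = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier and len(kept) < budget:
        node, hops = frontier.popleft()
        if hops >= max_hops:
            continue
        for neighbor, pseudo_query in graph.get(node, []):
            if neighbor not in kept and relevant(pseudo_query):
                kept.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return kept

if __name__ == "__main__":
    graph = {
        "p1": [("p2", "who founded the company?")],
        "p2": [("p3", "when was it acquired?")],
    }
    hits = retrieve_reason_prune(["p1"], graph,
                                 relevant=lambda q: "founded" in q or "acquired" in q)
    print(sorted(hits))  # ['p1', 'p2', 'p3']
```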
SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model
Liang, Xun, Niu, Simin, Li, Zhiyu, Zhang, Sensen, Wang, Hanyu, Xiong, Feiyu, Fan, Jason Zhaoxin, Tang, Bo, Song, Shichao, Wang, Mengwei, Yang, Jiawei
The indexing-retrieval-generation paradigm of retrieval-augmented generation (RAG) has been highly successful in solving knowledge-intensive tasks by integrating external knowledge into large language models (LLMs). However, the incorporation of external and unverified knowledge increases the vulnerability of LLMs because attackers can perform attack tasks by manipulating that knowledge. In this paper, we introduce a benchmark named SafeRAG designed to evaluate RAG security. First, we classify attack tasks into silver noise, inter-context conflict, soft ad, and white Denial-of-Service. Next, we construct a RAG security evaluation dataset (the SafeRAG dataset), primarily through manual annotation, for each task. We then utilize the SafeRAG dataset to simulate various attack scenarios that RAG may encounter. Experiments conducted on 14 representative RAG components demonstrate that RAG exhibits significant vulnerability to all attack tasks, and even the most apparent attack task can easily bypass existing retrievers, filters, or advanced LLMs, resulting in the degradation of RAG service quality. Code is available at: https://github.com/IAAR-Shanghai/SafeRAG.
GRAPHMOE: Amplifying Cognitive Depth of Mixture-of-Experts Network via Introducing Self-Rethinking Mechanism
Tang, Chen, Lv, Bo, Zheng, Zifan, Yang, Bohao, Zhao, Kun, Liao, Ning, Wang, Xiaoxing, Xiong, Feiyu, Li, Zhiyu, Liu, Nayu, Jiang, Jingchi
Traditional Mixture-of-Experts (MoE) networks benefit from utilizing multiple smaller expert models as opposed to a single large network. However, these experts typically operate independently, leaving open the question of whether interconnecting these models could enhance the performance of MoE networks. In response, we introduce GRAPHMOE, a novel method aimed at augmenting the cognitive depth of language models via a self-rethinking mechanism constructed on Pseudo GraphMoE networks. GRAPHMOE employs a recurrent routing strategy to simulate iterative thinking steps, thereby facilitating the flow of information among expert nodes. We implement the GRAPHMOE architecture using Low-Rank Adaptation (LoRA) techniques and conduct extensive experiments on various benchmark datasets. The experimental results reveal that GRAPHMOE outperforms other LoRA-based models, achieving state-of-the-art (SOTA) performance. Additionally, this study explores a novel recurrent routing strategy that may inspire further advancements in enhancing the reasoning capabilities of language models.
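The recurrent ("self-rethinking") routing idea can be shown with a toy dense-mixture layer in PyTorch; the real GRAPHMOE routes among LoRA experts over a pseudo-graph, which this sketch does not reproduce.

```python
import torch
import torch.nn as nn

class RecurrentRoutingMoE(nn.Module):
    """Toy MoE layer with recurrent routing: at each thinking step the
    router re-reads the current hidden state, mixes the expert outputs,
    and the mixture is fed back residually as the next step's input."""

    def __init__(self, dim: int, num_experts: int = 4, steps: int = 3):
        super().__init__()
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])
        self.router = nn.Linear(dim, num_experts)
        self.steps = steps

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        for _ in range(self.steps):
            gates = torch.softmax(self.router(h), dim=-1)              # (batch, num_experts)
            outs = torch.stack([e(h) for e in self.experts], dim=-1)   # (batch, dim, num_experts)
            h = h + torch.einsum("bdn,bn->bd", outs, gates)            # mix experts, feed back
        return h

if __name__ == "__main__":
    layer = RecurrentRoutingMoE(dim=16)
    print(layer(torch.randn(2, 16)).shape)  # torch.Size([2, 16])
```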
Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception
Zhao, Jihao, Ji, Zhiyuan, Feng, Yuchen, Qi, Pengnian, Niu, Simin, Tang, Bo, Xiong, Feiyu, Li, Zhiyu
Retrieval-Augmented Generation (RAG), while serving as a viable complement to large language models (LLMs), often overlooks the crucial aspect of text chunking within its pipeline, which impacts the quality of knowledge-intensive tasks. This paper introduces the concept of Meta-Chunking, which refers to a granularity between sentences and paragraphs, consisting of a collection of sentences within a paragraph that have deep linguistic logical connections. To implement Meta-Chunking, we design Perplexity (PPL) Chunking, which balances performance and speed, and precisely identifies the boundaries of text chunks by analyzing the characteristics of the context perplexity distribution. Additionally, considering the inherent complexity of different texts, we propose a strategy that combines PPL Chunking with dynamic merging to achieve a balance between fine-grained and coarse-grained text chunking. Experiments conducted on eleven datasets demonstrate that Meta-Chunking can more efficiently improve the performance of single-hop and multi-hop question answering based on RAG. For instance, on the 2Wiki-MultihopQA dataset, it outperforms similarity chunking by 1.32 while only consuming 45.8% of the time. Furthermore, through the analysis of models of various scales and types, we observe that PPL Chunking exhibits notable flexibility and adaptability.
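One plausible reading of PPL Chunking is that boundaries are placed at salient local minima of a sentence-level perplexity sequence; the sketch below assumes such a sequence has already been computed with some language model and is not a faithful reimplementation of the paper's algorithm.

```python
def ppl_chunk(sentences, ppls, threshold=0.0):
    """Group sentences into chunks at perplexity minima.

    `ppls[i]` is assumed to be the perplexity of sentences[i] conditioned
    on its preceding context (computed beforehand with any LM). A boundary
    is placed after sentence i when ppls[i] is lower than both neighbours
    by more than `threshold`.
    """
    if len(sentences) < 3:
        return [" ".join(sentences)]
    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences) - 1):
        current.append(sentences[i])
        is_local_min = (ppls[i] + threshold < ppls[i - 1]
                        and ppls[i] + threshold < ppls[i + 1])
        if is_local_min:
            chunks.append(" ".join(current))
            current = []
    current.append(sentences[-1])
    chunks.append(" ".join(current))
    return chunks

if __name__ == "__main__":
    sents = ["Topic A intro.", "A detail.", "Topic B intro.", "B detail.", "B more."]
    ppls = [18.0, 9.0, 21.0, 12.0, 11.0]
    print(ppl_chunk(sents, ppls))
```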
QAEncoder: Towards Aligned Representation Learning in Question Answering System
Wang, Zhengren, Yu, Qinhan, Wei, Shida, Li, Zhiyu, Xiong, Feiyu, Wang, Xiaoxing, Niu, Simin, Liang, Hao, Zhang, Wentao
Modern QA systems rely on retrieval-augmented generation (RAG) for accurate and trustworthy responses. However, the inherent gap between user queries and relevant documents hinders precise matching. Motivated by our conical distribution hypothesis, which posits that potential queries and documents form a cone-like structure in the embedding space, we introduce QAEncoder, a training-free approach to bridge this gap. Specifically, QAEncoder estimates the expectation of potential queries in the embedding space as a robust surrogate for the document embedding, and attaches document fingerprints to effectively distinguish these embeddings. Extensive experiments on fourteen embedding models across six languages and eight datasets validate QAEncoder's alignment capability, which offers a plug-and-play solution that seamlessly integrates with existing RAG architectures and training-based methods.
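The alignment step, replacing a document embedding with the fingerprinted mean of its pseudo-query embeddings, can be sketched as follows; the blending weight and normalization are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def qa_align_embedding(doc_vec, query_vecs, fingerprint_weight=0.3):
    """Estimate the expectation of potential query embeddings by averaging,
    then blend the original document vector back in as a lightweight
    "fingerprint" so documents with similar query distributions remain
    distinguishable."""
    query_center = np.mean(query_vecs, axis=0)
    mixed = (1 - fingerprint_weight) * query_center + fingerprint_weight * doc_vec
    return mixed / np.linalg.norm(mixed)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    doc = rng.normal(size=384)
    queries = doc + 0.5 * rng.normal(size=(8, 384))  # stand-ins for embedded pseudo-queries
    print(qa_align_embedding(doc, queries)[:4])
```

Because only the stored document vectors change, such a step plugs into an existing RAG index without retraining the embedding model.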
$\text{Memory}^3$: Language Modeling with Explicit Memory
Yang, Hongkang, Lin, Zehao, Wang, Wenjin, Wu, Hao, Li, Zhiyu, Tang, Bo, Wei, Wenqiang, Wang, Jinbo, Tang, Zeyun, Song, Shichao, Xi, Chenyang, Yu, Yu, Chen, Kai, Xiong, Feiyu, Tang, Linpeng, E, Weinan
The training and inference of large language models (LLMs) are together a costly process that transports knowledge from raw data to meaningful computation. Inspired by the memory hierarchy of the human brain, we reduce this cost by equipping LLMs with explicit memory, a memory format cheaper than model parameters and text retrieval-augmented generation (RAG). Conceptually, with most of its knowledge externalized to explicit memories, the LLM can enjoy a smaller parameter size, training cost, and inference cost, all proportional to the amount of remaining "abstract knowledge". As a preliminary proof of concept, we train from scratch a 2.4B LLM, which achieves better performance than much larger LLMs as well as RAG models, and maintains higher decoding speed than RAG. The model is named $\text{Memory}^3$, since explicit memory is the third form of memory in LLMs after implicit memory (model parameters) and working memory (context key-values). We introduce a memory circuitry theory to support the externalization of knowledge, and present novel techniques including a memory sparsification mechanism that makes storage tractable and a two-stage pretraining scheme that facilitates memory formation.
FastMem: Fast Memorization of Prompt Improves Context Awareness of Large Language Models
Zhu, Junyi, Liu, Shuochen, Yu, Yu, Tang, Bo, Yan, Yibo, Li, Zhiyu, Xiong, Feiyu, Xu, Tong, Blaschko, Matthew B.
Large language models (LLMs) excel at generating coherent text, but they often struggle with context awareness, leading to inaccuracies in tasks requiring faithful adherence to provided information. We introduce FastMem, a novel method designed to enhance instruction fine-tuned LLMs' context awareness through fast memorization of the prompt. FastMem maximizes the likelihood of the prompt before inference by fine-tuning only the last Feed-Forward Network (FFN) module. This targeted approach ensures efficient optimization without overfitting, significantly improving the model's ability to comprehend and accurately follow the context. Our experiments demonstrate substantial gains in reading comprehension, text summarization, and adherence to output structures. For instance, FastMem improves the accuracy of Llama 3-8B-Inst on the NQ-SWAP dataset from 59.1% to 71.6%, and reduces the output structure failure rate of Qwen 1.5-4B-Chat from 34.9% to 25.5%. Extensive experimental results highlight FastMem's potential to offer a robust solution to enhance the reliability and accuracy of LLMs in various applications. Our code is available at: https://github.com/IAAR-Shanghai/FastMem.
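A minimal sketch of the prompt-memorization step, assuming a LLaMA-style model from Hugging Face Transformers whose final decoder layer exposes an `.mlp` submodule; the step count and learning rate are placeholders, not the paper's settings.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def fast_memorize(model, tokenizer, prompt, steps=3, lr=1e-4):
    """Briefly maximize the prompt's likelihood by updating only the last
    FFN (MLP) block before answering; all other parameters stay frozen."""
    for p in model.parameters():
        p.requires_grad_(False)
    ffn_params = list(model.model.layers[-1].mlp.parameters())  # LLaMA-style layout
    for p in ffn_params:
        p.requires_grad_(True)

    optimizer = torch.optim.AdamW(ffn_params, lr=lr)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    model.train()
    for _ in range(steps):
        loss = model(**inputs, labels=inputs["input_ids"]).loss  # next-token NLL on the prompt
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    model.eval()
    return model

# Usage (model name is illustrative):
# model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
# fast_memorize(model, tokenizer, "Context: ...\nQuestion: ...")
```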