AITopics | Chen, Zhumin

Chen, Zhumin

KnowTuning: Knowledge-aware Fine-tuning for Large Language Models

Lyu, Yougang, Yan, Lingyong, Wang, Shuaiqiang, Shi, Haibo, Yin, Dawei, Ren, Pengjie, Chen, Zhumin, de Rijke, Maarten, Ren, Zhaochun

arXiv.org Artificial Intelligence, Apr-17-2024

Despite their success at many natural language processing (NLP) tasks, large language models (LLMs) still struggle to effectively leverage knowledge for knowledge-intensive tasks, manifesting limitations such as generating incomplete, non-factual, or illogical answers. These limitations stem from inadequate knowledge awareness of LLMs during vanilla fine-tuning. To address these problems, we propose a knowledge-aware fine-tuning (KnowTuning) method to improve the fine-grained and coarse-grained knowledge awareness of LLMs. We devise a fine-grained knowledge augmentation stage to train LLMs to identify difficult fine-grained knowledge in answers. We also propose a coarse-grained knowledge comparison stage to train LLMs to distinguish between reliable and unreliable knowledge in three aspects: completeness, factuality, and logicality. Extensive experiments on both generic and medical question answering (QA) datasets confirm the effectiveness of KnowTuning, through automatic and human evaluations, across various sizes of LLMs. We further verify that KnowTuning generates more facts with a lower factual error rate under fine-grained facts evaluation.

large language model, machine learning, natural language

arXiv: 2402.11176

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.67)

Genre: Research Report > New Finding (0.93)

Industry:

Government > Regional Government > North America Government > United States Government (0.94)
Media > Film (0.93)
Leisure & Entertainment > Sports > Football (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Offline Trajectory Generalization for Offline Reinforcement Learning

Zhao, Ziqi, Ren, Zhaochun, Yang, Liu, Yuan, Fajie, Ren, Pengjie, Chen, Zhumin, Ma, Jun, Xin, Xin

arXiv.org Artificial Intelligence, Apr-16-2024

Offline reinforcement learning (RL) aims to learn policies from static datasets of previously collected trajectories. Existing methods for offline RL either constrain the learned policy to the support of offline data or utilize model-based virtual environments to generate simulated rollouts. However, these methods suffer from (i) poor generalization to unseen states; and (ii) trivial improvement from low-quality rollout simulation. In this paper, we propose offline trajectory generalization through world transformers for offline reinforcement learning (OTTO). Specifically, we use causal Transformers, a.k.a. World Transformers, to predict state dynamics and the immediate reward. Then we propose four strategies to use World Transformers to generate high-reward simulated trajectories by perturbing the offline data. Finally, we jointly use offline data with simulated data to train an offline RL algorithm. OTTO serves as a plug-in module and can be integrated with existing offline RL methods to enhance them with the better generalization capability of Transformers and high-reward data augmentation. Through extensive experiments on D4RL benchmark datasets, we verify that OTTO significantly outperforms state-of-the-art offline RL methods.
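
The perturb-then-simulate idea in this abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `toy_world_model` is a hand-written stand-in for a learned world model, perturbing actions with uniform noise is one hypothetical strategy (the abstract does not describe the four strategies), and the keep-if-better filter is a simplification.

```python
import random

def toy_world_model(state, action):
    """Hypothetical stand-in for a learned world model: returns (next_state, reward)."""
    next_state = state + action
    reward = -abs(next_state)  # toy reward: highest when the state is near 0
    return next_state, reward

def rollout(start_state, actions, model):
    """Simulate a trajectory of (state, action, reward, next_state) steps."""
    state, total, traj = start_state, 0.0, []
    for action in actions:
        next_state, reward = model(state, action)
        traj.append((state, action, reward, next_state))
        state, total = next_state, total + reward
    return traj, total

def augment(offline_actions, start_state, model, n_candidates=20, noise=0.5, seed=0):
    """Perturb an offline action sequence and keep simulated rollouts that
    achieve a higher return than the unperturbed sequence."""
    rng = random.Random(seed)
    _, base_return = rollout(start_state, offline_actions, model)
    kept = []
    for _ in range(n_candidates):
        perturbed = [a + rng.uniform(-noise, noise) for a in offline_actions]
        traj, ret = rollout(start_state, perturbed, model)
        if ret > base_return:
            kept.append(traj)
    return kept, base_return

kept, base_return = augment([1.0, 1.0], 0.0, toy_world_model)
```

In this sketch the trajectories in `kept` would be merged with the original offline data before training any offline RL algorithm.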

artificial intelligence, machine learning, reinforcement learning

arXiv: 2404.10393

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Generative Retrieval as Multi-Vector Dense Retrieval

Wu, Shiguang, Wei, Wenda, Zhang, Mengqi, Chen, Zhumin, Ma, Jun, Ren, Zhaochun, de Rijke, Maarten, Ren, Pengjie

arXiv.org Artificial Intelligence, Mar-31-2024

Generative retrieval generates identifiers of relevant documents in an end-to-end manner using a sequence-to-sequence architecture for a given query. The relation between generative retrieval and other retrieval methods, especially those based on matching within dense retrieval models, is not yet fully understood. Prior work has demonstrated that generative retrieval with atomic identifiers is equivalent to single-vector dense retrieval. Accordingly, generative retrieval exhibits behavior analogous to hierarchical search within a tree index in dense retrieval when using hierarchical semantic identifiers. However, prior work focuses solely on the retrieval stage without considering the deep interactions within the decoder of generative retrieval. In this paper, we fill this gap by demonstrating that generative retrieval and multi-vector dense retrieval share the same framework for measuring the relevance of a document to a query. Specifically, we examine the attention layer and prediction head of generative retrieval, revealing that generative retrieval can be understood as a special case of multi-vector dense retrieval. Both methods compute relevance as a sum of products of query and document vectors and an alignment matrix. We then explore how generative retrieval applies this framework, employing distinct strategies for computing document token vectors and the alignment matrix. We have conducted experiments to verify our conclusions and show that both paradigms exhibit commonalities of term matching in their alignment matrices.
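
The shared relevance form mentioned in this abstract, a sum of products of query vectors, document vectors, and an alignment matrix, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names and toy vectors are hypothetical, and the max-similarity alignment (each query token aligned to its best-matching document token, as in late-interaction retrievers) is an assumed example of one possible alignment matrix.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def relevance(Q, D, A):
    """Relevance as sum over i, j of A[i][j] * <Q[i], D[j]>.

    Q: list of query token vectors, D: list of document token vectors,
    A: alignment matrix with one row per query token and one column per
    document token.
    """
    return sum(A[i][j] * dot(q, d)
               for i, q in enumerate(Q)
               for j, d in enumerate(D))

def maxsim_alignment(Q, D):
    """Assumed alignment: a one-hot row per query token, selecting the
    document token with the highest inner product."""
    A = []
    for q in Q:
        sims = [dot(q, d) for d in D]
        best = sims.index(max(sims))
        A.append([1.0 if j == best else 0.0 for j in range(len(D))])
    return A

# Toy example with two query tokens and three document tokens.
Q = [[1.0, 0.0], [0.0, 1.0]]
D = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
A = maxsim_alignment(Q, D)
score = relevance(Q, D, A)
```

Swapping in a different `A` (for example, a dense matrix of attention weights) changes the retrieval behavior without changing the form of the relevance function.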

alignment matrix, information retrieval, machine learning

arXiv: 2404.00684

Country:

Europe (1.00)
Asia > China > Shandong Province (0.14)
North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.88)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)

Knowledge Graph Enhanced Large Language Model Editing

Zhang, Mengqi, Ye, Xiaotian, Liu, Qiang, Ren, Pengjie, Wu, Shu, Chen, Zhumin

arXiv.org Artificial Intelligence, Feb-21-2024

Large language models (LLMs) are pivotal in advancing natural language processing (NLP) tasks, yet their efficacy is hampered by inaccuracies and outdated knowledge. Model editing emerges as a promising solution to address these challenges. However, existing editing methods struggle to track and incorporate changes in knowledge associated with edits, which limits the generalization ability of post-edit LLMs in processing edited knowledge. To tackle these problems, we propose a novel model editing method that leverages knowledge graphs for enhancing LLM editing, namely GLAME. Specifically, we first utilize a knowledge graph augmentation module to uncover associated knowledge that has changed due to editing, obtaining its internal representations within LLMs. This approach allows knowledge alterations within LLMs to be reflected through an external graph structure. Subsequently, we design a graph-based knowledge edit module to integrate structured knowledge into the model editing. This ensures that the updated parameters reflect not only the modifications of the edited knowledge but also the changes in other associated knowledge resulting from the editing process. Comprehensive experiments conducted on GPT-J and GPT-2 XL demonstrate that GLAME significantly improves the generalization capabilities of post-edit LLMs in employing edited knowledge.

large language model, machine learning, natural language

arXiv: 2402.13593

Country: North America > United States (0.28)

Genre: Research Report > Promising Solution (0.68)

Industry: Leisure & Entertainment > Sports (0.95)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zero-Shot Position Debiasing for Large Language Models

Liu, Zhongkun, Chen, Zheng, Zhang, Mengqi, Ren, Zhaochun, Chen, Zhumin, Ren, Pengjie

arXiv.org Artificial Intelligence, Jan-2-2024

Fine-tuning has been demonstrated to be an effective method to improve the domain performance of large language models (LLMs). However, LLMs might fit dataset biases and shortcuts for prediction, leading to poor generation performance. Experimental results show that LLMs are prone to exhibit position bias, i.e., leveraging information positioned at the beginning or end, or specific positional cues within the input. Existing work on mitigating position bias requires external bias knowledge or annotated non-biased samples, which is impractical in reality. In this work, we propose a zero-shot position debiasing (ZOE) framework to mitigate position bias for LLMs. ZOE leverages unsupervised responses from pre-trained LLMs for debiasing and thus requires no external knowledge or datasets. To improve the quality of unsupervised responses, we propose a master-slave alignment (MSA) module to prune these responses. Experiments on eight datasets and five tasks show that ZOE consistently outperforms existing methods in mitigating four types of position biases. Moreover, ZOE achieves this while sacrificing only a small amount of performance on biased samples, and it is simple and effective.

large language model, machine learning, natural language

arXiv: 2401.01218

Country: Asia (0.46)

Genre: Research Report (0.70)

Industry: Leisure & Entertainment > Sports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

On the Effectiveness of Unlearning in Session-Based Recommendation

Xin, Xin, Yang, Liu, Zhao, Ziqi, Ren, Pengjie, Chen, Zhumin, Ma, Jun, Ren, Zhaochun

arXiv.org Artificial Intelligence, Dec-22-2023

Session-based recommendation predicts users' future interests from previous interactions in a session. Although models memorize historical samples, requests for unlearning, i.e., removing the effect of certain training samples, also arise for reasons such as user privacy or model fidelity. However, existing studies on unlearning are not tailored to session-based recommendation. On the one hand, these approaches cannot achieve satisfactory unlearning effects due to the collaborative correlations and sequential connections between the unlearning item and the remaining items in the session. On the other hand, little work has verified unlearning effectiveness in the session-based recommendation scenario. In this paper, we propose SRU, a session-based recommendation unlearning framework, which enables high unlearning efficiency, accurate recommendation performance, and improved unlearning effectiveness in session-based recommendation. Specifically, we first partition the training sessions into separate sub-models according to the similarity across the sessions; then we utilize an attention-based aggregation layer to fuse the hidden states according to the correlations between the session and the centroid of the data in the sub-model. To improve the unlearning effectiveness, we further propose three extra data deletion strategies: collaborative extra deletion (CED), neighbor extra deletion (NED), and random extra deletion (RED). In addition, we propose an evaluation metric that measures whether the unlearning sample can be inferred after the data deletion, in order to verify the unlearning effectiveness. We implement SRU with three representative session-based recommendation models and conduct experiments on three benchmark datasets. Experimental results demonstrate the effectiveness of our methods.
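
The partition-and-aggregate structure described in this abstract can be sketched with toy data. This is an illustration under stated assumptions, not the SRU implementation: sessions are represented as hypothetical fixed-length vectors, shards are assigned by nearest centroid, the attention weights are a softmax over inner products with the centroids, and `unlearn` only reports which shard would need retraining.

```python
import math

def nearest(vec, centroids):
    """Index of the centroid closest to vec (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]
    return dists.index(min(dists))

def partition(sessions, centroids):
    """Assign each session id to the shard of its nearest centroid."""
    shards = [[] for _ in centroids]
    for sid, vec in sessions.items():
        shards[nearest(vec, centroids)].append(sid)
    return shards

def attention_weights(vec, centroids):
    """Softmax over session-centroid inner products; these weights would
    be used to fuse the outputs of the per-shard sub-models."""
    scores = [sum(a * b for a, b in zip(vec, c)) for c in centroids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def unlearn(shards, sid):
    """Remove a session and return the index of the only shard whose
    sub-model would have to be retrained."""
    for k, shard in enumerate(shards):
        if sid in shard:
            shard.remove(sid)
            return k
    raise KeyError(sid)

# Hypothetical session embeddings and shard centroids.
sessions = {"s1": [1.0, 0.0], "s2": [0.9, 0.1], "s3": [0.0, 1.0]}
centroids = [[1.0, 0.0], [0.0, 1.0]]
shards = partition(sessions, centroids)
weights = attention_weights(sessions["s1"], centroids)
retrain_shard = unlearn(shards, "s2")
```

Because only one shard is retrained per deletion request, the cost of unlearning scales with the shard size rather than with the full training set.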

artificial intelligence, information management, machine learning

doi: 10.1145/3616855.3635823

arXiv: 2312.14447

Country:

Europe (0.68)
North America > United States (0.47)
Asia > China > Shandong Province (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.93)
Confucius: Iterative Tool Learning from Introspection Feedback by Easy-to-Difficult Curriculum

Gao, Shen, Shi, Zhengliang, Zhu, Minghang, Fang, Bowen, Xin, Xin, Ren, Pengjie, Chen, Zhumin, Ma, Jun, Ren, Zhaochun

arXiv.org Artificial Intelligence, Dec-21-2023

Augmenting large language models (LLMs) with external tools has emerged as a promising approach to extending the capability of LLMs. Although some works employ open-source LLMs for the tool learning task, most of them are trained in a controlled environment in which LLMs only learn to execute the human-provided tools. However, selecting proper tools from a large toolset is also a crucial ability for a tool learning model to be applied in real-world applications. Existing methods usually employ self-instruction methods directly to train the model, which ignores differences in tool complexity. In this paper, we propose Confucius, a novel tool learning framework to train LLMs to use complicated tools in real-world scenarios, which contains two main phases: (1) we first propose a multi-stage learning method to teach the LLM to use various tools through an easy-to-difficult curriculum; (2) we then propose Iterative Self-instruct from Introspective Feedback (ISIF) to dynamically construct the dataset to improve the ability to use complicated tools. Extensive experiments conducted in both controlled and real-world settings demonstrate the superiority of our tool learning framework in real-world application scenarios compared to both tuning-free (e.g., ChatGPT, Claude) and tuning-based baselines (e.g., GPT4Tools).

large language model, machine learning, natural language

arXiv: 2308.14034

Country:

Europe (0.47)
Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Multi-Defendant Legal Judgment Prediction via Hierarchical Reasoning

Lyu, Yougang, Hao, Jitai, Wang, Zihan, Zhao, Kai, Gao, Shen, Ren, Pengjie, Chen, Zhumin, Wang, Fang, Ren, Zhaochun

arXiv.org Artificial Intelligence, Dec-9-2023

Multiple defendants in a criminal fact description generally exhibit complex interactions, and cannot be well handled by existing Legal Judgment Prediction (LJP) methods which focus on predicting judgment results (e.g., law articles, charges, and terms of penalty) for single-defendant cases. To address this problem, we propose the task of multi-defendant LJP, which aims to automatically predict the judgment results for each defendant of multi-defendant cases. Two challenges arise with the task of multi-defendant LJP: (1) indistinguishable judgment results among various defendants; and (2) the lack of a real-world dataset for training and evaluation. To tackle the first challenge, we formalize the multi-defendant judgment process as hierarchical reasoning chains and introduce a multi-defendant LJP method, named Hierarchical Reasoning Network (HRN), which follows the hierarchical reasoning chains to determine criminal relationships, sentencing circumstances, law articles, charges, and terms of penalty for each defendant. To tackle the second challenge, we collect a real-world multi-defendant LJP dataset, namely MultiLJP, to accelerate the relevant research in the future. Extensive experiments on MultiLJP verify the effectiveness of our proposed HRN.

artificial intelligence, machine learning, natural language

arXiv: 2312.05762

Country:

North America > United States (0.28)
Asia > China > Shandong Province (0.14)

Genre: Research Report > New Finding (0.68)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Instruction Distillation Makes Large Language Models Efficient Zero-shot Rankers

Sun, Weiwei, Chen, Zheng, Ma, Xinyu, Yan, Lingyong, Wang, Shuaiqiang, Ren, Pengjie, Chen, Zhumin, Yin, Dawei, Ren, Zhaochun

arXiv.org Artificial Intelligence, Nov-2-2023

Recent studies have demonstrated the great potential of Large Language Models (LLMs) serving as zero-shot relevance rankers. The typical approach involves making comparisons between pairs or lists of documents. Although effective, these listwise and pairwise methods are inefficient and rely heavily on intricate prompt engineering. To tackle this problem, we introduce a novel instruction distillation method. The key idea is to distill the pairwise ranking ability of open-source LLMs into a simpler but more efficient pointwise ranking. Specifically, given the same LLM, we first rank documents using the effective pairwise approach with complex instructions, and then distill the teacher predictions to the pointwise approach with simpler instructions. Evaluation results on the BEIR, TREC, and ReDial datasets demonstrate that instruction distillation can improve efficiency by 10 to 100x and also enhance the ranking performance of LLMs. Furthermore, our approach surpasses the performance of existing supervised methods like monoT5 and is on par with the state-of-the-art zero-shot methods. The code to reproduce our results is available at www.github.com/sunnweiwei/RankGPT.
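
The pairwise-teacher, pointwise-student setup described in this abstract can be sketched as follows. This is a toy illustration, not the code from the repository above: the `prefer` callback stands in for a pairwise LLM comparison, aggregating comparisons by win count is an assumption, and the RankNet-style loss is an assumed choice of distillation objective that the abstract does not specify.

```python
import math
from itertools import combinations

def teacher_ranking(docs, prefer):
    """Rank documents by number of pairwise wins.

    prefer(a, b) returns True if the teacher judges a more relevant than b.
    This needs n * (n - 1) / 2 teacher comparisons for n documents.
    """
    wins = {d: 0 for d in docs}
    for a, b in combinations(docs, 2):
        if prefer(a, b):
            wins[a] += 1
        else:
            wins[b] += 1
    return sorted(docs, key=lambda d: wins[d], reverse=True)

def distillation_loss(student_scores, ranking):
    """RankNet-style loss: penalizes pointwise student scores whose order
    disagrees with the teacher ranking."""
    loss = 0.0
    for i, j in combinations(range(len(ranking)), 2):
        higher, lower = ranking[i], ranking[j]
        margin = student_scores[higher] - student_scores[lower]
        loss += math.log(1.0 + math.exp(-margin))
    return loss

# Hypothetical teacher preferences derived from toy relevance values.
toy_relevance = {"d1": 0.2, "d2": 0.9, "d3": 0.5}
ranking = teacher_ranking(list(toy_relevance),
                          lambda a, b: toy_relevance[a] > toy_relevance[b])

agreeing = distillation_loss({"d2": 2.0, "d3": 1.0, "d1": 0.0}, ranking)
disagreeing = distillation_loss({"d2": 0.0, "d3": 1.0, "d1": 2.0}, ranking)
```

At inference time the student scores each document once, so ranking n documents takes n model calls instead of the quadratic number of comparisons the pairwise teacher needs.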

large language model, machine learning, natural language

arXiv: 2311.01555

Country:

Europe (0.28)
Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Metaphorical User Simulators for Evaluating Task-oriented Dialogue Systems

Sun, Weiwei, Guo, Shuyu, Zhang, Shuo, Ren, Pengjie, Chen, Zhumin, de Rijke, Maarten, Ren, Zhaochun

arXiv.org Artificial Intelligence, Nov-2-2023

Task-oriented dialogue systems (TDSs) are assessed mainly in an offline setting or through human evaluation. The evaluation is often limited to a single turn or is very time-intensive. As an alternative, user simulators that mimic user behavior allow us to consider a broad set of user goals to generate human-like conversations for simulated evaluation. Employing existing user simulators to evaluate TDSs is challenging, as user simulators are primarily designed to optimize dialogue policies for TDSs and have limited evaluation capabilities. Moreover, the evaluation of user simulators is an open challenge. In this work, we propose a metaphorical user simulator for end-to-end TDS evaluation, where we define a simulator to be metaphorical if it simulates a user's analogical thinking in interactions with systems. We also propose a tester-based evaluation framework to generate variants, i.e., dialogue systems with different capabilities. Our user simulator constructs a metaphorical user model that assists the simulator in reasoning by referring to prior knowledge when encountering new items. We estimate the quality of simulators by checking the simulated interactions between simulators and variants. Our experiments are conducted using three TDS datasets. The proposed user simulator demonstrates better consistency with manual evaluation than an agenda-based simulator and a seq2seq model on three datasets; our tester framework demonstrates efficiency and has been tested on multiple tasks, such as conversational recommendation and e-commerce dialogues.

information retrieval, machine learning, natural language

arXiv: 2204.00763

Country: Europe > Netherlands (0.28)

Genre: Research Report (1.00)

Industry: Consumer Products & Services > Restaurants (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.94)
Information Technology > Information Management (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.94)