 knowledge selection


609c5e5089a9aa967232aba2a4d03114-AuthorFeedback.pdf

Neural Information Processing Systems

Effect of UniLM: We observe an obvious performance drop when using fine-tuned UniLM with fixed top-1 retrieved knowledge (the -parameterized posterior in Table 3, aligned with the columns of Table 1). Our model performs implicit knowledge selection on the input K knowledge sentences (concatenated in a sequence) in an end-to-end way like DRD [52]. F1 on the validation set increases until the number of knowledge sentences reaches 10, but stays stable when the number increases from 10 to 30. At test time, the knowledge selection module mainly serves to shorten the input sequence of knowledge candidates, so the performance drop is not significant.
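The mechanism described above, concatenating the K knowledge sentences into one input sequence and using the selector mainly to shorten that sequence at test time, can be sketched as follows. The `[KNOW]`/`[CTX]` separator tokens and all function names are illustrative assumptions, not the paper's actual input format:

```python
def build_input(context, knowledge, selector_scores=None, top_k=10, max_tokens=512):
    """Concatenate knowledge sentences with the dialogue context into one
    input sequence; when selector scores are given, keep only the top-k
    sentences, which shortens the sequence fed to the generator."""
    if selector_scores is not None:
        order = sorted(range(len(knowledge)),
                       key=lambda i: selector_scores[i], reverse=True)
        knowledge = [knowledge[i] for i in order[:top_k]]
    seq = " [KNOW] ".join(knowledge) + " [CTX] " + context
    return seq.split()[:max_tokens]  # truncate to the model's length budget
```

With a selector, only the highest-scoring knowledge sentences survive into the sequence, so the generator sees a shorter input but the same end-to-end objective.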



Review for NeurIPS paper: Zero-Resource Knowledge-Grounded Dialogue Generation

Neural Information Processing Systems

Weaknesses: - It is hard to judge whether the proposed method achieves good results because of the proposed learning method or because of the strong pretrained UniLM model. Even though the authors compare with DialoGPT in the appendix, I would also like to see the model's performance without UniLM initialization, or DialoGPT fine-tuned on the proposed dataset (e.g., Reddit conversations with top-1 retrieved knowledge). How do you select knowledge for ITDD? (ii) All the examples and details of the human evaluation say that the authors use ground-truth knowledge. Do all the models use GT knowledge at test time, or do they use the top-10 knowledge retrieved by the Lucene knowledge retriever? If the latter, the reported performance of some baselines would need to be revised.


A Systematic Investigation of Knowledge Retrieval and Selection for Retrieval Augmented Generation

Li, Xiangci, Ouyang, Jessica

arXiv.org Artificial Intelligence

Retrieval-augmented generation (RAG) has emerged as a powerful method for enhancing natural language generation by integrating external knowledge into a model's output. While prior work has demonstrated the importance of improving knowledge retrieval for boosting generation quality, the role of knowledge selection remains less clear. In this paper, we perform a comprehensive analysis of how knowledge retrieval and selection influence downstream generation performance in RAG systems. By simulating different retrieval and selection conditions through a controlled mixture of gold and distractor knowledge, we assess the impact of these factors on generation outcomes. Our findings indicate that the downstream generator model's capability, as well as the complexity of the task and dataset, significantly influence the impact of knowledge retrieval and selection on the overall RAG system performance. In typical scenarios, improving the knowledge recall score is key to enhancing generation outcomes, with the knowledge selector providing a limited additional benefit when a strong generator model is used on clear, well-defined tasks. For weaker generator models or more ambiguous tasks and datasets, the knowledge F1 score becomes a critical factor, and the knowledge selector plays a more prominent role in improving overall performance.
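The controlled-mixture setup the abstract describes can be approximated with a small simulation: keep a chosen fraction of the gold knowledge, pad the pool with distractors, and measure knowledge precision/recall/F1. Function names and the exact mixing policy here are assumptions, not the paper's code:

```python
import random

def simulate_retrieval(gold, distractors, recall, k, seed=0):
    """Build a retrieved-knowledge pool with a target gold recall:
    keep round(len(gold) * recall) gold sentences, pad with distractors
    up to k candidates, and shuffle."""
    rng = random.Random(seed)
    n_gold = round(len(gold) * recall)
    pool = rng.sample(gold, n_gold) + rng.sample(distractors, k - n_gold)
    rng.shuffle(pool)
    return pool

def knowledge_scores(retrieved, gold):
    """Precision, recall, and F1 of a retrieved pool against gold knowledge."""
    hits = len(set(retrieved) & set(gold))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

Sweeping `recall` and `k` while holding the generator fixed reproduces, in miniature, the kind of controlled study of retrieval and selection conditions the paper performs.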


A Knowledge Plug-and-Play Test Bed for Open-domain Dialogue Generation

Li, Xiangci, Song, Linfeng, Jin, Lifeng, Mi, Haitao, Ouyang, Jessica, Yu, Dong

arXiv.org Artificial Intelligence

Knowledge-based, open-domain dialogue generation aims to build chit-chat systems that talk to humans using mined support knowledge. Many types and sources of knowledge have previously been shown to be useful as support knowledge. Even in the era of large language models, response generation grounded in knowledge retrieved from additional up-to-date sources remains a practically important approach. While prior work using single-source knowledge has shown a clear positive correlation between the performances of knowledge selection and response generation, there are no existing multi-source datasets for evaluating support knowledge retrieval. Further, prior work has assumed that the knowledge sources available at test time are the same as during training. This unrealistic assumption unnecessarily handicaps models, as new knowledge sources can become available after a model is trained. In this paper, we present a high-quality benchmark named multi-source Wizard of Wikipedia (Ms.WoW) for evaluating multi-source dialogue knowledge selection and response generation. Unlike existing datasets, it contains clean support knowledge, grounded at the utterance level and partitioned into multiple knowledge sources. We further propose a new challenge, dialogue knowledge plug-and-play, which aims to test an already trained dialogue model on using new support knowledge from previously unseen sources in a zero-shot fashion.


CET2: Modelling Topic Transitions for Coherent and Engaging Knowledge-Grounded Conversations

Xu, Lin, Zhou, Qixian, Fu, Jinlan, Ng, See-Kiong

arXiv.org Artificial Intelligence

Knowledge-grounded dialogue systems aim to generate coherent and engaging responses based on the dialogue contexts and selected external knowledge. Previous knowledge selection methods tend to rely too heavily on the dialogue contexts or over-emphasize the new information in the selected knowledge, resulting in the selection of repetitious or incongruous knowledge and further generating repetitive or incoherent responses, as the generation of the response depends on the chosen knowledge. To address these shortcomings, we introduce a Coherent and Engaging Topic Transition (CET2) framework to model topic transitions for selecting knowledge that is coherent to the context of the conversations while providing adequate knowledge diversity for topic development. Our CET2 framework considers multiple factors for knowledge selection, including valid transition logic from dialogue contexts to the following topics and systematic comparisons between available knowledge candidates. Extensive experiments on two public benchmarks demonstrate the superiority and the better generalization ability of CET2 on knowledge selection. This is due to our well-designed transition features and comparative knowledge selection strategy, which are more transferable to conversations about unseen topics. Analysis of fine-grained knowledge selection accuracy also shows that CET2 can better balance topic entailment (contextual coherence) and development (knowledge diversity) in dialogue than existing approaches.


UniRetriever: Multi-task Candidates Selection for Various Context-Adaptive Conversational Retrieval

Wang, Hongru, Xue, Boyang, Zhou, Baohang, Wang, Rui, Mi, Fei, Wang, Weichao, Wang, Yasheng, Wong, Kam-Fai

arXiv.org Artificial Intelligence

Conversational retrieval refers to an information retrieval system that operates in an iterative and interactive manner, requiring the retrieval of various external resources, such as persona, knowledge, and even responses, to effectively engage with the user and successfully complete the dialogue. However, most previous work trained independent retrievers for each specific resource, resulting in sub-optimal performance and low efficiency. Thus, we propose a multi-task framework that functions as a universal retriever for three dominant retrieval tasks during the conversation: persona selection, knowledge selection, and response selection. To this end, we design a dual-encoder architecture consisting of a context-adaptive dialogue encoder and a candidate encoder, which attends to the relevant context in the long dialogue and retrieves suitable candidates with a simple dot product. Furthermore, we introduce two loss constraints to capture the subtle relationship between dialogue context and different candidates by regarding historically selected candidates as hard negatives. Extensive experiments and analysis establish state-of-the-art retrieval quality both within and outside its training domain, revealing the promising potential and generalization capability of our model to serve as a universal retriever for different candidate selection tasks simultaneously.
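The dual-encoder scored by a dot product, as described in the abstract, can be illustrated with a toy sketch. Here a bag-of-words `embed` stands in for the learned context-adaptive and candidate encoders; a real system would use dense Transformer embeddings:

```python
from collections import Counter

def embed(text):
    """Toy encoder: bag-of-words counts as a sparse vector. A real
    dual-encoder would map text to a dense learned embedding."""
    return Counter(text.lower().split())

def dot(u, v):
    """Dot product of two sparse vectors (shared keys only)."""
    return sum(cnt * v[tok] for tok, cnt in u.items())

def rank_candidates(dialogue_context, candidates):
    """Score every candidate (persona sentence, knowledge snippet, or
    response) against the context with a single dot product and sort
    from most to least relevant."""
    ctx = embed(dialogue_context)
    return sorted(candidates, key=lambda c: dot(ctx, embed(c)), reverse=True)
```

Because every candidate type is scored the same way against the encoded context, one retriever can serve persona, knowledge, and response selection, which is the core efficiency argument of the universal-retriever design.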


Well Begun is Half Done: Generator-agnostic Knowledge Pre-Selection for Knowledge-Grounded Dialogue

Qin, Lang, Zhang, Yao, Liang, Hongru, Wang, Jun, Yang, Zhenglu

arXiv.org Artificial Intelligence

Accurate knowledge selection is critical in knowledge-grounded dialogue systems. Toward a closer look at it, we offer a novel perspective to organize the existing literature, i.e., knowledge selection coupled with, after, and before generation. We focus on the third, under-explored category of study, which can not only select knowledge accurately in advance, but also has the advantage of reducing the learning, adjustment, and interpretation burden of subsequent response generation models, especially LLMs. We propose GATE, a generator-agnostic knowledge selection method, to prepare knowledge for subsequent response generation models by selecting context-related knowledge among different knowledge structures and variable knowledge requirements. Experimental results demonstrate the superiority of GATE, and indicate that knowledge selection before generation is a lightweight yet effective way to facilitate LLMs (e.g., ChatGPT) in generating more informative responses.


Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference

Xu, Yan, Kong, Deqian, Xu, Dehong, Ji, Ziwei, Pang, Bo, Fung, Pascale, Wu, Ying Nian

arXiv.org Artificial Intelligence

The capability to generate responses with diversity and faithfulness using factual knowledge is paramount for creating a human-like, trustworthy dialogue system. Common strategies either adopt a two-step paradigm, which optimizes knowledge selection and response generation separately and may overlook the inherent correlation between these two tasks, or leverage a conditional variational method to jointly optimize knowledge selection and response generation by employing an inference network. In this paper, we present an end-to-end learning framework, termed Sequential Posterior Inference (SPI), capable of selecting knowledge and generating dialogues by approximately sampling from the posterior distribution. Unlike other methods, SPI does not require an inference network or assume a simple geometry of the posterior distribution. This straightforward and intuitive inference procedure of SPI directly queries the response generation model, allowing for accurate knowledge selection and generation of faithful responses. In addition to modeling contributions, our experimental results on two common dialogue datasets (Wizard of Wikipedia and Holl-E) demonstrate that SPI outperforms previous strong baselines according to both automatic and human evaluation metrics.


Query Enhanced Knowledge-Intensive Conversation via Unsupervised Joint Modeling

Cai, Mingzhu, Bao, Siqi, Tian, Xin, He, Huang, Wang, Fan, Wu, Hua

arXiv.org Artificial Intelligence

In this paper, we propose an unsupervised query enhanced approach for knowledge-intensive conversations, namely QKConv. There are three modules in QKConv: a query generator, an off-the-shelf knowledge selector, and a response generator. QKConv is optimized through joint training, which produces the response by exploring multiple candidate queries and leveraging the corresponding selected knowledge. The joint training relies solely on the dialogue context and target response, without requiring extra query annotations or knowledge provenance. To evaluate the effectiveness of the proposed QKConv, we conduct experiments on three representative knowledge-intensive conversation datasets: conversational question-answering, task-oriented dialogue, and knowledge-grounded conversation. Experimental results reveal that QKConv performs better than all unsupervised methods across the three datasets and achieves competitive performance compared to supervised methods.
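The three-module pipeline can be sketched at inference time as follows. All function names here are placeholders rather than QKConv's actual API, and the real system scores candidate queries jointly with the response generator during training rather than with a standalone selector score:

```python
def qkconv_step(context, generate_queries, select_knowledge, generate_response):
    """One inference step of a QKConv-style pipeline: generate candidate
    queries from the dialogue context, select knowledge for each query,
    and condition the response generator on the best-scoring knowledge.

    select_knowledge(query) is assumed to return a (knowledge, score) pair.
    """
    queries = generate_queries(context)                 # multiple candidate queries
    candidates = [(q, select_knowledge(q)) for q in queries]
    # keep the knowledge whose selector score is highest
    best_q, (best_k, score) = max(candidates, key=lambda qc: qc[1][1])
    return generate_response(context, best_k)
```

Because only the context and target response supervise training, the query generator is pushed toward queries whose selected knowledge helps the generator produce the target, which is how the method avoids query annotations.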