AITopics | dialog history

Collaborating Authors

dialog history

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

Jiasen Lu, Anitha Kannan, Jianwei Yang, Devi Parikh, Dhruv Batra

Neural Information Processing SystemsNov-21-2025, 04:14:15 GMT

Work was done while at Facebook AI Research.

arxiv preprint arxiv, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Building Open-Retrieval Conversational Question Answering Systems by Generating Synthetic Data and Decontextualizing User Questions

Vlachos, Christos, Stylianou, Nikolaos, Fiotaki, Alexandra, Methenitis, Spiros, Palogiannidi, Elisavet, Stafylakis, Themos, Androutsopoulos, Ion

arXiv.org Artificial IntelligenceJul-8-2025

We consider open-retrieval conversational question answering (OR-CONVQA), an extension of question answering where system responses need to be (i) aware of dialog history and (ii) grounded in documents (or document fragments) retrieved per question. Domain-specific OR-CONVQA training datasets are crucial for real-world applications, but hard to obtain. We propose a pipeline that capitalizes on the abundance of plain text documents in organizations (e.g., product documentation) to automatically produce realistic OR-CONVQA dialogs with annotations. Similarly to real-world humanannotated OR-CONVQA datasets, we generate in-dialog question-answer pairs, self-contained (decontextualized, e.g., no referring expressions) versions of user questions, and propositions (sentences expressing prominent information from the documents) the system responses are grounded in. We show how the synthetic dialogs can be used to train efficient question rewriters that decontextualize user questions, allowing existing dialog-unaware retrievers to be utilized. The retrieved information and the decontextualized question are then passed on to an LLM that generates the system's response.

large language model, natural language, question answering, (17 more...)

arXiv.org Artificial Intelligence

2507.04884

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
Europe > Greece (0.04)
(15 more...)

Genre: Research Report (0.64)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling

Zheng, Hang, Xu, Hongshen, Liu, Yuncong, Chen, Lu, Fung, Pascale, Yu, Kai

arXiv.org Artificial IntelligenceMar-12-2025

Large language models (LLMs) frequently hallucinate due to misaligned self-awareness, generating erroneous outputs when addressing queries beyond their knowledge boundaries. While existing approaches mitigate hallucinations via uncertainty estimation or query rejection, they suffer from computational inefficiency or sacrificed helpfulness. To address these issues, we propose the Explicit Knowledge Boundary Modeling (EKBM) framework, integrating fast and slow reasoning systems to harmonize reliability and usability. The framework first employs a fast-thinking model to generate confidence-labeled responses, enabling immediate use of high-confidence outputs. For uncertain predictions, a slow refinement model conducts targeted reasoning to improve accuracy. To align model behavior with our proposed object, we propose a hybrid training pipeline, enhancing self-awareness without degrading task performance. Evaluations on dialogue state tracking tasks demonstrate that EKBM achieves superior model reliability over uncertainty-based baselines. Further analysis reveals that refinement substantially boosts accuracy while maintaining low computational overhead. Our work establishes a scalable paradigm for advancing LLM reliability and balancing accuracy and practical utility in error-sensitive applications.

arxiv preprint arxiv, prediction, refinement, (13 more...)

arXiv.org Artificial Intelligence

2503.02233

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Generative Subgraph Retrieval for Knowledge Graph-Grounded Dialog Generation

Park, Jinyoung, Joo, Minseok, Kim, Joo-Kyung, Kim, Hyunwoo J.

arXiv.org Artificial IntelligenceOct-11-2024

Knowledge graph-grounded dialog generation requires retrieving a dialog-relevant subgraph from the given knowledge base graph and integrating it with the dialog history. Previous works typically represent the graph using an external encoder, such as graph neural networks, and retrieve relevant triplets based on the similarity between single-vector representations of triplets and the dialog history. However, these external encoders fail to leverage the rich knowledge of pretrained language models, and the retrieval process is also suboptimal due to the information bottleneck caused by the single-vector abstraction of the dialog history. In this work, we propose Dialog generation with Generative Subgraph Retrieval (DialogGSR), which retrieves relevant knowledge subgraphs by directly generating their token sequences on top of language models. For effective generative subgraph retrieval, we introduce two key methods: (i) structure-aware knowledge graph linearization with self-supervised graph-specific tokens and (ii) graph-constrained decoding utilizing graph structural proximity-based entity informativeness scores for valid and relevant generative retrieval. DialogGSR achieves state-of-the-art performance in knowledge graph-grounded dialog generation, as demonstrated on OpenDialKG and KOMODIS datasets.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2410.0935

Country: South America > Argentina (0.04)

Genre: Research Report > New Finding (0.93)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Reviews: Dialog-to-Action: Conversational Question Answering Over a Large-Scale Knowledge Base

Neural Information Processing SystemsOct-8-2024, 05:41:02 GMT

This paper proposes a semantic parsing method for dialog-based QA over a large-scale knowledge base. The method significantly outperforms the existing state of the art on CSQA, a recently-released conversational QA dataset. One of the major novelties of this paper is breaking apart the logical forms in the dialog history into smaller subsequences, any of which can be copied over into the logical form for the current question. While I do have some concerns with the method and the writing (detailed below), overall I liked this paper and I think that some of the ideas within it could be useful more broadly for QA researchers. Detailed comments: - I found many parts of the paper to be confusing, requiring multiple reads to fully understand.

dialog memory, large-scale knowledge base, replication action, (7 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.62)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.57)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.40)

Add feedback

Best of Both Worlds: Transferring Knowledge from Discriminative Learning to a Generative Visual Dialog Model

Jiasen Lu, Anitha Kannan, Jianwei Yang, Devi Parikh, Dhruv Batra

Neural Information Processing SystemsOct-2-2024, 16:55:31 GMT

We present a novel training framework for neural sequence models, particularly for grounded dialog generation. The standard training paradigm for these models is maximum likelihood estimation (MLE), or minimizing the cross-entropy of the human responses. Across a variety of domains, a recurring problem with MLE trained generative neural dialog models (G) is that they tend to produce'safe' and generic responses ('I don't know', 'I can't tell'). In contrast, discriminative dialog models (D) that are trained to rank a list of candidate human responses outperform their generative counterparts; in terms of automatic metrics, diversity, and informativeness of the responses. However, D is not useful in practice since it can not be deployed to have real conversations with users. Our work aims to achieve the best of both worlds - the practical usefulness of G and the strong performance of D - via knowledge transfer from D to G. Our primary contribution is an end-to-end trainable generative visual dialog model, where G receives gradients from D as a perceptual (not adversarial) loss of the sequence sampled from G. We leverage the recently proposed Gumbel-Softmax (GS) approximation to the discrete distribution - specifically, a RNN augmented with a sequence of GS samplers, coupled with the straight-through gradient estimator to enable end-to-end differentiability. We also introduce a stronger encoder for visual dialog, and employ a self-attention mechanism for answer encoding along with a metric learning loss to aid D in better capturing semantic similarities in answer responses. Overall, our proposed model outperforms state-of-the-art on the VisDial dataset by a significant margin (2.67% on recall@10).

arxiv preprint arxiv, dialog model, discriminator, (14 more...)

Neural Information Processing Systems

Country: North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Enhancing Visual Dialog State Tracking through Iterative Object-Entity Alignment in Multi-Round Conversations

Pang, Wei, Duan, Ruixue, Yang, Jinfu, Li, Ning

arXiv.org Artificial IntelligenceAug-13-2024

Visual Dialog (VD) is a task where an agent answers a series of image-related questions based on a multi-round dialog history. However, previous VD methods often treat the entire dialog history as a simple text input, disregarding the inherent conversational information flows at the round level. In this paper, we introduce Multi-round Dialogue State Tracking model (MDST), a framework that addresses this limitation by leveraging the dialogue state learned from dialog history to answer questions. MDST captures each round of dialog history, constructing internal dialogue state representations defined as 2-tuples of vision-language representations. These representations effectively ground the current question, enabling the generation of accurate answers. Experimental results on the VisDial v1.0 dataset demonstrate that MDST achieves a new state-of-the-art performance in generative setting. Furthermore, through a series of human studies, we validate the effectiveness of MDST in generating long, consistent, and human-like answers while consistently answering a series of questions correctly.

dialog history, dialogue state, representation, (11 more...)

arXiv.org Artificial Intelligence

doi: 10.1049/cit2.12370

2408.06725

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Add feedback

Efficient Data Generation for Source-grounded Information-seeking Dialogs: A Use Case for Meeting Transcripts

Golany, Lotem, Galgani, Filippo, Mamo, Maya, Parasol, Nimrod, Vandsburger, Omer, Bar, Nadav, Dagan, Ido

arXiv.org Artificial IntelligenceJun-21-2024

Automating data generation with Large Language Models (LLMs) has become increasingly popular. In this work, we investigate the feasibility and effectiveness of LLM-based data generation in the challenging setting of source-grounded information-seeking dialogs, with response attribution, over long documents. Our source texts consist of long and noisy meeting transcripts, adding to the task complexity. Since automating attribution remains difficult, we propose a semi-automatic approach: dialog queries and responses are generated with LLMs, followed by human verification and identification of attribution spans. Using this approach, we created MISeD -- Meeting Information Seeking Dialogs dataset -- a dataset of information-seeking dialogs focused on meeting transcripts. Models finetuned with MISeD demonstrate superior performance compared to off-the-shelf models, even those of larger size. Finetuning on MISeD gives comparable response generation quality to finetuning on fully manual data, while improving attribution quality and reducing time and effort.

attribution, query, transcript, (15 more...)

arXiv.org Artificial Intelligence

2405.01121

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games

Zhu, Qinglin, Zhao, Runcong, Du, Jinhua, Gui, Lin, He, Yulan

arXiv.org Artificial IntelligenceJun-17-2024

We propose PLAYER*, a novel framework that addresses the limitations of existing agent-based approaches built on Large Language Models (LLMs) in handling complex questions and understanding interpersonal relationships in dynamic environments. PLAYER* enhances path planning in Murder Mystery Games (MMGs) using an anytime sampling-based planner and a questioning-driven search framework. By equipping agents with a set of sensors, PLAYER* eliminates the need for pre-defined questions and enables agents to navigate complex social interactions. We additionally make a contribution by introducing a quantifiable evaluation method using multiple-choice questions and present WellPlay, a dataset containing 1,482 question-answer pairs. Experimental results demonstrate PLAYER*'s superiority over existing multi-agent methods, enhancing the generalisability and adaptability of agents in MMGs and paving the way for more effective multi-agent interactions.

agent, murderer, victim, (16 more...)

arXiv.org Artificial Intelligence

2404.17662

Country:

Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
North America > Montserrat (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment > Games (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Retrieval Augmented End-to-End Spoken Dialog Models

Wang, Mingqiu, Shafran, Izhak, Soltau, Hagen, Han, Wei, Cao, Yuan, Yu, Dian, Shafey, Laurent El

arXiv.org Artificial IntelligenceFeb-2-2024

We recently developed SLM, a joint speech and language model, which fuses a pretrained foundational speech model and a large language model (LLM), while preserving the in-context learning capability intrinsic to the pretrained LLM. In this paper, we apply SLM to speech dialog applications where the dialog states are inferred directly from the audio signal. Task-oriented dialogs often contain domain-specific entities, i.e., restaurants, hotels, train stations, and city names, which are difficult to recognize, however, critical for the downstream applications. Inspired by the RAG (retrieval-augmented generation) paradigm, we propose a retrieval augmented SLM (ReSLM) that overcomes this weakness. We first train a speech retriever to retrieve text entities mentioned in the audio. The retrieved entities are then added as text inputs to the underlying SLM to bias model predictions. We evaluated ReSLM on speech MultiWoz task (DSTC-11 challenge), and found that this retrieval augmentation boosts model performance, achieving joint goal accuracy (38.6% vs 32.7%), slot error rate (20.6% vs 24.8%) and ASR word error rate (5.5% vs 6.7%). While demonstrated on dialog state tracking, our approach is broadly applicable to other speech tasks requiring contextual information or domain-specific entities, such as contextual ASR with biasing capability.

dialog history, dialog state, retriever, (13 more...)

arXiv.org Artificial Intelligence

2402.01828

Country: North America > United States > New York (0.05)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Rail (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)

Add feedback