AITopics | dailydialog

Collaborating Authors

dailydialog

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

03b92cd507ff5870df0db7f074728830-Supplemental.pdf

Neural Information Processing SystemsApr-24-2026, 10:53:36 GMT

artificial intelligence, experiment, natural language, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (0.52)

Add feedback

A Unifying LIME and Shapley value to Dialogue Response Generation

Neural Information Processing SystemsNov-13-2025, 06:26:59 GMT

Interestingly, the lowest ratio occurs earlier compared to the experiments on DailyDialog.

artificial intelligence, natural language, shapley value, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (0.52)

Add feedback

Reading Between the Lines: The One-Sided Conversation Problem

Ebert, Victoria, Singh, Rishabh, Chen, Tuochao, Smith, Noah A., Gollakota, Shyamnath

arXiv.org Artificial IntelligenceNov-6-2025

Conversational AI is constrained in many real-world settings where only one side of a dialogue can be recorded, such as telemedicine, call centers, and smart glasses. We formalize this as the one-sided conversation problem (1SC): inferring and learning from one side of a conversation. We study two tasks: (1) reconstructing the missing speaker's turns for real-time use cases, and (2) generating summaries from one-sided transcripts. Evaluating prompting and finetuned models on MultiWOZ, DailyDialog, and Candor with both human A/B testing and LLM-as-a-judge metrics, we find that access to one future turn and information about utterance length improves reconstruction, placeholder prompting helps to mitigate hallucination, and while large models generate promising reconstructions with prompting, smaller models require finetuning. Further, high-quality summaries can be generated without reconstructing missing turns. We present 1SC as a novel challenge and report promising results that mark a step toward privacy-aware conversational AI.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.03056

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.27)

Genre:

Personal > Interview (0.67)
Research Report > New Finding (0.45)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(7 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Add feedback

Do language models accommodate their users? A study of linguistic convergence

Blevins, Terra, Schmalwieser, Susanne, Roth, Benjamin

arXiv.org Artificial IntelligenceAug-6-2025

While large language models (LLMs) are generally considered proficient in generating language, how similar their language usage is to that of humans remains understudied. In this paper, we test whether models exhibit linguistic convergence, a core pragmatic element of human language communication, asking: do models adapt, or converge, to the linguistic patterns of their user? To answer this, we systematically compare model completions of exisiting dialogues to the original human responses across sixteen language models, three dialogue corpora, and a variety of stylometric features. We find that models strongly converge to the conversation's style, often significantly overfitting relative to the human baseline. While convergence patterns are often feature-specific, we observe consistent shifts in convergence across modeling settings, with instruction-tuned and larger models converging less than their pretrained counterparts. Given the differences between human and model convergence patterns, we hypothesize that the underlying mechanisms for these behaviors are very different.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2508.03276

Country: Europe > Austria (0.28)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation

Dey, Suvodip, Desarkar, Maunendra Sankar

arXiv.org Artificial IntelligenceJan-17-2025

The standard language modeling (LM) loss by itself has been shown to be inadequate for effective dialogue modeling. As a result, various training approaches, such as auxiliary loss functions and leveraging human feedback, are being adopted to enrich open-domain dialogue systems. One such auxiliary loss function is Bag-of-Words (BoW) loss, defined as the cross-entropy loss for predicting all the words/tokens of the next utterance. In this work, we propose a novel auxiliary loss named Bag-of-Keywords (BoK) loss to capture the central thought of the response through keyword prediction and leverage it to enhance the generation of meaningful and interpretable responses in open-domain dialogue systems. BoK loss upgrades the BoW loss by predicting only the keywords or critical words/tokens of the next utterance, intending to estimate the core idea rather than the entire response. We incorporate BoK loss in both encoder-decoder (T5) and decoder-only (DialoGPT) architecture and train the models to minimize the weighted sum of BoK and LM (BoK-LM) loss. We perform our experiments on two popular open-domain dialogue datasets, DailyDialog and Persona-Chat. We show that the inclusion of BoK loss improves the dialogue generation of backbone models while also enabling post-hoc interpretability. We also study the effectiveness of BoK-LM loss as a reference-free metric and observe comparable performance to the state-of-the-art metrics on various dialogue evaluation datasets.

computational linguistic, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2024.sigdial-1.48

2501.10328

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(13 more...)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (0.93)
Media > Film (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.75)

Add feedback

Measuring the Robustness of Reference-Free Dialogue Evaluation Systems

Vasselli, Justin, Nohejl, Adam, Watanabe, Taro

arXiv.org Artificial IntelligenceJan-12-2025

Advancements in dialogue systems powered by large language models (LLMs) have outpaced the development of reliable evaluation metrics, particularly for diverse and creative responses. We present a benchmark for evaluating the robustness of reference-free dialogue metrics against four categories of adversarial attacks: speaker tag prefixes, static responses, ungrammatical responses, and repeated conversational context. We analyze metrics such as DialogRPT, UniEval, and PromptEval -- a prompt-based method leveraging LLMs -- across grounded and ungrounded datasets. By examining both their correlation with human judgment and susceptibility to adversarial attacks, we find that these two axes are not always aligned; metrics that appear to be equivalent when judged by traditional benchmarks may, in fact, vary in their scores of adversarial responses. These findings motivate the development of nuanced evaluation frameworks to address real-world dialogue challenges.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2501.06728

Country:

North America > Canada (0.28)
Asia > Middle East (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (0.69)
Government > Military (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

Add feedback

Disentangling Preference Representation and Text Generation for Efficient Individual Preference Alignment

Zhang, Jianfei, Bai, Jun, Li, Bei, Wang, Yanmeng, Li, Rumei, Lin, Chenghua, Rong, Wenge

arXiv.org Artificial IntelligenceDec-30-2024

Aligning Large Language Models (LLMs) with general human preferences has been proved crucial in improving the interaction quality between LLMs and human. However, human values are inherently diverse among different individuals, making it insufficient to align LLMs solely with general preferences. To address this, personalizing LLMs according to individual feedback emerges as a promising solution. Nonetheless, this approach presents challenges in terms of the efficiency of alignment algorithms. In this work, we introduce a flexible paradigm for individual preference alignment. Our method fundamentally improves efficiency by disentangling preference representation from text generation in LLMs. We validate our approach across multiple text generation tasks and demonstrate that it can produce aligned quality as well as or better than PEFT-based methods, while reducing additional training time for each new individual preference by $80\%$ to $90\%$ in comparison with them.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.20834

Country:

North America > United States (1.00)
Europe (0.93)
North America > Canada (0.67)
Asia (0.67)

Genre: Research Report (1.00)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ChatZero:Zero-shot Cross-Lingual Dialogue Generation via Pseudo-Target Language

Liu, Yongkang, Shi, Feng, Wang, Daling, Zhang, Yifei, Schütze, Hinrich

arXiv.org Artificial IntelligenceAug-16-2024

Although large language models(LLMs) show amazing capabilities, among various exciting applications discovered for LLMs fall short in other low-resource languages. Besides, most existing methods depend on large-scale dialogue corpora and thus building systems for dialogue generation in a zero-shot scenario remains a considerable challenge. To address this challenge, we propose a novel end-to-end zero-shot dialogue generation model ChatZero based on cross-lingual code-switching method. First, we construct code-switching language and pseudo-target language with placeholders. Then for cross-lingual semantic transfer, we employ unsupervised contrastive learning to minimize the semantics gap of the source language, code-switching language, and pseudo-target language that are mutually positive examples in the high dimensional semantic space. Experiments on the multilingual DailyDialog and DSTC7-AVSD datasets demonstrate that ChatZero can achieve more than 90\% of the original performance under the zero-shot case compared to supervised learning, and achieve state-of-the-art performance compared with other baselines.

chatzero, placeholder, target language, (17 more...)

arXiv.org Artificial Intelligence

2408.08724

Country:

Asia > China (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

KoDialogBench: Evaluating Conversational Understanding of Language Models with Korean Dialogue Benchmark

Jang, Seongbo, Lee, Seonghyeon, Yu, Hwanjo

arXiv.org Artificial IntelligenceJun-17-2024

As language models are often deployed as chatbot assistants, it becomes a virtue for models to engage in conversations in a user's first language. While these models are trained on a wide range of languages, a comprehensive evaluation of their proficiency in low-resource languages such as Korean has been lacking. In this work, we introduce KoDialogBench, a benchmark designed to assess language models' conversational capabilities in Korean. To this end, we collect native Korean dialogues on daily topics from public sources, or translate dialogues from other languages. We then structure these conversations into diverse test datasets, spanning from dialogue comprehension to response selection tasks. Leveraging the proposed benchmark, we conduct extensive evaluations and analyses of various language models to measure a foundational understanding of Korean dialogues. Experimental results indicate that there exists significant room for improvement in models' conversation skills. Furthermore, our in-depth comparisons across different language models highlight the effectiveness of recent training techniques in enhancing conversational proficiency. We anticipate that KoDialogBench will promote the progress towards conversation-aware Korean language models.

computational linguistic, dialogue, language model, (15 more...)

arXiv.org Artificial Intelligence

2402.17377

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(18 more...)

Genre: Research Report > New Finding (0.46)

Industry: Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Context-Aware Siamese Networks for Efficient Emotion Recognition in Conversation

Gendron, Barbara, Guibon, Gaël

arXiv.org Artificial IntelligenceApr-17-2024

The advent of deep learning models has made a considerable contribution to the achievement of Emotion Recognition in Conversation (ERC). However, this task still remains an important challenge due to the plurality and subjectivity of human emotions. Previous work on ERC provides predictive models using mostly graph-based conversation representations. In this work, we propose a way to model the conversational context that we incorporate into a metric learning training strategy, with a two-step process. This allows us to perform ERC in a flexible classification scenario and to end up with a lightweight yet efficient model. Using metric learning through a Siamese Network architecture, we achieve 57.71 in macro F1 score for emotion classification in conversation on DailyDialog dataset, which outperforms the related work. This state-of-the-art result is promising regarding the use of metric learning for emotion recognition, yet perfectible compared to the microF1 score obtained.

computational linguistic, emotion recognition, representation, (11 more...)

arXiv.org Artificial Intelligence

2404.11141

Country:

North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)
(8 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Add feedback