AITopics | Discourse & Dialogue

Collaborating Authors

Discourse & Dialogue

Understanding Language in Conversations "The problems addressed in discourse research aim to answer two general kinds of questions: (1) what information is contained in extended sequences of utterances that goes beyond the meaning of the individual utterances themselves? (2) how does the context in which an utterance is used affect the meaning of the individual utterances, or parts of them?"
– Barbara Grosz. Overview of Chapter 6: Discourse and Dialogue, Survey of the State of the Art in Human Language Technology (1996).

News Overviews Instructional Materials AI-Alerts Classics

Survey of Aspect-based Sentiment Analysis Datasets

Chebolu, Siva Uday Sampreeth, Dernoncourt, Franck, Lipka, Nedim, Solorio, Thamar

arXiv.org Artificial IntelligenceSep-21-2023

Aspect-based sentiment analysis (ABSA) is a natural language processing problem that requires analyzing user-generated reviews to determine: a) The target entity being reviewed, b) The high-level aspect to which it belongs, and c) The sentiment expressed toward the targets and the aspects. Numerous yet scattered corpora for ABSA make it difficult for researchers to identify corpora best suited for a specific ABSA subtask quickly. This study aims to present a database of corpora that can be used to train and assess autonomous ABSA systems. Additionally, we provide an overview of the major corpora for ABSA and its subtasks and highlight several features that researchers should consider when selecting a corpus. Finally, we discuss the advantages and disadvantages of current collection approaches and make recommendations for future corpora creation. This survey examines 65 publicly available ABSA datasets covering over 25 domains, including 45 English and 20 other languages datasets.

computational linguistic, dataset, sentiment analysis, (13 more...)

arXiv.org Artificial Intelligence

2204.05232

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
(20 more...)

Genre: Overview (1.00)

Industry: Consumer Products & Services (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Add feedback

Cross-lingual Data Augmentation for Document-grounded Dialog Systems in Low Resource Languages

Gou, Qi, Xia, Zehua, Du, Wenzhe

arXiv.org Artificial IntelligenceSep-20-2023

This paper proposes a framework to address the issue of data scarcity in Document-Grounded Dialogue Systems(DGDS). Our model leverages high-resource languages to enhance the capability of dialogue generation in low-resource languages. Specifically, We present a novel pipeline CLEM (Cross-Lingual Enhanced Model) including adversarial training retrieval (Retriever and Re-ranker), and Fid (fusion-in-decoder) generator. To further leverage high-resource language, we also propose an innovative architecture to conduct alignment across different languages with translated training. Extensive experiment results demonstrate the effectiveness of our model and we achieved 4th place in the DialDoc 2023 Competition. Therefore, CLEM can serve as a solution to resource scarcity in DGDS and provide useful guidance for multi-lingual alignment tasks.

cross-lingual data augmentation, document-grounded dialog system, low resource language

arXiv.org Artificial Intelligence

2305.14949

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.40)

Add feedback

Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model

Zhou, Xinyu, Chen, Delong, Chen, Yudong

arXiv.org Artificial IntelligenceSep-19-2023

This paper explores the potential of constructing an AI spoken dialogue system that "thinks how to respond" and "thinks how to speak" simultaneously, which more closely aligns with the human speech production process compared to the current cascade pipeline of independent chatbot and Text-to-Speech (TTS) modules. We hypothesize that Large Language Models (LLMs) with billions of parameters possess significant speech understanding capabilities and can jointly model dialogue responses and linguistic features. We conduct two sets of experiments: 1) Prosodic structure prediction, a typical front-end task in TTS, demonstrating the speech understanding ability of LLMs, and 2) Further integrating dialogue response and a wide array of linguistic features using a unified encoding format. Our results indicate that the LLM-based approach is a promising direction for building unified spoken dialogue systems.

dialogue response and speech synthesis, joint modeling, language model

arXiv.org Artificial Intelligence

2309.11

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)

Add feedback

PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded Dialogue Systems

Wilie, Bryan, Xu, Yan, Chung, Willy, Cahyawijaya, Samuel, Lovenia, Holy, Fung, Pascale

arXiv.org Artificial IntelligenceSep-19-2023

Grounding dialogue response generation on external knowledge is proposed to produce informative and engaging responses. However, current knowledge-grounded dialogue (KGD) systems often fail to align the generated responses with human-preferred qualities due to several issues like hallucination and the lack of coherence. Upon analyzing multiple language model generations, we observe the presence of alternative generated responses within a single decoding process. These alternative responses are more faithful and exhibit a comparable or higher level of relevance to prior conversational turns compared to the optimal responses prioritized by the decoding processes. To address these challenges and driven by these observations, we propose Polished \& Informed Candidate Scoring (PICK), a generation re-scoring framework that empowers models to generate faithful and relevant responses without requiring additional labeled data or model tuning. Through comprehensive automatic and human evaluations, we demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history. Furthermore, PICK consistently improves the system's performance with both oracle and retrieved knowledge in all decoding strategies. We provide the detailed implementation in https://github.com/bryanwilie/pick .

dialogue, dialogue history, knowledge, (13 more...)

arXiv.org Artificial Intelligence

2309.10413

Country:

Asia > China > Hong Kong (0.05)
Europe > Belgium (0.04)
North America > United States > New York (0.04)
Europe > Monaco (0.04)

Genre: Research Report (1.00)

Industry:

Media > Music (0.68)
Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

The ParlaSent multilingual training dataset for sentiment identification in parliamentary proceedings

Mochtak, Michal, Rupnik, Peter, Ljubešić, Nikola

arXiv.org Artificial IntelligenceSep-18-2023

Sentiments inherently drive politics. How we receive and process information plays an essential role in political decision-making, shaping our judgment with strategic consequences both on the level of legislators and the masses. If sentiment plays such an important role in politics, how can we study and measure it systematically? The paper presents a new dataset of sentiment-annotated sentences, which are used in a series of experiments focused on training a robust sentiment classifier for parliamentary proceedings. The paper also introduces the first domain-specific LLM for political science applications additionally pre-trained on 1.72 billion domain-specific words from proceedings of 27 European parliaments. We present experiments demonstrating how the additional pre-training of LLM on parliamentary data can significantly improve the model downstream performance on the domain-specific tasks, in our case, sentiment detection in parliamentary proceedings. We further show that multilingual models perform very well on unseen languages and that additional data from other languages significantly improves the target parliament's results. The paper makes an important contribution to multiple domains of social sciences and bridges them with computer science and computational linguistics. Lastly, it sets up a more robust approach to sentiment analysis of political texts in general, which allows scholars to study political sentiment from a comparative perspective using standardized tools and techniques.

dataset, parliament, sentiment, (14 more...)

arXiv.org Artificial Intelligence

2309.09783

Country:

Europe > Serbia (0.14)
Europe > Slovakia (0.14)
Europe > Germany (0.14)
(22 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > North America Government > United States Government (0.68)
Government > Regional Government > Europe Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Add feedback

Sentiment Analysis and Effect of COVID-19 Pandemic using College SubReddit Data

Yan, Tian, Liu, Fang

arXiv.org Artificial IntelligenceSep-18-2023

Background: The COVID-19 pandemic has affected our society and human well-being in various ways. In this study, we investigate how the pandemic has influenced people's emotions and psychological states compared to a pre-pandemic period using real-world data from social media. Method: We collected Reddit social media data from 2019 (pre-pandemic) and 2020 (pandemic) from the subreddits communities associated with eight universities. We applied the pre-trained Robustly Optimized BERT pre-training approach (RoBERTa) to learn text embedding from the Reddit messages, and leveraged the relational information among posted messages to train a graph attention network (GAT) for sentiment classification. Finally, we applied model stacking to combine the prediction probabilities from RoBERTa and GAT to yield the final classification on sentiment. With the model-predicted sentiment labels on the collected data, we used a generalized linear mixed-effects model to estimate the effects of pandemic and in-person teaching during the pandemic on sentiment. Results: The results suggest that the odds of negative sentiments in 2020 (pandemic) were 25.7% higher than the odds in 2019 (pre-pandemic) with a $p$-value $<0.001$; and the odds of negative sentiments associated in-person learning were 48.3% higher than with remote learning in 2020 with a $p$-value of 0.029. Conclusions: Our study results are consistent with the findings in the literature on the negative impacts of the pandemic on people's emotions and psychological states. Our study contributes to the growing real-world evidence on the various negative impacts of the pandemic on our society; it also provides a good example of using both ML techniques and statistical modeling and inference to make better use of real-world data.

information, negative sentiment, sentiment, (13 more...)

arXiv.org Artificial Intelligence

2112.04351

Country:

North America > United States > Michigan (0.04)
North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
Europe > Denmark (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Epidemiology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

How People Perceive The Dynamic Zero-COVID Policy: A Retrospective Analysis From The Perspective of Appraisal Theory

Yang, Na, Zhou, Kyrie Zhixuan, Li, Yunzhe

arXiv.org Artificial IntelligenceSep-17-2023

The Dynamic Zero-COVID Policy in China spanned three years and diverse emotional responses have been observed at different times. In this paper, we retrospectively analyzed public sentiments and perceptions of the policy, especially regarding how they evolved over time, and how they related to people's lived experiences. Through sentiment analysis of 2,358 collected Weibo posts, we identified four representative points, i.e., policy initialization, sharp sentiment change, lowest sentiment score, and policy termination, for an in-depth discourse analysis through the lens of appraisal theory. In the end, we reflected on the evolving public sentiments toward the Dynamic Zero-COVID Policy and proposed implications for effective epidemic prevention and control measures for future crises.

attitude, dynamic zero-covid policy, epidemic prevention, (14 more...)

arXiv.org Artificial Intelligence

2309.09324

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Shanghai > Shanghai (0.08)
Asia > China > Hubei Province > Wuhan (0.05)
(9 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.57)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.36)

Add feedback

Has Sentiment Returned to the Pre-pandemic Level? A Sentiment Analysis Using U.S. College Subreddit Data from 2019 to 2022

Yan, Tian, Liu, Fang

arXiv.org Artificial IntelligenceSep-15-2023

As impact of COVID-19 pandemic winds down, both individuals and society gradually return to pre-pandemic activities. This study aims to explore how people's emotions have changed from the pre-pandemic during the pandemic to post-emergency period and whether it has returned to pre-pandemic level. We collected Reddit data in 2019 (pre-pandemic), 2020 (peak pandemic), 2021, and 2022 (late stages of pandemic, transitioning period to post-emergency period) from subreddits in 128 universities/colleges in the U.S., and a set of school-level characteristics. We predicted two sets of sentiments from a pre-trained Robustly Optimized BERT pre-training approach (RoBERTa) and graph attention network (GAT) that leverages both rich semantic and relational information among posted messages and then applied a logistic stacking method to obtain the final sentiment classification. After obtaining sentiment label for each message, we used a generalized linear mixed-effects model to estimate temporal trend in sentiment from 2019 to 2022 and how school-level factors may affect sentiment. Compared to the year 2019, the odds of negative sentiment in years 2020, 2021, and 2022 are 24%, 4.3%, and 10.3% higher, respectively, which are all statistically significant(adjusted $p$<0.05). Our study findings suggest a partial recovery in the sentiment composition in the post-pandemic-emergency era. The results align with common expectations and provide a detailed quantification of how sentiments have evolved from 2019 to 2022.

negative sentiment, sentiment, university, (16 more...)

arXiv.org Artificial Intelligence

2309.08845

Country:

North America > United States > Florida > Hillsborough County > University (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > North Carolina (0.04)
(49 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Education > Educational Setting > Higher Education (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Chinese Fine-Grained Financial Sentiment Analysis with Large Language Models

Lan, Yinyu, Wu, Yanru, Xu, Wang, Feng, Weiqiang, Zhang, Youhao

arXiv.org Artificial IntelligenceSep-15-2023

Entity-level fine-grained sentiment analysis in the financial domain is a crucial subtask of sentiment analysis and currently faces numerous challenges. The primary challenge stems from the lack of high-quality and large-scale annotated corpora specifically designed for financial text sentiment analysis, which in turn limits the availability of data necessary for developing effective text processing techniques. Recent advancements in large language models (LLMs) have yielded remarkable performance in natural language processing tasks, primarily centered around language pattern matching. In this paper, we propose a novel and extensive Chinese fine-grained financial sentiment analysis dataset, FinChina SA, for enterprise early warning. We thoroughly evaluate and experiment with well-known existing open-source LLMs using our dataset. We firmly believe that our dataset will serve as a valuable resource to advance the exploration of real-world financial sentiment analysis tasks, which should be the focus of future research. The FinChina SA dataset is publicly available at https://github.com/YerayL/FinChina-SA

dataset, language model, sentiment analysis, (12 more...)

arXiv.org Artificial Intelligence

2306.14096

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Banking & Finance > Trading (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Psychological Metrics for Dialog System Evaluation

Giorgi, Salvatore, Havaldar, Shreya, Ahmed, Farhan, Akhtar, Zuhaib, Vaidya, Shalaka, Pan, Gary, Ungar, Lyle H., Schwartz, H. Andrew, Sedoc, Joao

arXiv.org Artificial IntelligenceSep-15-2023

We present metrics for evaluating dialog systems through a psychologically-grounded "human" lens in which conversational agents express a diversity of both states (e.g., emotion) and traits (e.g., personality), just as people do. We present five interpretable metrics from established psychology that are fundamental to human communication and relationships: emotional entropy, linguistic style and emotion matching, agreeableness, and empathy. These metrics can be applied (1) across dialogs and (2) on turns within dialogs. The psychological metrics are compared against seven state-of-the-art traditional metrics (e.g., BARTScore and BLEURT) on seven standard dialog system data sets. We also introduce a novel data set, the Three Bot Dialog Evaluation Corpus, which consists of annotated conversations from ChatGPT, GPT-3, and BlenderBot. We demonstrate that our proposed metrics offer novel information; they are uncorrelated with traditional metrics, can be used to meaningfully compare dialog systems, and lead to increased accuracy (beyond existing traditional metrics) in predicting crowd-sourced dialog judgements. The interpretability and unique signal of our psychological metrics make them a valuable tool for evaluating and improving dialog systems.

metric, psychological metric, traditional metric, (12 more...)

arXiv.org Artificial Intelligence

2305.14757

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > Jordan (0.04)
(4 more...)

Genre: Research Report > New Finding (0.71)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Add feedback