AITopics | Discourse & Dialogue

Collaborating Authors

Discourse & Dialogue

Understanding Language in Conversations "The problems addressed in discourse research aim to answer two general kinds of questions: (1) what information is contained in extended sequences of utterances that goes beyond the meaning of the individual utterances themselves? (2) how does the context in which an utterance is used affect the meaning of the individual utterances, or parts of them?"
– Barbara Grosz. Overview of Chapter 6: Discourse and Dialogue, Survey of the State of the Art in Human Language Technology (1996).

News Overviews Instructional Materials AI-Alerts Classics

LLM-Enhanced Dialogue Management for Full-Duplex Spoken Dialogue Systems

Zhang, Hao, Li, Weiwei, Chen, Rilin, Kothapally, Vinay, Yu, Meng, Yu, Dong

arXiv.org Artificial IntelligenceFeb-24-2025

Achieving full-duplex communication in spoken dialogue systems (SDS) requires real-time coordination between listening, speaking, and thinking. This paper proposes a semantic voice activity detection (VAD) module as a dialogue manager (DM) to efficiently manage turn-taking in full-duplex SDS. Implemented as a lightweight (0.5B) LLM fine-tuned on full-duplex conversation data, the semantic VAD predicts four control tokens to regulate turn-switching and turn-keeping, distinguishing between intentional and unintentional barge-ins while detecting query completion for handling user pauses and hesitations. By processing input speech in short intervals, the semantic VAD enables real-time decision-making, while the core dialogue engine (CDE) is only activated for response generation, reducing computational overhead. This design allows independent DM optimization without retraining the CDE, balancing interaction accuracy and inference efficiency for scalable, next-generation full-duplex SDS.

control token, dialogue system, interaction, (11 more...)

arXiv.org Artificial Intelligence

2502.14145

Country:

North America > United States (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Add feedback

SentiFormer: Metadata Enhanced Transformer for Image Sentiment Analysis

Feng, Bin, Ruan, Shulan, Yang, Mingzheng, Han, Dongxuan, Liu, Huijie, Zhang, Kai, Liu, Qi

arXiv.org Artificial IntelligenceFeb-21-2025

As more and more internet users post images online to express their daily emotions, image sentiment analysis has attracted increasing attention. Recently, researchers generally tend to design different neural networks to extract visual features from images for sentiment analysis. Despite the significant progress, metadata, the data (e.g., text descriptions and keyword tags) for describing the image, has not been sufficiently explored in this task. In this paper, we propose a novel Metadata Enhanced Transformer for sentiment analysis (SentiFormer) to fuse multiple metadata and the corresponding image into a unified framework. Specifically, we first obtain multiple metadata of the image and unify the representations of diverse data. To adaptively learn the appropriate weights for each metadata, we then design an adaptive relevance learning module to highlight more effective information while suppressing weaker ones. Moreover, we further develop a cross-modal fusion module to fuse the adaptively learned representations and make the final prediction. Extensive experiments on three publicly available datasets demonstrate the superiority and rationality of our proposed method.

image sentiment analysis, metadata, sentiment analysis, (15 more...)

arXiv.org Artificial Intelligence

2502.15322

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > China > Anhui Province (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Stories that (are) Move(d by) Markets: A Causal Exploration of Market Shocks and Semantic Shifts across Different Partisan Groups

Drinkall, Felix, Zohren, Stefan, McMahon, Michael, Pierrehumbert, Janet B.

arXiv.org Artificial IntelligenceFeb-20-2025

Macroeconomic fluctuations and the narratives that shape them form a mutually reinforcing cycle: public discourse can spur behavioural changes leading to economic shifts, which then result in changes in the stories that propagate. We show that shifts in semantic embedding space can be causally linked to financial market shocks -- deviations from the expected market behaviour. Furthermore, we show how partisanship can influence the predictive power of text for market fluctuations and shape reactions to those same shocks. We also provide some evidence that text-based signals are particularly salient during unexpected events such as COVID-19, highlighting the value of language data as an exogenous variable in economic forecasting. Our findings underscore the bidirectional relationship between news outlets and market shocks, offering a novel empirical approach to studying their effect on each other.

computational linguistic, market shock, proceedings, (14 more...)

arXiv.org Artificial Intelligence

2502.14497

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(14 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Banking & Finance > Trading (1.00)
Banking & Finance > Economy (1.00)
(3 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
(3 more...)

Add feedback

Multi-Scale and Multi-Objective Optimization for Cross-Lingual Aspect-Based Sentiment Analysis

Wu, Chengyan, Ma, Bolei, Deng, Ningyuan, He, Yanqing, Xue, Yun

arXiv.org Artificial IntelligenceFeb-19-2025

Aspect-based sentiment analysis (ABSA) is a sequence labeling task that has garnered growing research interest in multilingual contexts. However, recent studies lack more robust feature alignment and finer aspect-level alignment. In this paper, we propose a novel framework, Multi-Scale and Multi-Objective optimization (MSMO) for cross-lingual ABSA. During multi-scale alignment, we achieve cross-lingual sentence-level and aspect-level alignment, aligning features of aspect terms in different contextual environments. Specifically, we introduce code-switched bilingual sentences into the language discriminator and consistency training modules to enhance the model's robustness. During multi-objective optimization, we design two optimization objectives: supervised training and consistency training, aiming to enhance cross-lingual semantic alignment. To further improve model performance, we incorporate distilled knowledge of the target language into the model. Results show that MSMO significantly enhances cross-lingual ABSA by achieving state-of-the-art performance across multiple languages and models.

aspect term, computational linguistic, target language, (15 more...)

arXiv.org Artificial Intelligence

2502.13718

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(12 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Add feedback

FlexDuo: A Pluggable System for Enabling Full-Duplex Capabilities in Speech Dialogue Systems

Liao, Borui, Xu, Yulong, Ou, Jiao, Yang, Kaiyuan, Jian, Weihua, Wan, Pengfei, Zhang, Di

arXiv.org Artificial IntelligenceFeb-19-2025

Full-Duplex Speech Dialogue Systems (Full-Duplex SDS) have significantly enhanced the naturalness of human-machine interaction by enabling real-time bidirectional communication. However, existing approaches face challenges such as difficulties in independent module optimization and contextual noise interference due to highly coupled architectural designs and oversimplified binary state modeling. This paper proposes FlexDuo, a flexible full-duplex control module that decouples duplex control from spoken dialogue systems through a plug-and-play architectural design. Furthermore, inspired by human information-filtering mechanisms in conversations, we introduce an explicit Idle state. On one hand, the Idle state filters redundant noise and irrelevant audio to enhance dialogue quality. On the other hand, it establishes a semantic integrity-based buffering mechanism, reducing the risk of mutual interruptions while ensuring accurate response transitions. Experimental results on the Fisher corpus demonstrate that FlexDuo reduces the false interruption rate by 24.9% and improves response accuracy by 7.6% compared to integrated full-duplex dialogue system baselines. It also outperforms voice activity detection (VAD) controlled baseline systems in both Chinese and English dialogue quality. The proposed modular architecture and state-based dialogue model provide a novel technical pathway for building flexible and efficient duplex dialogue systems.

dialogue system, false interruption rate, half-duplex llm, (12 more...)

arXiv.org Artificial Intelligence

2502.13472

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.88)
(2 more...)

Add feedback

Task-Informed Anti-Curriculum by Masking Improves Downstream Performance on Text

Jarca, Andrei, Croitoru, Florinel Alin, Ionescu, Radu Tudor

arXiv.org Artificial IntelligenceFeb-18-2025

Masked language modeling has become a widely adopted unsupervised technique to pre-train language models. However, the process of selecting tokens for masking is random, and the percentage of masked tokens is typically fixed for the entire training process. In this paper, we propose to adjust the masking ratio and to decide which tokens to mask based on a novel task-informed anti-curriculum learning scheme. First, we harness task-specific knowledge about useful and harmful tokens in order to determine which tokens to mask. Second, we propose a cyclic decaying masking ratio, which corresponds to an anti-curriculum schedule (from hard to easy). We exemplify our novel task-informed anti-curriculum by masking (TIACBM) approach across three diverse downstream tasks: sentiment analysis, text classification by topic, and authorship attribution. Our findings suggest that TIACBM enhances the ability of the model to focus on key task-relevant features, contributing to statistically significant performance gains across tasks. We release our code at https://github.com/JarcaAndrei/TIACBM.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.12953

Country: Europe > Romania (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.50)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis

Wu, Chengyan, Ma, Bolei, Liu, Yihong, Zhang, Zheyu, Deng, Ningyuan, Li, Yanshu, Chen, Baolan, Zhang, Yi, Plank, Barbara, Xue, Yun

arXiv.org Artificial IntelligenceFeb-17-2025

Aspect-based sentiment analysis (ABSA) is a crucial task in information extraction and sentiment analysis, aiming to identify aspects with associated sentiment elements in text. However, existing ABSA datasets are predominantly English-centric, limiting the scope for multilingual evaluation and research. To bridge this gap, we present M-ABSA, a comprehensive dataset spanning 7 domains and 21 languages, making it the most extensive multilingual parallel dataset for ABSA to date. Our primary focus is on triplet extraction, which involves identifying aspect terms, aspect categories, and sentiment polarities. The dataset is constructed through an automatic translation process with human review to ensure quality. We perform extensive experiments using various baselines to assess performance and compatibility on M-ABSA. Our empirical findings highlight that the dataset enables diverse evaluation tasks, such as multilingual and multi-domain transfer learning, and large language model evaluation, underscoring its inclusivity and its potential to drive advancements in multilingual ABSA research.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.11824

Country:

Europe (1.00)
Asia > Middle East > UAE (0.28)
North America > United States > Minnesota (0.27)

Genre:

Research Report (0.81)
Instructional Material > Course Syllabus & Notes (0.45)

Industry:

Education (0.50)
Consumer Products & Services (0.46)
Energy (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(2 more...)

Add feedback

Self-Supervised Learning for Neural Topic Models with Variance-Invariance-Covariance Regularization

Xu, Weiran, Hirami, Kengo, Eguchi, Koji

arXiv.org Artificial IntelligenceFeb-14-2025

In our study, we propose a self-supervised neural topic model (NTM) that combines the power of NTMs and regularized self-supervised learning methods to improve performance. NTMs use neural networks to learn latent topics hidden behind the words in documents, enabling greater flexibility and the ability to estimate more coherent topics compared to traditional topic models. On the other hand, some self-supervised learning methods use a joint embedding architecture with two identical networks that produce similar representations for two augmented versions of the same input. Regularizations are applied to these representations to prevent collapse, which would otherwise result in the networks outputting constant or redundant representations for all inputs. Our model enhances topic quality by explicitly regularizing latent topic representations of anchor and positive samples. We also introduced an adversarial data augmentation method to replace the heuristic sampling method. We further developed several variation models including those on the basis of an NTM that incorporates contrastive learning with both positive and negative samples. Experimental results on three datasets showed that our models outperformed baselines and state-of-the-art models both quantitatively and qualitatively.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.09944

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Multi-level Conflict-Aware Network for Multi-modal Sentiment Analysis

Gao, Yubo, Wu, Haotian, Zhang, Lei

arXiv.org Artificial IntelligenceFeb-13-2025

Multimodal Sentiment Analysis (MSA) aims to recognize human emotions by exploiting textual, acoustic, and visual modalities, and thus how to make full use of the interactions between different modalities is a central challenge of MSA. Interaction contains alignment and conflict aspects. Current works mainly emphasize alignment and the inherent differences between unimodal modalities, neglecting the fact that there are also potential conflicts between bimodal combinations. Additionally, multi-task learning-based conflict modeling methods often rely on the unstable generated labels. To address these challenges, we propose a novel multi-level conflict-aware network (MCAN) for multimodal sentiment analysis, which progressively segregates alignment and conflict constituents from unimodal and bimodal representations, and further exploits the conflict constituents with the conflict modeling branch. In the conflict modeling branch, we conduct discrepancy constraints at both the representation and predicted output levels, avoiding dependence on the generated labels. Experimental results on the CMU-MOSI and CMU-MOSEI datasets demonstrate the effectiveness of the proposed MCAN.

artificial intelligence, natural language, representation, (16 more...)

arXiv.org Artificial Intelligence

2502.09675

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.84)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.84)

Add feedback

Unveiling Global Discourse Structures: Theoretical Analysis and NLP Applications in Argument Mining

van Le, Christopher

arXiv.org Artificial IntelligenceFeb-12-2025

Particularly in the structure of global discourse, coherence plays a pivotal role in human text comprehension and is a hallmark of high-quality text. This is especially true for persuasive texts, where coherent argument structures support claims effectively. This paper discusses and proposes methods for detecting, extracting and representing these global discourse structures in a proccess called Argument(ation) Mining. We begin by defining key terms and processes of discourse structure analysis, then continue to summarize existing research on the matter, and identify shortcomings in current argument component extraction and classification methods. Furthermore, we will outline an architecture for argument mining that focuses on making models more generalisable while overcoming challenges in the current field of research by utilizing novel NLP techniques. This paper reviews current knowledge, summarizes recent works, and outlines our NLP pipeline, aiming to contribute to the theoretical understanding of global discourse structures.

artificial intelligence, natural language, relation, (15 more...)

arXiv.org Artificial Intelligence

2502.08371

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Miyazaki Prefecture > Miyazaki (0.04)
(7 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Add feedback