AITopics | Discourse & Dialogue

Collaborating Authors

Discourse & Dialogue

Understanding Language in Conversations "The problems addressed in discourse research aim to answer two general kinds of questions: (1) what information is contained in extended sequences of utterances that goes beyond the meaning of the individual utterances themselves? (2) how does the context in which an utterance is used affect the meaning of the individual utterances, or parts of them?"
– Barbara Grosz. Overview of Chapter 6: Discourse and Dialogue, Survey of the State of the Art in Human Language Technology (1996).

News Overviews Instructional Materials AI-Alerts Classics

Arabic Text Sentiment Analysis: Reinforcing Human-Performed Surveys with Wider Topic Analysis

Almurqren, Latifah, Hodgson, Ryan, Cristea, Alexandra

arXiv.org Artificial IntelligenceMar-4-2024

Sentiment analysis (SA) has been, and is still, a thriving research area. However, the task of Arabic sentiment analysis (ASA) is still underrepresented in the body of research. This study offers the first in-depth and in-breadth analysis of existing ASA studies of textual content and identifies their common themes, domains of application, methods, approaches, technologies and algorithms used. The in-depth study manually analyses 133 ASA papers published in the English language between 2002 and 2020 from four academic databases (SAGE, IEEE, Springer, WILEY) and from Google Scholar. The in-breadth study uses modern, automatic machine learning techniques, such as topic modelling and temporal analysis, on Open Access resources, to reinforce themes and trends identified by the prior study, on 2297 ASA publications between 2010-2020. The main findings show the different approaches used for ASA: machine learning, lexicon-based and hybrid approaches. Other findings include ASA 'winning' algorithms (SVM, NB, hybrid methods). Deep learning methods, such as LSTM can provide higher accuracy, but for ASA sometimes the corpora are not large enough to support them. Additionally, whilst there are some ASA corpora and lexicons, more are required. Specifically, Arabic tweets corpora and datasets are currently only moderately sized. Moreover, Arabic lexicons that have high coverage contain only Modern Standard Arabic (MSA) words, and those with Arabic dialects are quite small. Thus, new corpora need to be created. On the other hand, ASA tools are stringently lacking. There is a need to develop ASA tools that can be used in industry, as well as in academia, for Arabic text SA. Hence, our study offers insights into the challenges associated with ASA research and provides suggestions for ways to move the field forward such as lack of Dialectical Arabic resource, Arabic tweets, corpora and data sets for SA.

classification, lexicon, sentiment analysis, (16 more...)

arXiv.org Artificial Intelligence

2403.01921

Country:

Asia > Middle East > Saudi Arabia > Riyadh Province > Riyadh (0.04)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)
Asia > Middle East > Yemen > Amanat Al Asimah > Sanaa (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Telecommunications (0.93)
Information Technology > Services (0.68)
Health & Medicine (0.67)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(3 more...)

Add feedback

A comprehensive cross-language framework for harmful content detection with the aid of sentiment analysis

Dehghani, Mohammad

arXiv.org Artificial IntelligenceMar-2-2024

In today's digital world, social media plays a significant role in facilitating communication and content sharing. However, the exponential rise in user-generated content has led to challenges in maintaining a respectful online environment. In some cases, users have taken advantage of anonymity in order to use harmful language, which can negatively affect the user experience and pose serious social problems. Recognizing the limitations of manual moderation, automatic detection systems have been developed to tackle this problem. Nevertheless, several obstacles persist, including the absence of a universal definition for harmful language, inadequate datasets across languages, the need for detailed annotation guideline, and most importantly, a comprehensive framework. This study aims to address these challenges by introducing, for the first time, a detailed framework adaptable to any language. This framework encompasses various aspects of harmful language detection. A key component of the framework is the development of a general and detailed annotation guideline. Additionally, the integration of sentiment analysis represents a novel approach to enhancing harmful language detection. Also, a definition of harmful language based on the review of different related concepts is presented. To demonstrate the effectiveness of the proposed framework, its implementation in a challenging low-resource language is conducted. We collected a Persian dataset and applied the annotation guideline for harmful detection and sentiment analysis. Next, we present baseline experiments utilizing machine and deep learning methods to set benchmarks. Results prove the framework's high performance, achieving an accuracy of 99.4% in offensive language detection and 66.2% in sentiment analysis.

detection, harmful language, language detection, (15 more...)

arXiv.org Artificial Intelligence

2403.0127

Country:

Europe > Greece (0.04)
Asia > Singapore (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
(20 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.88)

Industry:

Media > News (1.00)
Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(6 more...)

Add feedback

Learning Intrinsic Dimension via Information Bottleneck for Explainable Aspect-based Sentiment Analysis

Cheng, Zhenxiao, Zhou, Jie, Wu, Wen, Chen, Qin, He, Liang

arXiv.org Artificial IntelligenceFeb-28-2024

Gradient-based explanation methods are increasingly used to interpret neural models in natural language processing (NLP) due to their high fidelity. Such methods determine word-level importance using dimension-level gradient values through a norm function, often presuming equal significance for all gradient dimensions. However, in the context of Aspect-based Sentiment Analysis (ABSA), our preliminary research suggests that only specific dimensions are pertinent. To address this, we propose the Information Bottleneck-based Gradient (\texttt{IBG}) explanation framework for ABSA. This framework leverages an information bottleneck to refine word embeddings into a concise intrinsic dimension, maintaining essential features and omitting unrelated information. Comprehensive tests show that our \texttt{IBG} approach considerably improves both the models' performance and interpretability by identifying sentiment-aware features.

dimension, interpretability, proceedings, (15 more...)

arXiv.org Artificial Intelligence

2402.18145

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.87)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.87)

Add feedback

Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future

Li, Minzhi, Shi, Weiyan, Ziems, Caleb, Yang, Diyi

arXiv.org Artificial IntelligenceFeb-27-2024

As Natural Language Processing (NLP) systems become increasingly integrated into human social life, these technologies will need to increasingly rely on social intelligence. Although there are many valuable datasets that benchmark isolated dimensions of social intelligence, there does not yet exist any body of work to join these threads into a cohesive subfield in which researchers can quickly identify research gaps and future directions. Towards this goal, we build a Social AI Data Infrastructure, which consists of a comprehensive social AI taxonomy and a data library of 480 NLP datasets. Our infrastructure allows us to analyze existing dataset efforts, and also evaluate language models' performance in different social intelligence aspects. Our analyses demonstrate its Figure 1: Our Social Intelligence Data Infrastructure utility in enabling a thorough understanding of gives a comprehensive overview and synthesis of social current data landscape and providing a holistic intelligence in NLP, with a theoretically grounded taxonomy perspective on potential directions for future and an NLP data library. Researchers can use dataset development. We show there is a need our infrastructure to build and organize tasks, evaluate for multifaceted datasets, increased diversity in language models and derive future insights.

dataset, intelligence, social intelligence, (13 more...)

arXiv.org Artificial Intelligence

2403.14659

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Asia > Singapore (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)

Industry:

Information Technology (0.93)
Education (0.67)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.46)

Add feedback

ESG Sentiment Analysis: comparing human and language model performance including GPT

Derrick, Karim

arXiv.org Artificial IntelligenceFeb-26-2024

In this paper we explore the challenges of measuring sentiment in relation to Environmental, Social and Governance (ESG) social media. ESG has grown in importance in recent years with a surge in interest from the financial sector and the performance of many businesses has become based in part on their ESG related reputations. The use of sentiment analysis to measure ESG related reputation has developed and with it interest in the use of machines to do so. The era of digital media has created an explosion of new media sources, driven by the growth of social media platforms. This growing data environment has become an excellent source for behavioural insight studies across many disciplines that includes politics, healthcare and market research. Our study seeks to compare human performance with the cutting edge in machine performance in the measurement of ESG related sentiment. To this end researchers classify the sentiment of 150 tweets and a reliability measure is made. A gold standard data set is then established based on the consensus of 3 researchers and this data set is then used to measure the performance of different machine approaches: one based on the VADER dictionary approach to sentiment classification and then multiple language model approaches, including Llama2, T5, Mistral, Mixtral, FINBERT, GPT3.5 and GPT4.

agreement, gold standard, reliability, (12 more...)

arXiv.org Artificial Intelligence

2402.1665

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Minnesota (0.04)
North America > United States > District of Columbia > Washington (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Banking & Finance > Trading (1.00)
Media > News (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Add feedback

Generating Effective Ensembles for Sentiment Analysis

Etelis, Itay, Rosenfeld, Avi, Weinberg, Abraham Itzhak, Sarne, David

arXiv.org Artificial IntelligenceFeb-26-2024

In recent years, transformer models have revolutionized Natural Language Processing (NLP), achieving exceptional results across various tasks, including Sentiment Analysis (SA). As such, current state-of-the-art approaches for SA predominantly rely on transformer models alone, achieving impressive accuracy levels on benchmark datasets. In this paper, we show that the key for further improving the accuracy of such ensembles for SA is to include not only transformers, but also traditional NLP models, despite the inferiority of the latter compared to transformer models. However, as we empirically show, this necessitates a change in how the ensemble is constructed, specifically relying on the Hierarchical Ensemble Construction (HEC) algorithm we present. Our empirical studies across eight canonical SA datasets reveal that ensembles incorporating a mix of model types, structured via HEC, significantly outperform traditional ensembles. Finally, we provide a comparative analysis of the performance of the HEC and GPT-4, demonstrating that while GPT-4 closely approaches state-of-the-art SA methods, it remains outperformed by our proposed ensemble strategy.

accuracy, dataset, ensemble, (13 more...)

arXiv.org Artificial Intelligence

2402.167

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
Europe > Italy > Campania > Naples (0.04)
(2 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Add feedback

Let's Rectify Step by Step: Improving Aspect-based Sentiment Analysis with Diffusion Models

Liu, Shunyu, Zhou, Jie, Zhu, Qunxi, Chen, Qin, Bai, Qingchun, Xiao, Jun, He, Liang

arXiv.org Artificial IntelligenceFeb-23-2024

Aspect-Based Sentiment Analysis (ABSA) stands as a crucial task in predicting the sentiment polarity associated with identified aspects within text. However, a notable challenge in ABSA lies in precisely determining the aspects' boundaries (start and end indices), especially for long ones, due to users' colloquial expressions. We propose DiffusionABSA, a novel diffusion model tailored for ABSA, which extracts the aspects progressively step by step. Particularly, DiffusionABSA gradually adds noise to the aspect terms in the training process, subsequently learning a denoising process that progressively restores these terms in a reverse manner. To estimate the boundaries, we design a denoising neural network enhanced by a syntax-aware temporal attention mechanism to chronologically capture the interplay between aspects and surrounding text. Empirical evaluations conducted on eight benchmark datasets underscore the compelling advantages offered by DiffusionABSA when compared against robust baseline models.

computational linguistic, diffusion model, diffusionabsa, (14 more...)

arXiv.org Artificial Intelligence

2402.15289

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > China > Shanghai > Shanghai (0.05)
(11 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.87)

Add feedback

CARBD-Ko: A Contextually Annotated Review Benchmark Dataset for Aspect-Level Sentiment Classification in Korean

Jang, Dongjun, Seo, Jean, Byun, Sungjoo, Kim, Taekyoung, Kim, Minseok, Shin, Hyopil

arXiv.org Artificial IntelligenceFeb-22-2024

The effectiveness of various pretrained language models, including BERT [Devlin et al., 2018], XLNet [Yang et al., 2019], BART [Lewis et al., 2020], and GPT-3, in sentiment classification, a significant downstream task, has been extensively studied. Current research in sentiment classification often focuses on identifying sentiment polarities at the aspect level, leading to the emergence of aspect-based sentiment classification (ABSC). Many studies have achieved impressive results and introduced innovative approaches to tackle the ABSC task. For instance, Sun et al. [2019] utilized BERT to transform ABSC tasks into sentence-pair classification, which has influenced subsequent methodologies [Hu et al., 2022]. Additionally, generative models like BART [Lewis et al., 2020] have been employed by Yan et al. [2021] to convert ABSC tasks into sequence-to-sequence tasks, enabling the prediction of token sequences representing identified aspects and associated sentiments. Furthermore, Li et al. [2021a] reframed ABSC tasks as masked language modeling tasks, effectively bridging the performance gap between pre-training and ABSC tasks. Despite numerous attempts to address aspect-level sentiment classification, the primary focus has been on improving aspect-level sentiment polarity performance through specialized datasets and training methodologies. However, it is equally crucial for models to predict not only the in-context polarity of aspects but also their aspect polarity.

dataset, polarity, siamese network, (9 more...)

arXiv.org Artificial Intelligence

2402.15046

Country:

Asia > South Korea > Seoul > Seoul (0.06)
North America > Dominican Republic (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Domain Generalization via Causal Adjustment for Cross-Domain Sentiment Analysis

Wang, Siyin, Zhou, Jie, Chen, Qin, Zhang, Qi, Gui, Tao, Huang, Xuanjing

arXiv.org Artificial IntelligenceFeb-22-2024

Domain adaption has been widely adapted for cross-domain sentiment analysis to transfer knowledge from the source domain to the target domain. Whereas, most methods are proposed under the assumption that the target (test) domain is known, making them fail to generalize well on unknown test data that is not always available in practice. In this paper, we focus on the problem of domain generalization for cross-domain sentiment analysis. Specifically, we propose a backdoor adjustment-based causal model to disentangle the domain-specific and domain-invariant representations that play essential roles in tackling domain shift. First, we rethink the cross-domain sentiment analysis task in a causal view to model the causal-and-effect relationships among different variables. Then, to learn an invariant feature representation, we remove the effect of domain confounders (e.g., domain knowledge) using the backdoor adjustment. A series of experiments over many homologous and diverse datasets show the great performance and robustness of our model by comparing it with the state-of-the-art domain generalization baselines.

computational linguistic, generalization, representation, (14 more...)

arXiv.org Artificial Intelligence

2402.14536

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(10 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Add feedback

From Adoption to Adaption: Tracing the Diffusion of New Emojis on Twitter

Zhou, Yuhang, Lu, Xuan, Ai, Wei

arXiv.org Artificial IntelligenceFeb-21-2024

In the rapidly evolving landscape of social media, the introduction of new emojis in Unicode release versions presents a structured opportunity to explore digital language evolution. Analyzing a large dataset of sampled English tweets, we examine how newly released emojis gain traction and evolve in meaning. We find that community size of early adopters and emoji semantics are crucial in determining their popularity. Certain emojis experienced notable shifts in the meanings and sentiment associations during the diffusion process. Additionally, we propose a novel framework utilizing language models to extract words and pre-existing emojis with semantically similar contexts, which enhances interpretation of new emojis. The framework demonstrates its effectiveness in improving sentiment classification performance by substituting unknown new emojis with familiar ones. This study offers a new perspective in understanding how new language units are adopted, adapted, and integrated into the fabric of online communication.

emoji, new emoji, tweet, (17 more...)

arXiv.org Artificial Intelligence

2402.14187

Country:

North America > United States > Arizona (0.04)
Asia > Middle East > Israel (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (0.82)

Industry:

Information Technology (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.49)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.49)
(3 more...)

Add feedback