AITopics | Discourse & Dialogue

Collaborating Authors

Discourse & Dialogue

Understanding Language in Conversations "The problems addressed in discourse research aim to answer two general kinds of questions: (1) what information is contained in extended sequences of utterances that goes beyond the meaning of the individual utterances themselves? (2) how does the context in which an utterance is used affect the meaning of the individual utterances, or parts of them?"
– Barbara Grosz. Overview of Chapter 6: Discourse and Dialogue, Survey of the State of the Art in Human Language Technology (1996).

News Overviews Instructional Materials AI-Alerts Classics

Sentiment Analysis for Measuring Hope and Fear from Reddit Posts During the 2022 Russo-Ukrainian Conflict

Guerra, Alessio, Karakuş, Oktay

arXiv.org Artificial IntelligenceJan-19-2023

This paper proposes a novel lexicon-based unsupervised sentimental analysis method to measure the $``\textit{hope}"$ and $``\textit{fear}"$ for the 2022 Ukrainian-Russian Conflict. $\textit{Reddit.com}$ is utilised as the main source of human reactions to daily events during nearly the first three months of the conflict. The top 50 $``hot"$ posts of six different subreddits about Ukraine and news (Ukraine, worldnews, Ukraina, UkrainianConflict, UkraineWarVideoReport, UkraineWarReports) and their relative comments are scraped and a data set is created. On this corpus, multiple analyses such as (1) public interest, (2) hope/fear score, (3) stock price interaction are employed. We promote using a dictionary approach, which scores the hopefulness of every submitted user post. The Latent Dirichlet Allocation (LDA) algorithm of topic modelling is also utilised to understand the main issues raised by users and what are the key talking points. Experimental analysis shows that the hope strongly decreases after the symbolic and strategic losses of Azovstal (Mariupol) and Severodonetsk. Spikes in hope/fear, both positives and negatives, are present after important battles, but also some non-military events, such as Eurovision and football games.

machine learning, measuring hope, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.3389/frai.2023.1163577

2301.08347

Country:

Asia > Russia (1.00)
Europe > Ukraine > Donetsk Oblast > Mariupol (0.24)

Genre: Research Report > Experimental Study (0.46)

Industry:

Media > News (1.00)
Leisure & Entertainment (1.00)
Government > Military (1.00)
(4 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Add feedback

A Quantitative Exploration of Natural Language Processing Applications for Electricity Demand Analysis

Bai, Yun, Camal, Simon, Michiorri, Andrea

arXiv.org Artificial IntelligenceJan-18-2023

The relationship between electricity demand and weather has been established for a long time and is one of the cornerstones in load prediction for operation and planning, along with behavioral and social aspects such as calendars or significant events. This paper explores how and why the social information contained in the news can be used better to understand aggregate population behaviour in terms of energy demand. The work is done through experiments analysing the impact of predicting features extracted from national news on day-ahead electric demand prediction. The results are compared to a benchmark model trained exclusively on the calendar and meteorological information. Experimental results showed that the best-performing model reduced the official standard errors around 4%, 11%, and 10% in terms of RMSE, MAE, and SMAPE. The best-performing methods are: word frequency identified COVID-19-related keywords; topic distribution that identified news on the pandemic and internal politics; global word embeddings that identified news about international conflicts. This study brings a new perspective to traditional electricity demand analysis and confirms the feasibility of improving its predictions with unstructured information contained in texts, with potential consequences in sociology and economics.

forecasting, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2301.07535

Country: Europe > United Kingdom (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry:

Energy > Power Industry (1.00)
Energy > Oil & Gas (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.69)

Add feedback

KILDST: Effective Knowledge-Integrated Learning for Dialogue State Tracking using Gazetteer and Speaker Information

Choi, Hyungtak, Ko, Hyeonmok, Kaur, Gurpreet, Ravuru, Lohith, Gandikota, Kiranmayi, Jhawar, Manisha, Dharani, Simma, Patil, Pranamya

arXiv.org Artificial IntelligenceJan-18-2023

Dialogue State Tracking (DST) is core research in dialogue systems and has received much attention. In addition, it is necessary to define a new problem that can deal with dialogue between users as a step toward the conversational AI that extracts and recommends information from the dialogue between users. So, we introduce a new task - DST from dialogue between users about scheduling an event (DST-USERS). The DST-USERS task is much more challenging since it requires the model to understand and track dialogue states in the dialogue between users and to understand who suggested the schedule and who agreed to the proposed schedule. To facilitate DST-USERS research, we develop dialogue datasets between users that plan a schedule. The annotated slot values which need to be extracted in the dialogue are date, time, and location. Previous approaches, such as Machine Reading Comprehension (MRC) and traditional DST techniques, have not achieved good results in our extensive evaluations. By adopting the knowledge-integrated learning method, we achieve exceptional results. The proposed model architecture combines gazetteer features and speaker information efficiently. Our evaluations of the dialogue datasets between users that plan a schedule show that our model outperforms the baseline model.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2301.07341

Country:

Asia > India > Karnataka > Bengaluru (0.05)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)

Add feedback

Summative Student Course Review Tool Based on Machine Learning Sentiment Analysis to Enhance Life Science Feedback Efficacy

Hoar, Ben, Ramachandran, Roshini, Levis, Marc, Sparck, Erin, Wu, Ke, Liu, Chong

arXiv.org Artificial IntelligenceJan-15-2023

Machine learning enables the development of new, supplemental, and empowering tools that can either expand existing technologies or invent new ones. In education, space exists for a tool that supports generic student course review formats to organize and recapitulate students' views on the pedagogical practices to which they are exposed. Often, student opinions are gathered with a general comment section that solicits their feelings towards their courses without polling specifics about course contents. Herein, we show a novel approach to summarizing and organizing students' opinions via analyzing their sentiment towards a course as a function of the language/vocabulary used to convey their opinions about a class and its contents. This analysis is derived from their responses to a general comment section encountered at the end of post-course review surveys. This analysis, accomplished with Python, LaTeX, and Google's Natural Language API, allows for the conversion of unstructured text data into both general and topic-specific sub-reports that convey students' views in a unique, novel way.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2301.06173

Country:

South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.88)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.86)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.54)

Add feedback

Natural Language Processing of Aviation Occurrence Reports for Safety Management

Jonk, Patrick, de Vries, Vincent, Wever, Rombout, Sidiropoulos, Georgios, Kanoulas, Evangelos

arXiv.org Artificial IntelligenceJan-13-2023

Occurrence reporting is a commonly used method in safety management systems to obtain insight in the prevalence of hazards and accident scenarios. In support of safety data analysis, reports are often categorized according to a taxonomy. However, the processing of the reports can require significant effort from safety analysts and a common problem is interrater variability in labeling processes. Also, in some cases, reports are not processed according to a taxonomy, or the taxonomy does not fully cover the contents of the documents. This paper explores various Natural Language Processing (NLP) methods to support the analysis of aviation safety occurrence reports. In particular, the problems studied are the automatic labeling of reports using a classification model, extracting the latent topics in a collection of texts using a topic model and the automatic generation of probable cause texts. Experimental results showed that (i) under the right conditions the labeling of occurrence reports can be effectively automated with a transformer-based classifier, (ii) topic modeling can be useful for finding the topics present in a collection of reports, and (iii) using a summarization model can be a promising direction for generating probable cause texts.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.3850/978-981-18-5183-4_S05-06-449-cd

2301.05663

Country:

North America > United States (1.00)
North America > Canada (0.05)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(3 more...)

Genre: Research Report > New Finding (0.86)

Industry:

Transportation > Air (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.49)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Adaptation of domain-specific transformer models with text oversampling for sentiment analysis of social media posts on Covid-19 vaccines

Bansal, Anmol, Choudhry, Arjun, Sharma, Anubhav, Susan, Seba

arXiv.org Artificial IntelligenceJan-13-2023

Covid-19 has spread across the world and several vaccines have been developed to counter its surge. To identify the correct sentiments associated with the vaccines from social media posts, we fine-tune various state-of-the-art pre-trained transformer models on tweets associated with Covid-19 vaccines. Specifically, we use the recently introduced state-of-the-art pre-trained transformer models RoBERTa, XLNet and BERT, and the domain-specific transformer models CT-BERT and BERTweet that are pre-trained on Covid-19 tweets. We further explore the option of text augmentation by oversampling using Language Model based Oversampling Technique (LMOTE) to improve the accuracies of these models, specifically, for small sample datasets where there is an imbalanced class distribution among the positive, negative and neutral sentiment classes. Our results summarize our findings on the suitability of text oversampling for imbalanced small sample datasets that are used to fine-tune state-of-the-art pre-trained transformer models, and the utility of domain-specific transformer models for the classification task.

artificial intelligence, domain-specific transformer model, natural language, (5 more...)

arXiv.org Artificial Intelligence

2209.10966

Genre: Research Report > New Finding (0.53)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.40)

Add feedback

Topics in Contextualised Attention Embeddings

Talebpour, Mozhgan, de Herrera, Alba Garcia Seco, Jameel, Shoaib

arXiv.org Artificial IntelligenceJan-11-2023

Contextualised word vectors obtained via pre-trained language models encode a variety of knowledge that has already been exploited in applications. Complementary to these language models are probabilistic topic models that learn thematic patterns from the text. Recent work has demonstrated that conducting clustering on the word-level contextual representations from a language model emulates word clusters that are discovered in latent topics of words from Latent Dirichlet Allocation. The important question is how such topical word clusters are automatically formed, through clustering, in the language model when it has not been explicitly designed to model latent topics. To address this question, we design different probe experiments. Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters. We strongly believe that our work paves way for further research into the relationships between probabilistic topic models and pre-trained language models.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2301.04339

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Common NLP tasks and techniques. Sentiment analysis

#artificialintelligenceJan-10-2023, 09:55:06 GMT

Sentiment analysis, also known as opinion mining, is the process of determining the attitude or emotion of the writer towards a particular topic. Sentiment analysis is used in a wide range of applications, such as market research, brand management, and political analysis. There are several techniques used in sentiment analysis, including lexicon-based methods, which use a predefined list of words and their associated sentiment, and machine learning-based methods, which use algorithms to learn patterns in the data. Text classification is the process of automatically categorizing text into predefined categories or labels. Text classification has a wide range of applications, including spam detection, sentiment analysis, and topic classification.

artificial intelligence, classification, natural language, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.90)

Add feedback

User-Centered Security in Natural Language Processing

Emmery, Chris

arXiv.org Artificial IntelligenceJan-10-2023

This dissertation proposes a framework of user-centered security in Natural Language Processing (NLP), and demonstrates how it can improve the accessibility of related research. Accordingly, it focuses on two security domains within NLP with great public interest. First, that of author profiling, which can be employed to compromise online privacy through invasive inferences. Without access and detailed insight into these models' predictions, there is no reasonable heuristic by which Internet users might defend themselves from such inferences. Secondly, that of cyberbullying detection, which by default presupposes a centralized implementation; i.e., content moderation across social platforms. As access to appropriate data is restricted, and the nature of the task rapidly evolves (both through lexical variation, and cultural shifts), the effectiveness of its classifiers is greatly diminished and thereby often misrepresented. Under the proposed framework, we predominantly investigate the use of adversarial attacks on language; i.e., changing a given input (generating adversarial samples) such that a given model does not function as intended. These attacks form a common thread between our user-centered security problems; they are highly relevant for privacy-preserving obfuscation methods against author profiling, and adversarial samples might also prove useful to assess the influence of lexical variation and augmentation on cyberbullying detection.

large language model, machine learning, text classification, (21 more...)

arXiv.org Artificial Intelligence

2301.0423

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Maryland > Baltimore (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
(105 more...)

Genre:

Research Report > New Finding (1.00)
Overview (0.93)
Instructional Material > Course Syllabus & Notes (0.67)
(2 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(7 more...)

Add feedback

Understanding the Complexity and Its Impact on Testing in ML-Enabled Systems

Cao, Junming, Chen, Bihuan, Hu, Longjie, Gao, Jie, Huang, Kaifeng, Peng, Xin

arXiv.org Artificial IntelligenceJan-10-2023

Machine learning (ML) enabled systems are emerging with recent breakthroughs in ML. A model-centric view is widely taken by the literature to focus only on the analysis of ML models. However, only a small body of work takes a system view that looks at how ML components work with the system and how they affect software engineering for MLenabled systems. In this paper, we adopt this system view, and conduct a case study on Rasa 3.0, an industrial dialogue system that has been widely adopted by various companies around the world. Our goal is to characterize the complexity of such a largescale ML-enabled system and to understand the impact of the complexity on testing. Our study reveals practical implications for software engineering for ML-enabled systems.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2301.03837

Country:

Asia > Singapore (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (1.00)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)

Add feedback