 Discourse & Dialogue


Social Media Sentiment Analysis Using Twitter Datasets - DataScienceCentral.com

#artificialintelligence

Users upload hundreds of thousands of raw data files to social media sites every day. Online user data provides access to an enormous amount of information about products, services, places, and events, which makes it suitable for sentiment analysis. Valuable information can be extracted by analyzing the sentiment of the data. Sentiment analysis is a method for interpreting opinions within a text that uses Natural Language Processing (NLP) to extract positive, negative, and neutral meanings from user-generated content shared on social media platforms. It has previously been applied to product and movie reviews to better understand customers' interests and thus improve outcomes and service offerings.
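To make the positive/negative/neutral labeling concrete, here is a minimal sketch of a lexicon-based classifier. The word lists, the function name `classify_sentiment`, and the scoring rule are illustrative assumptions and are not taken from the article; practical systems typically rely on much larger lexicons or trained models.

```python
import re

# Hypothetical toy lexicons (assumption); real systems use far larger resources.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

def classify_sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    # Lowercase and split into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `classify_sentiment("I love this phone, great camera")` counts two positive words and no negative ones, so the text is labeled positive. A counting rule like this ignores negation and sarcasm, which is one reason learned models are usually preferred for social media text.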


Human Review Workflow with AWS A2I

#artificialintelligence

Some deep learning and machine learning applications need human oversight of sensitive data to ensure accuracy. Human review supports the collection of health data, increases model accuracy, and enables continuous improvement with updated predictions. Data augmentation is a critical process on which data companies spend large sums. Today, we will create a sentiment analysis workflow using a human review workflow in Amazon Augmented AI (Amazon A2I).


Unsupervised Learning of Hierarchical Conversation Structure

arXiv.org Artificial Intelligence

Human conversations can evolve in many different ways, creating challenges for automatic understanding and summarization. Goal-oriented conversations often have meaningful sub-dialogue structure, but it can be highly domain-dependent. This work introduces an unsupervised approach to learning hierarchical conversation structure, including turn and sub-dialogue segment labels, corresponding roughly to dialogue acts and sub-tasks, respectively. The decoded structure is shown to be useful in enhancing neural models of language for three conversation-level understanding tasks. Further, the learned finite-state sub-dialogue network is made interpretable through automatic summarization.


BERT-ASC: Implicit Aspect Representation Learning through Auxiliary-Sentence Construction for Sentiment Analysis

arXiv.org Artificial Intelligence

The aspect-based sentiment analysis (ABSA) task aims at associating a piece of text with a set of aspects while inferring their respective sentiment polarities. The state-of-the-art approaches are built upon fine-tuning of various pre-trained language models. They commonly attempt to learn an aspect-specific representation from the corpus. Unfortunately, the aspect is often expressed implicitly through a set of representatives, which renders the implicit mapping process unattainable unless sufficient labeled examples are available. However, high-quality labeled examples may not be readily available in real-world scenarios. In this paper, we propose to jointly address the aspect categorization and aspect-based sentiment subtasks in a unified framework. Specifically, we first introduce a simple but effective mechanism to construct an auxiliary sentence for the implicit aspect based on the semantic information in the corpus. Then, we encourage BERT to learn the aspect-specific representation in response to the automatically constructed auxiliary sentence instead of the aspect itself. Finally, we empirically evaluate the performance of the proposed solution by a comparative study on real benchmark datasets for both ABSA and Targeted-ABSA tasks. Our extensive experiments show that it consistently achieves state-of-the-art performance in terms of aspect categorization and aspect-based sentiment across all datasets, and the improvement margins are considerable. The code of BERT-ASC is available on GitHub: https://github.com/amurtadha/BERT-ASC.


Self-Training with Purpose Preserving Augmentation Improves Few-shot Generative Dialogue State Tracking

arXiv.org Artificial Intelligence

In dialogue state tracking (DST), labeling the dataset involves considerable human labor. We propose a new self-training framework for few-shot generative DST that utilizes unlabeled data. Our self-training method iteratively improves the model by pseudo labeling and employs Purpose Preserving augmentation (PPaug) to prevent overfitting. We increase the few-shot (10%) performance by approximately 4% on MultiWOZ 2.1 (Eric et al., 2019) and enhance the slot-recall by 8.34% for unseen values compared to baseline.

Figure 1: Dialogue example of DST dataset and its belief state. The underlined part of the dialogue is the value of the belief state and has specific information
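The iterative pseudo-labeling idea can be illustrated with a generic self-training loop. This is a toy sketch under stated assumptions, not the authors' method: it uses a hypothetical 1-D nearest-centroid classifier in place of a generative DST model, a hypothetical confidence `threshold`, and it omits the PPaug augmentation step entirely.

```python
from statistics import mean

def fit_centroids(labeled):
    """Fit a toy 1-D nearest-centroid model: mean feature value per label."""
    by_label = {}
    for x, y in labeled:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def predict(centroids, x):
    """Return (label, confidence), where confidence is the distance margin
    between the nearest and second-nearest centroid."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    margin = abs(x - centroids[ranked[1]]) - abs(x - centroids[best])
    return best, margin

def self_train(labeled, unlabeled, threshold=1.0, rounds=5):
    """Repeatedly pseudo-label confident unlabeled points and refit the model."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        centroids = fit_centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict(centroids, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from; stop early
        labeled.extend(confident)          # pseudo-labels join the training set
        pool = [x for x, _ in remaining]   # low-confidence points stay unlabeled
    return fit_centroids(labeled), pool
```

In this sketch only predictions above the confidence threshold are added as pseudo-labels, so ambiguous examples remain in the unlabeled pool instead of reinforcing a possibly wrong guess.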


CDialog: A Multi-turn Covid-19 Conversation Dataset for Entity-Aware Dialog Generation

arXiv.org Artificial Intelligence

The development of conversational agents to interact with patients and deliver clinical advice has attracted the interest of many researchers, particularly in light of the COVID-19 pandemic. The training of an end-to-end neural-based dialog system, on the other hand, is hampered by a lack of multi-turn medical dialog corpora. We make the very first attempt to release a high-quality multi-turn Medical Dialog dataset relating to Covid-19 disease named CDialog, with over 1K conversations collected from online medical counselling websites. We annotate each utterance of the conversation with seven different categories of medical entities, including diseases, symptoms, medical tests, medical history, remedies, medications and other aspects as additional labels. Finally, we propose a novel neural medical dialog system based on the CDialog dataset to advance future research on developing automated medical dialog systems. We use pre-trained language models for dialogue generation, incorporating annotated medical entities, to generate a virtual doctor's response that addresses the patient's query. Experimental results show that the proposed dialog models perform comparatively better when supplemented with entity information and hence can improve the response quality.


Navigating Connected Memories with a Task-oriented Dialog System

arXiv.org Artificial Intelligence

Recent years have seen an increasing trend in the volume of personal media captured by users, thanks to the advent of smartphones and smart glasses, resulting in large media collections. Despite conversation being an intuitive human-computer interface, current efforts focus mostly on single-shot natural language based media retrieval to help users query their media and re-live their memories. This severely limits the search functionality as users can neither ask follow-up queries nor obtain information without first formulating a single-turn query. In this work, we propose dialogs for connected memories as a powerful tool to empower users to search their media collection through a multi-turn, interactive conversation. Towards this, we collect a new task-oriented dialog dataset COMET, which contains 11.5k user<->assistant dialogs (totaling 103k utterances), grounded in simulated personal memory graphs. We employ a resource-efficient, two-phase data collection pipeline that uses: (1) a novel multimodal dialog simulator that generates synthetic dialog flows grounded in memory graphs, and (2) manual paraphrasing to obtain natural language utterances. We analyze COMET, formulate four main tasks to benchmark meaningful progress, and adopt state-of-the-art language models as strong baselines, in order to highlight the multimodal challenges captured by our dataset.


Multilingual and Multimodal Topic Modelling with Pretrained Embeddings

arXiv.org Artificial Intelligence

This paper presents M3L-Contrast -- a novel multimodal multilingual (M3L) neural topic model for comparable data that maps texts from multiple languages and images into a shared topic space. Our model is trained jointly on texts and images and takes advantage of pretrained document and image embeddings to abstract the complexities between different languages and modalities. As a multilingual topic model, it produces aligned language-specific topics, and as a multimodal model, it infers textual representations of semantic concepts in images. We demonstrate that our model is competitive with a zero-shot topic model in predicting topic distributions for comparable multilingual data and significantly outperforms a zero-shot model in predicting topic distributions for comparable texts and images. We also show that our model performs almost as well on unaligned embeddings as it does on aligned embeddings.


Using Open-Ended Stressor Responses to Predict Depressive Symptoms across Demographics

arXiv.org Artificial Intelligence

Stressors are related to depression, but this relationship is complex. We investigate the relationship between open-ended text responses about stressors and depressive symptoms across gender and racial/ethnic groups. First, we use topic models and other NLP tools to find thematic and vocabulary differences when reporting stressors across demographic groups. We train language models using self-reported stressors to predict depressive symptoms, finding a relationship between stressors and depression. Finally, we find that differences in stressors translate to downstream performance differences across demographic groups.


Generative Aspect-Based Sentiment Analysis with Contrastive Learning and Expressive Structure

arXiv.org Artificial Intelligence

Generative models have demonstrated impressive results on Aspect-based Sentiment Analysis (ABSA) tasks, particularly for the emerging task of extracting Aspect-Category-Opinion-Sentiment (ACOS) quadruples. However, these models struggle with implicit sentiment expressions, which are commonly observed in opinionated content such as online reviews. In this work, we introduce GEN-SCL-NAT, which consists of two techniques for improved structured generation for ACOS quadruple extraction. First, we propose GEN-SCL, a supervised contrastive learning objective that aids quadruple prediction by encouraging the model to produce input representations that are discriminable across key input attributes, such as sentiment polarity and the existence of implicit opinions and aspects. Second, we introduce GEN-NAT, a new structured generation format that better adapts autoregressive encoder-decoder models to extract quadruples in a generative fashion. Experimental results show that GEN-SCL-NAT achieves top performance across three ACOS datasets, averaging 1.48% F1 improvement, with a maximum 1.73% increase on the LAPTOP-L1 dataset. Additionally, we see significant gains on implicit aspect and opinion splits that have been shown as challenging for existing ACOS approaches.
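As an illustration of what a supervised contrastive objective computes, here is a minimal sketch of a generic supervised contrastive loss over a small batch of embeddings. It is an assumption that this general form resembles GEN-SCL; the abstract does not give the exact formulation, and the function names and the `temperature` default here are hypothetical. The sketch assumes every embedding is a non-zero vector.

```python
import math

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss: for each anchor, pull same-label
    examples closer and push different-label examples apart."""
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # anchors without a same-label partner contribute nothing
        # Temperature-scaled cosine similarity to every other example.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        # Average negative log-probability of picking each positive.
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

The loss is lower when examples sharing a label (for instance, the same sentiment polarity) lie close together in embedding space and examples with different labels lie far apart, which is the kind of discriminability across input attributes that the abstract describes.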