Information Extraction
Sentiment Analysis of ESG disclosures on Stock Market
Bapat, Sudeep R., Kothari, Saumya, Bansal, Rushil
In this paper, we look at the impact of Environment, Social and Governance (ESG) related news articles and social media data on stock market performance. We pick four stocks of companies that are widely known in their domains to understand the complete effect of ESG, as this newly adopted investment style remains restricted to stocks with widespread information. We summarise live data from both tweets and newspaper articles and create a sentiment index using a dictionary technique based on online information for the month of July 2022. We look at the stock price data for all four companies and calculate the percentage change in each of them. We also compare the overall sentiment of each company to its percentage change over a specific historical period.
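A dictionary-based sentiment index and a percentage price change, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: the word lists, the (pos - neg) / (pos + neg) formula, the example texts, and the prices are all hypothetical and are not taken from the paper.

```python
# Hypothetical mini-lexicon; the dictionary used in the paper is not given here.
POSITIVE = {"sustainable", "growth", "clean", "improve", "award"}
NEGATIVE = {"pollution", "fine", "scandal", "lawsuit", "spill"}


def sentiment_index(texts):
    """Return (pos - neg) / (pos + neg) over all texts, or 0.0 without lexicon hits."""
    pos = neg = 0
    for text in texts:
        for token in text.lower().split():
            word = token.strip(".,!?;:\"'()")  # drop surrounding punctuation
            if word in POSITIVE:
                pos += 1
            elif word in NEGATIVE:
                neg += 1
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def percent_change(start_price, end_price):
    """Percentage change between two prices."""
    return (end_price - start_price) / start_price * 100.0


# Made-up example inputs (assumptions, not data from the paper).
docs = [
    "Company wins award for clean energy growth.",
    "Regulator issues fine after oil spill.",
]
index = sentiment_index(docs)
change = percent_change(100.0, 105.0)
```

An index above zero means positive lexicon words outnumber negative ones; it can then be set against the percentage change for the same period.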
Longitudinal Sentiment Analyses for Radicalization Research: Intertemporal Dynamics on Social Media Platforms and their Implications
This discussion paper demonstrates how longitudinal sentiment analyses can depict intertemporal dynamics on social media platforms, what challenges are inherent, and how further research could benefit from a longitudinal perspective. Furthermore, since tools for sentiment analysis are meant to simplify and accelerate the analysis of qualitative data at acceptable inter-rater reliability, their applicability in the context of radicalization research is examined using Tweets collected on January 6th, 2021, the day of the storming of the U.S. Capitol in Washington. To this end, a total of 49,350 Tweets are analyzed, evenly distributed across three sequences: before, during and after the U.S. Capitol was stormed. These sequences highlight the intertemporal dynamics within comments on social media platforms as well as the possible benefits of a longitudinal perspective when using conditional means and conditional variances. Limitations regarding the identification of supporters of such events and associated hate speech, as well as common application errors, are also demonstrated. As a result, a longitudinal sentiment analysis can increase the accuracy of evidence-based predictions in the context of radicalization research only under certain conditions.
A Survey: Credit Sentiment Score Prediction
Alam, A. N. M. Sajedul, Kibria, Junaid Bin, Dey, Arnob Kumar, Alam, Zawad, Zaman, Shifat, Mahtab, Motahar, Mahbub, Mohammed Julfikar Ali, Rasel, Annajiat Alim
Manual approvals are still used by banks and NGOs to approve loans. The process takes time and is prone to mistakes because it is controlled by a bank employee. Several machine learning and data mining techniques have been used to improve various aspects of credit rating forecasting. A major goal of this research is to review current sentiment analysis techniques that are being used to assess creditworthiness.
From Theories on Styles to their Transfer in Text: Bridging the Gap with a Hierarchical Survey
Troiano, Enrica, Velutharambath, Aswathy, Klinger, Roman
Humans are naturally endowed with the ability to write in a particular style. They can, for instance, re-phrase a formal letter in an informal way, convey a literal message with the use of figures of speech or edit a novel by mimicking the style of some well-known authors. Automating this form of creativity constitutes the goal of style transfer. As a natural language generation task, style transfer aims at rewriting existing texts, and specifically, it creates paraphrases that exhibit some desired stylistic attributes. From a practical perspective, it envisions beneficial applications, like chatbots that modulate their communicative style to appear empathetic, or systems that automatically simplify technical articles for a non-expert audience. Several style-aware paraphrasing methods have attempted to tackle style transfer. A handful of surveys give a methodological overview of the field, but they do not help researchers focus on specific styles. With this paper, we aim at providing a comprehensive discussion of the styles that have received attention in the transfer task. We organize them in a hierarchy, highlighting the challenges for the definition of each of them, and pointing out gaps in the current research landscape. The hierarchy comprises two main groups. One encompasses styles that people modulate arbitrarily, along the lines of registers and genres. The other group corresponds to unintentionally expressed styles, due to an author's personal characteristics. Hence, our review shows how these groups relate to one another, and where specific styles, including some that have not yet been explored, belong in the hierarchy. Moreover, we summarize the methods employed for different stylistic families, pointing researchers towards those that would be the most fitting for future research.
Happy or grumpy? A Machine Learning Approach to Analyze the Sentiment of Airline Passengers' Tweets
As one of the most extensive social networking services, Twitter has more than 300 million active users as of 2022. Among its many functions, Twitter is now one of the go-to platforms for consumers to share their opinions about products or experiences, including flight services provided by commercial airlines. This study aims to measure customer satisfaction by analyzing sentiments of Tweets that mention airlines using a machine learning approach. Relevant Tweets are retrieved from Twitter's API and processed through tokenization and vectorization. After that, these processed vectors are passed into a pre-trained machine learning classifier to predict the sentiments. In addition to sentiment analysis, we also perform lexical analysis on the collected Tweets to model keywords' frequencies, which provide meaningful contexts to facilitate the interpretation of sentiments. We then apply time series methods such as Bollinger Bands to detect abnormalities in sentiment data. Using historical records from January to July 2022, our approach is proven to be capable of capturing sudden and significant changes in passengers' sentiment. This study has the potential to be developed into an application that can help airlines, along with several other customer-facing businesses, efficiently detect abrupt changes in customers' sentiments and take adequate measures to counteract them.
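The use of Bollinger Bands to flag abnormal sentiment values, mentioned in the abstract above, can be sketched as follows. This is a simplified variant under stated assumptions: the bands are computed from the trailing window that precedes each point, and the window size, the band width k, and the daily sentiment scores are hypothetical, not taken from the study.

```python
from statistics import mean, pstdev


def bollinger_anomalies(series, window=5, k=2.0):
    """Return indices whose value lies outside mean +/- k * std of the preceding window."""
    flagged = []
    for i in range(window, len(series)):
        past = series[i - window:i]   # trailing window, excluding the current point
        mid = mean(past)              # middle band (moving average)
        band = k * pstdev(past)       # half-width of the band
        if series[i] > mid + band or series[i] < mid - band:
            flagged.append(i)
    return flagged


# Made-up daily average sentiment scores (an assumption for illustration).
daily_sentiment = [0.30, 0.32, 0.29, 0.31, 0.30, 0.31, -0.40, 0.30]
anomalies = bollinger_anomalies(daily_sentiment)
```

In this toy series only the sudden drop at index 6 falls outside the bands. Note that the drop also widens the bands for the following days, which is a known property of rolling-window methods.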
Lex2Sent: A bagging approach to unsupervised sentiment analysis
Lange, Kai-Robin, Rieger, Jonas, Jentsch, Carsten
Unsupervised sentiment analysis is traditionally performed by counting those words in a text that are stored in a sentiment lexicon and then assigning a label depending on the proportion of positive and negative words registered. While these "counting" methods are considered to be beneficial as they rate a text deterministically, their classification rates decrease when the analyzed texts are short or the vocabulary differs from what the lexicon considers default. The model proposed in this paper, called Lex2Sent, is an unsupervised sentiment analysis method to improve the classification of sentiment lexicon methods. For this purpose, a Doc2Vec-model is trained to determine the distances between document embeddings and the embeddings of the positive and negative part of a sentiment lexicon. These distances are then evaluated for multiple executions of Doc2Vec on resampled documents and are averaged to perform the classification task. For three benchmark datasets considered in this paper, the proposed Lex2Sent outperforms every evaluated lexicon, including state-of-the-art lexica like VADER or the Opinion Lexicon in terms of classification rate.
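The traditional "counting" baseline described at the start of the abstract above can be sketched as follows. This is not Lex2Sent itself, which additionally relies on Doc2Vec embeddings and resampling. The tiny word lists are hypothetical stand-ins for a real sentiment lexicon.

```python
# Hypothetical mini-lexicon standing in for a real sentiment lexicon.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def count_label(text):
    """Label a text by comparing counts of positive and negative lexicon words."""
    tokens = [t.strip(".,!?;:\"'()") for t in text.lower().split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # tie, or no lexicon word found


labels = [
    count_label("Great food, excellent service!"),
    count_label("Terrible and bad."),
    count_label("The bus arrived."),
]
```

The third text contains no lexicon word and so gets no usable label. This illustrates the weakness on short or out-of-vocabulary texts that the abstract mentions.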
TSAM: A Two-Stream Attention Model for Causal Emotion Entailment
Zhang, Duzhen, Yang, Zhen, Meng, Fandong, Chen, Xiuyi, Zhou, Jie
Causal Emotion Entailment (CEE) aims to discover the potential causes behind an emotion in a conversational utterance. Previous works formalize CEE as independent utterance pair classification problems, with emotion and speaker information neglected. From a new perspective, this paper considers CEE in a joint framework. We classify multiple utterances synchronously to capture the correlations between utterances in a global view and propose a Two-Stream Attention Model (TSAM) to effectively model the speaker's emotional influences in the conversational history. Specifically, the TSAM comprises three modules: Emotion Attention Network (EAN), Speaker Attention Network (SAN), and interaction module. The EAN and SAN incorporate emotion and speaker information in parallel, and the subsequent interaction module effectively interchanges relevant information between the EAN and SAN via a mutual BiAffine transformation. Extensive experimental results demonstrate that our model achieves new State-Of-The-Art (SOTA) performance and outperforms baselines remarkably.
Identifying Offensive Expressions of Opinion in Context
Vargas, Francielle Alves, Carvalho, Isabelle, de Góes, Fabiana Rodrigues
Classic information extraction techniques consist of building questions and answers about facts. Indeed, it is still a challenge for subjective information extraction systems to identify opinions and feelings in context. In sentiment-based NLP tasks, there are few resources for information extraction, above all for offensive or hateful opinions in context. To fill this important gap, this short paper provides a new cross-lingual and contextual offensive lexicon, which consists of explicit and implicit offensive and swearing expressions of opinion, annotated in two different classes: context-dependent and context-independent offensive. In addition, we provide markers to identify hate speech. The annotation approach was evaluated at the expression level and achieves high human inter-annotator agreement. The provided offensive lexicon is available in Portuguese and English.
Transition to Adulthood for Young People with Intellectual or Developmental Disabilities: Emotion Detection and Topic Modeling
Liu, Yan, Laricheva, Maria, Zhang, Chiyu, Boutet, Patrick, Chen, Guanyu, Tracey, Terence, Carenini, Giuseppe, Young, Richard
Transition to adulthood is an essential life stage for many families. Prior research has shown that young people with intellectual or developmental disabilities (IDD) face more challenges than their peers. This study explores how to use natural language processing (NLP) methods, especially unsupervised machine learning, to assist psychologists in analyzing emotions and sentiments, and how to use topic modeling to identify common issues and challenges that young people with IDD and their families have. Additionally, the results were compared to those obtained from young people without IDD who were in transition to adulthood. The findings showed that NLP methods can be very useful for psychologists to analyze emotions, conduct cross-case analysis, and summarize key topics from conversational data. Our Python code is available at https://github.com/mlaricheva/emotion_topic_modeling.
Find the Funding: Entity Linking with Incomplete Funding Knowledge Bases
Aydin, Gizem, Tabatabaei, Seyed Amin, Tsatsaronis, Giorgios, Hasibi, Faegheh
Automatic extraction of funding information from academic articles adds significant value to industry and research communities, such as tracking research outcomes by funding organizations, profiling researchers and universities based on the received funding, and supporting open access policies. Two major challenges of identifying and linking funding entities are: (i) sparse graph structure of the Knowledge Base (KB), which makes the commonly used graph-based entity linking approaches suboptimal for the funding domain, (ii) missing entities in KB, which (unlike recent zero-shot approaches) requires marking entity mentions without KB entries as NIL. We propose an entity linking model that can perform NIL prediction and overcome data scarcity issues in a time and data-efficient manner. Our model builds on a transformer-based mention detection and bi-encoder model to perform entity linking. We show that our model outperforms strong existing baselines.
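One common way to combine bi-encoder entity linking with NIL prediction is to compare a mention embedding against every KB entity embedding and return NIL when no similarity reaches a threshold. The sketch below illustrates that general idea only. The abstract does not say how the model decides on NIL, so the thresholding rule, the threshold value, the entity IDs, and the toy two-dimensional vectors are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_or_nil(mention_vec, kb_vecs, threshold=0.8):
    """Return the best-matching entity ID, or "NIL" if no score reaches the threshold."""
    best_id, best_score = "NIL", threshold
    for entity_id, vec in kb_vecs.items():
        score = cosine(mention_vec, vec)
        if score >= best_score:
            best_id, best_score = entity_id, score
    return best_id


# Toy embeddings standing in for encoder outputs (hypothetical values and IDs).
kb = {"funder_a": [1.0, 0.0], "funder_b": [0.0, 1.0]}
linked = link_or_nil([0.9, 0.1], kb)    # close to funder_a
unlinked = link_or_nil([0.7, 0.7], kb)  # not close enough to either entity
```

In practice the vectors would come from trained mention and entity encoders, and the threshold would be tuned on held-out data.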