Information Extraction
Azure Cognitive Services Sentiment Analysis V3: Using PySpark
Azure Cognitive Services Text Analytics is a great tool you can use to quickly evaluate a text data set for positive or negative sentiment. For example, a service provider can quickly and easily evaluate reviews as positive or negative and rank them based on the sentiment score detected. Today I'm going to go through how to use Azure Cognitive Services Text Analytics from a Databricks PySpark notebook to analyze the sentiment of COVID-19 tweets and return sentiment scores along with an indicator of whether each tweet is positive or negative. Cognitive Services are a set of machine learning algorithms that Microsoft has developed to solve problems in the field of Artificial Intelligence (AI). Developers can consume these algorithms in their apps, websites, or workflows through standard REST calls over the Internet to the Cognitive Services APIs.
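To make the "standard REST calls" concrete, here is a minimal sketch of what a sentiment request and its parsed response might look like. It assumes the v3.0 route `/text/analytics/v3.0/sentiment`, the `Ocp-Apim-Subscription-Key` header, and the `sentiment` / `confidenceScores` response fields; the helper names `build_request` and `parse_sentiment`, the endpoint, and the key are hypothetical placeholders, so check them against your own resource before relying on them.

```python
import json
import urllib.request

# Assumed v3.0 route of the Text Analytics REST API; verify against your resource.
SENTIMENT_PATH = "/text/analytics/v3.0/sentiment"


def build_request(endpoint, key, texts, language="en"):
    """Build (but do not send) a POST request for a batch of texts."""
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + SENTIMENT_PATH,
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,  # placeholder key supplied by caller
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_sentiment(response_json):
    """Map each document id to (label, positive score, negative score)."""
    results = {}
    for doc in response_json.get("documents", []):
        scores = doc.get("confidenceScores", {})
        results[doc["id"]] = (
            doc.get("sentiment"),
            scores.get("positive"),
            scores.get("negative"),
        )
    return results
```

Sending the request would then be a matter of passing it to `urllib.request.urlopen` and decoding the JSON body; in a notebook the same two helpers could be applied to batches of tweets collected from a DataFrame.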
Beyond Social Media Analytics: Understanding Human Behaviour and Deep Emotion using Self Structuring Incremental Machine Learning
This thesis develops a conceptual framework that treats social data as the surface layer of a hierarchy of human social behaviours, needs and cognition; the framework is used to transform social data into representations that preserve social behaviours and their causalities. Based on this framework, two platforms were built to capture insights from fast-paced and slow-paced social data. For fast-paced data, a self-structuring and incremental learning technique was developed to automatically capture salient topics and their dynamics over time. An event detection technique was developed to automatically monitor the identified topic pathways for significant fluctuations in social behaviours using multiple indicators such as volume and sentiment. This platform is demonstrated using two large datasets with over 1 million tweets. The separated topic pathways were representative of the key topics of each entity and coherent against topic coherence measures. Identified events were validated against contemporary events reported in news. For slow-paced social data, a suite of new machine learning and natural language processing techniques was developed to automatically capture self-disclosed information about individuals, such as demographics, emotions and timelines of personal events. This platform was trialled on a large text corpus of over 4 million posts collected from online support groups. It was further extended to transform prostate cancer related online support group discussions into a multidimensional representation and to investigate the self-disclosed quality of life of patients (and partners) against time, demographics and clinical factors. The capabilities of this extended platform have been demonstrated using a text corpus collected from 10 prostate cancer online support groups, comprising 609,960 prostate cancer discussions and 22,233 patients.
Forrester names SAS a Leader in AI-Based text analytics platforms (document focused).
According to Forrester, document capture options and mining text from images and cursive writing in multiple languages are key differentiators for document-focused enterprise text analytics platforms, which focus on longer documents, such as contracts, insurance claims, invoices, and purchase orders. The software helps users with sentiment analysis, trend analysis, data preparation and visualization and hybrid modeling approaches. The report states, "SAS Visual Text Analytics bolsters SAS' family of formidable analytics products. SAS Visual Text Analytics is one of [SAS'] several applications built on the SAS Viya platform, where all applications share data and model management, BI and analytics GUI and other microservices, resulting in consistent UX."
A Blast From the Past: Personalizing Predictions of Video-Induced Emotions using Personal Memories as Context
Dudzik, Bernd, Broekens, Joost, Neerincx, Mark, Hung, Hayley
A key challenge in the accurate prediction of viewers' emotional responses to video stimuli in real-world applications is accounting for person- and situation-specific variation. An important contextual influence shaping individuals' subjective experience of a video is the personal memories that it triggers in them. Prior research has found that this memory influence explains more variation in video-induced emotions than other contextual variables commonly used for personalizing predictions, such as viewers' demographics or personality. In this article, we show that (1) automatic analysis of text describing viewers' video-triggered memories can account for variation in their emotional responses, and (2) combining such an analysis with that of a video's audiovisual content enhances the accuracy of automatic predictions. We discuss the relevance of these findings for improving on state-of-the-art approaches to automated affective video analysis in personalized contexts.
Cross-language sentiment analysis of European Twitter messages during the COVID-19 pandemic
Kruspe, Anna, Häberle, Matthias, Kuhn, Iona, Zhu, Xiao Xiang
Social media data can be a very salient source of information during crises. User-generated messages provide a window into people's minds during such times, allowing us insights about their moods and opinions. Due to the vast amounts of such messages, a large-scale analysis of population-wide developments becomes possible. In this paper, we analyze Twitter messages (tweets) collected during the first months of the COVID-19 pandemic in Europe with regard to their sentiment. This is implemented with a neural network for sentiment analysis using multilingual sentence embeddings. We separate the results by country of origin, and correlate their temporal development with events in those countries. This allows us to study the effect of the situation on people's moods. We see, for example, that lockdown announcements correlate with a deterioration of mood in almost all surveyed countries, which recovers within a short time span.
Multi-Label Sentiment Analysis on 100 Languages with Dynamic Weighting for Label Imbalance
Yilmaz, Selim F., Kaynak, E. Batuhan, Koç, Aykut, Dibeklioğlu, Hamdi, Kozat, Suleyman S.
We investigate cross-lingual sentiment analysis, which has attracted significant attention due to its applications in various areas including market research, politics and social sciences. In particular, we introduce a sentiment analysis framework in a multi-label setting, as it obeys Plutchik's wheel of emotions. We introduce a novel dynamic weighting method that balances the contribution from each class during training, unlike previous static weighting methods that assign fixed weights based on class frequency. Moreover, we adapt the focal loss, which favors harder instances, from the single-label object recognition literature to our multi-label setting. Furthermore, we derive a method to choose optimal class-specific thresholds that maximize the macro-F1 score in linear time complexity. Through an extensive set of experiments, we show that our method obtains state-of-the-art performance in 7 of 9 metrics in 3 different languages using a single model, compared to the common baselines and the best-performing methods in the SemEval competition. We publicly share the code for our model, which can perform sentiment analysis in 100 languages, to facilitate further research.
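As a rough illustration of two ideas named in this abstract, the sketch below shows a standard binary focal loss applied independently to each label, and a per-class threshold search that maximizes F1. It is not the authors' implementation: the function names are hypothetical, the dynamic weighting scheme is not reproduced, and the threshold sweep sorts the scores (O(n log n)) rather than using the paper's linear-time derivation. Because macro-F1 is the unweighted mean of per-class F1 scores, picking each class's threshold independently is enough to maximize it.

```python
import math


def focal_loss(probs, labels, gamma=2.0):
    """Mean binary focal loss over all (sample, label) pairs.

    probs[i][k] is the predicted probability that sample i has label k;
    labels[i][k] is 0 or 1. With gamma=0 this is plain binary cross-entropy.
    """
    eps = 1e-12
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            p_t = p if y == 1 else 1.0 - p  # probability given to the true outcome
            # Easy pairs (p_t near 1) are down-weighted by (1 - p_t) ** gamma.
            total += -((1.0 - p_t) ** gamma) * math.log(max(p_t, eps))
            count += 1
    return total / count


def best_threshold(scores, truths):
    """Return (threshold, F1) maximizing F1 for one class.

    A sample is predicted positive when its score >= threshold.
    """
    order = sorted(zip(scores, truths), reverse=True)
    total_pos = sum(truths)
    best_t, best_f1 = 1.0, 0.0
    tp = 0
    for rank, (score, truth) in enumerate(order, start=1):
        tp += truth
        # Tied scores cannot be separated by a threshold; evaluate after the tie.
        if rank < len(order) and order[rank][0] == score:
            continue
        # F1 = 2TP / (predicted positives + actual positives)
        f1 = 2.0 * tp / (rank + total_pos) if total_pos else 0.0
        if f1 > best_f1:
            best_t, best_f1 = score, f1
    return best_t, best_f1
```

In a multi-label model, `best_threshold` would be called once per emotion class on held-out scores, and the resulting thresholds applied at prediction time.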
Learning from students' perception on professors through opinion mining
Vargas-Calderón, Vladimir, Flórez, Juan S., Ardila, Leonel F., Parra-A., Nicolas, Camargo, Jorge E., Vargas, Nelson
Students' perception of classes, measured through their opinions on teaching surveys, makes it possible to identify deficiencies and problems, both in the environment and in the learning methodologies. The purpose of this paper is to study those opinions through sentiment analysis, using natural language processing (NLP) and machine learning (ML) techniques, in order to identify topics that are relevant for students and to predict the associated sentiment via polarity analysis. To this end, two algorithms are implemented, trained and tested to predict the associated sentiment as well as the relevant topics of such opinions. The combination of both approaches then becomes useful to identify specific properties of the students' opinions associated with each sentiment label (positive, negative or neutral opinions) and topic. Furthermore, we explore the possibility of carrying out students' perception surveys without closed questions, relying on the information that students can provide through open questions where they express their opinions about their classes.
Portfolio
The Linguistic Universe of Hungarian Poet Endre Ady; Gender Stereotypes of Hungarian Online Media; Named Entities in Hungarian Online Media; Growth Hacking with NLP and Sentiment Analysis - our 5-week course at Manning Publications; Metaphor and National Identity: Alternative conceptualization of the Treaty of Trianon - 2019, John Benjamins Publishing Company; We helped the future…
Twitter Data Case Sparks Dispute, Delay Among EU Privacy Regulators
European Union privacy regulators are clashing over how much--if anything--to fine Twitter Inc. for its handling of a data breach disclosed last year, delaying progress of the most advanced cross-border privacy case involving a U.S. tech company under the EU's strict new privacy law. The dispute, disclosed in a statement Thursday from Ireland's Data Protection Commission, is one of the first major tests for enforcement of the EU's privacy law, known as GDPR, which took effect in 2018. It raises the specter of disagreements and...
SentiQ: A Probabilistic Logic Approach to Enhance Sentiment Analysis Tool Quality
Kouadri, Wissam Maamar, Benbernou, Salima, Ouziri, Mourad, Palpanas, Themis, Amor, Iheb Ben
Opinions expressed on websites and social media are an essential input to the decision-making process of many organizations. Existing sentiment analysis tools aim to extract the polarity (i.e., positive, negative, neutral) from this opinionated content. Despite advances in research in the field, sentiment analysis tools give inconsistent polarities, which is harmful to business decisions. In this paper, we propose SentiQ, an unsupervised Markov Logic Network-based approach that injects the semantic dimension into the tools through rules. It makes it possible to detect and resolve inconsistencies, which in turn improves the overall accuracy of the tools. Preliminary experimental results demonstrate the usefulness of SentiQ.