Information Extraction
Making sense of electrical vehicle discussions using sentiment analysis on closely related news and user comments
Electric Vehicles (EVs) are a rapidly growing component of the automotive industry and are projected to make up over 30 percent of the overall United States light-duty vehicle market by 2030 (Wolinetz and Axsen, 2017). This topic differs markedly from traditional transportation research on road conditions (Huang et al., 2019), aviation (Bauranov et al., 2021) and manned driving (Chai et al., 2021). Furthermore, the US and other countries have bet big on Battery Electric Vehicles (BEVs), allotting funding for charging infrastructure, subsidies and tax credits, and setting deadlines to phase out combustion engine vehicles. Correspondingly, the stock prices of EV companies like Tesla have recently far exceeded those of traditional auto manufacturers, helping to illustrate the bullish outlook many consumers and investors have toward EVs in general. Nevertheless, concerns remain among both consumers and experts about various aspects of electric cars, and despite the excitement surrounding them, EV adoption rates hovered around 1.8% in 2020 (energy.gov,
Sentiment Analysis with Deep Learning Models: A Comparative Study on a Decade of Sinhala Language Facebook Data
Weeraprameshwara, Gihan, Jayawickrama, Vihanga, de Silva, Nisansa, Wijeratne, Yudhanjaya
The relationship between Facebook posts and the corresponding reaction feature is an interesting subject to explore and understand. To this end, we test state-of-the-art Sinhala sentiment analysis models against a data set containing a decade's worth of Sinhala posts with millions of reactions. To establish benchmarks and identify the best model for Sinhala sentiment analysis, we also test, on the same data set configuration, other deep learning models designed for sentiment analysis. In this study we report that the 3-layer Bidirectional LSTM model achieves an F1 score of 84.58% for Sinhala sentiment analysis, surpassing the current state-of-the-art model, Capsule B, which only manages an F1 score of 82.04%. Further, since all the deep learning models show F1 scores above 75%, we conclude that it is safe to claim that Facebook reactions are suitable to predict the sentiment of a text.
CLIP-Event: Connecting Text and Images with Event Structures
Li, Manling, Xu, Ruochen, Wang, Shuohang, Zhou, Luowei, Lin, Xudong, Zhu, Chenguang, Zeng, Michael, Ji, Heng, Chang, Shih-Fu
Vision-language (V+L) pretraining models have achieved great success in supporting multimedia applications by understanding the alignments between images and text. While existing vision-language pretraining models primarily focus on understanding objects in images or entities in text, they often ignore the alignment at the level of events and their argument structures. In this work, we propose a contrastive learning framework to enforce vision-language pretraining models to comprehend events and associated argument (participant) roles. To achieve this, we take advantage of text information extraction technologies to obtain event structural knowledge, and utilize multiple prompt functions to contrast difficult negative descriptions by manipulating event structures. We also design an event graph alignment loss based on optimal transport to capture event argument structures. In addition, we collect a large event-rich dataset (106,875 images) for pretraining, which provides a more challenging image retrieval benchmark to assess the understanding of complicated lengthy sentences. Experiments show that our zero-shot CLIP-Event outperforms the state-of-the-art supervised model in argument extraction on Multimedia Event Extraction, achieving more than 5% absolute F-score gain in event extraction, as well as significant improvements on a variety of downstream tasks under zero-shot settings.
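The idea of contrasting a correct event description against hard negatives built by manipulating the event structure can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the dictionary event format, the `describe` template, the role-swapping helper, the toy two-dimensional embeddings, and the InfoNCE-style loss are hypothetical stand-ins, not the prompt functions or loss used by CLIP-Event. The optimal-transport graph alignment loss is not covered here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def swap_roles(event, role_a, role_b):
    """Build a hard negative by exchanging two argument roles of an event."""
    negative = dict(event)  # copy so the original event is left unchanged
    negative[role_a], negative[role_b] = event[role_b], event[role_a]
    return negative

def describe(event):
    """Hypothetical prompt function: render an event structure as text."""
    return "{agent} {trigger} {patient}".format(**event)

def contrastive_loss(image_vec, positive_vec, negative_vecs, temperature=0.07):
    """InfoNCE-style loss: negative log-softmax of the positive pair's
    similarity among the positive and all negative descriptions."""
    logits = [cosine(image_vec, positive_vec) / temperature]
    logits += [cosine(image_vec, n) / temperature for n in negative_vecs]
    peak = max(logits)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[0]

# Assumed toy event; a real system would obtain it from an IE pipeline.
event = {"trigger": "arrested", "agent": "police", "patient": "protesters"}
hard_negative = swap_roles(event, "agent", "patient")
print(describe(event))          # police arrested protesters
print(describe(hard_negative))  # protesters arrested police

# Assumed toy embeddings standing in for image and text encoder outputs.
image_vec = [1.0, 0.0]
positive_vec = [0.9, 0.1]
negative_vec = [0.1, 0.9]
print(contrastive_loss(image_vec, positive_vec, [negative_vec]))
```

The loss is small when the image embedding is closer to the correct description than to the role-swapped one, and large in the opposite case, which is the signal that pushes a model to attend to who did what to whom.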
Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis
Zhong, Qihuang, Ding, Liang, Liu, Juhua, Du, Bo, Jin, Hua, Tao, Dacheng
Aspect-based sentiment analysis (ABSA) is a fine-grained task of sentiment analysis. To better comprehend long complicated sentences and obtain accurate aspect-specific information, linguistic and commonsense knowledge are generally required in this task. However, most methods employ complicated and inefficient approaches to incorporate external knowledge, e.g., directly searching the graph nodes. Additionally, the complementarity between external knowledge and linguistic information has not been thoroughly studied. To this end, we propose a knowledge graph augmented network (KGAN), which aims to effectively incorporate external knowledge with explicitly syntactic and contextual information. In particular, KGAN captures the sentiment feature representations from multiple different perspectives, i.e., context-, syntax- and knowledge-based. First, KGAN learns the contextual and syntactic representations in parallel to fully extract the semantic features. Then, KGAN integrates the knowledge graphs into the embedding space, based on which the aspect-specific knowledge representations are further obtained via an attention mechanism. Last, we propose a hierarchical fusion module to complement these multiview representations in a local-to-global manner. Extensive experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN. Notably, with the help of the pretrained model of RoBERTa, KGAN achieves a new record of state-of-the-art performance.
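The step of obtaining an aspect-specific representation "via an attention mechanism" can be pictured with a generic attention-pooling sketch. This is plain scaled dot-product attention over toy vectors, offered only as an illustration of the general technique; the function names, the scaling choice, and the two-dimensional embeddings are assumptions and do not reproduce KGAN's actual architecture or its hierarchical fusion module.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(query, candidates):
    """Weight each candidate vector by its scaled dot product with the query
    and return the weighted sum together with the attention weights."""
    scale = math.sqrt(len(query))
    scores = [sum(q * c for q, c in zip(query, cand)) / scale
              for cand in candidates]
    weights = softmax(scores)
    pooled = [sum(w * cand[d] for w, cand in zip(weights, candidates))
              for d in range(len(query))]
    return pooled, weights

# Assumed toy vectors: one aspect embedding and three knowledge embeddings.
aspect = [1.0, 0.0]
knowledge = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
pooled, weights = attention_pool(aspect, knowledge)
print(weights)  # the first knowledge vector receives the largest weight
print(pooled)
```

Knowledge vectors that point in the same direction as the aspect embedding dominate the pooled representation, which is the sense in which the result is aspect-specific.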
Monitoring Energy Trends through Automatic Information Extraction
Energy research is of crucial public importance, but the use of computer science technologies like automatic text processing and data management for the energy domain is still rare. Employing these technologies in the energy domain will be a significant contribution to the interdisciplinary topic of "energy informatics", just like the related progress within the interdisciplinary area of "bioinformatics". In this paper, we present the architecture of a Web-based semantic system called EneMonIE (Energy Monitoring through Information Extraction) for monitoring up-to-date energy trends through the use of automatic, continuous, and guided information extraction from diverse types of media available on the Web. The types of media handled by the system will include online news articles, social media texts, online news videos, and open-access scholarly papers and technical reports, as well as various numeric energy data made publicly available by energy organizations. The system will utilize and contribute to energy-related ontologies, and its ultimate form will comprise components for (i) text categorization, (ii) named entity recognition, (iii) temporal expression extraction, (iv) event extraction, (v) social network construction, (vi) sentiment analysis, (vii) information fusion and summarization, (viii) media interlinking, and (ix) Web-based information retrieval and visualization. With its diverse data sources, automatic text processing capabilities, and presentation facilities open for public use, EneMonIE will be an important source of distilled and concise information for decision-makers including energy generation, transmission, and distribution system operators, energy research centres, and related investors and entrepreneurs, as well as for academicians, students, and other individuals interested in the pace of energy events and technologies.
Auto-ABSA: Automatic Detection of Aspects in Aspect-Based Sentiment Analysis
Since the Transformer was proposed, many pre-trained language models have been developed, and performance on the sentiment analysis (SA) task has improved. In this paper, we propose a method that uses an auxiliary sentence describing the aspects a sentence contains to help sentiment prediction. The first step is aspect detection, which uses a multi-aspect detection model to predict all aspects present in the sentence; the predicted aspects are then combined with the original sentence as the input to the SA model. The second step is out-of-domain aspect-based sentiment analysis (ABSA): we train the sentiment classification model on one kind of dataset and validate it on another. Finally, we create two baselines, which use no aspects and all aspects as the sentiment classification model's input, respectively. Comparing the performance of the two baselines with that of our method, we find that our method is effective.
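A minimal sketch of how predicted aspects might be combined with the original sentence, and how the two baselines differ, is given below. The threshold, the `[SEP]` separator, the "aspects:" wording of the auxiliary sentence, and the example scores are all assumptions for illustration; the abstract does not specify the exact input format.

```python
def select_aspects(scores, threshold=0.5):
    """Keep every aspect whose predicted probability reaches the threshold
    (multi-label detection: any number of aspects may be selected)."""
    return [aspect for aspect, prob in scores.items() if prob >= threshold]

def build_input(sentence, aspects, sep=" [SEP] "):
    """Append an auxiliary sentence listing the aspects to the original text.
    With no aspects, the sentence is returned unchanged."""
    if not aspects:
        return sentence
    auxiliary = "aspects: " + ", ".join(aspects)
    return sentence + sep + auxiliary

sentence = "The pasta was great but the waiter ignored us."
# Assumed output of a hypothetical multi-aspect detection model.
scores = {"food": 0.91, "service": 0.62, "price": 0.08}

predicted = select_aspects(scores)
method_input = build_input(sentence, predicted)          # predicted aspects
no_aspect_input = build_input(sentence, [])              # baseline 1
all_aspect_input = build_input(sentence, list(scores))   # baseline 2

print(method_input)
print(no_aspect_input)
print(all_aspect_input)
```

The three strings would each be fed to the same sentiment classifier, so any difference in accuracy can be attributed to the auxiliary sentence rather than to the model.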
Twitter Sentiment: Bears at Seahawks, Week 16, 2021
We've been doing a lot of NLP Sentiment Analysis on NFL games recently. So far, the team with the higher pregame Twitter sentiment has won 4 out of 10 analyses with 2 Week 16 games finished at the time of writing: Lions at Falcons, and Chargers at Texans. For week 16, we're going to analyze all the games and…
Text Analytics with Python: A Practitioner's Guide to Natural Language Processing
Sarkar, Dipanjan (ISBN 9781484243534)
Leverage Natural Language Processing (NLP) in Python and learn how to set up your own robust environment for performing text analytics. This second edition has gone through a major revamp and introduces several significant changes and new topics based on the recent trends in NLP. You'll see how to use the latest state-of-the-art frameworks in NLP, coupled with machine learning and deep learning models for supervised sentiment analysis powered by Python to solve actual case studies. Improved techniques and new methods around parsing and processing text are discussed as well. There is also a chapter dedicated to semantic analysis where you'll see how to build your own named entity recognition (NER) system from scratch.
Analyzing Scientific Publications using Domain-Specific Word Embedding and Topic Modelling
Singhal, Trisha, Liu, Junhua, Blessing, Lucienne T. M., Lim, Kwan Hui
The scientific world is changing at a rapid pace, with new technology being developed and new trends being set at an increasing frequency. This paper presents a framework for conducting scientific analyses of academic publications, which is crucial to monitor research trends and identify potential innovations. This framework adopts and combines various techniques of Natural Language Processing, such as word embedding and topic modelling. Word embedding is used to capture semantic meanings of domain-specific words. We propose two novel scientific publication embeddings, i.e., PUB-G and PUB-W, which are capable of learning semantic meanings of general as well as domain-specific words in various research fields. Thereafter, topic modelling is used to identify clusters of research topics within these larger research fields. We curated a publication dataset consisting of two conferences and two journals from 1995 to 2020 from two research domains. Experimental results show that our PUB-G and PUB-W embeddings are superior in comparison to other baseline embeddings by a margin of ~0.18-1.03 based on topic coherence.
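Since the comparison rests on topic coherence, a small sketch of one common coherence measure may help. The abstract does not say which measure was used, so the UMass-style score below is an assumption chosen for its simplicity, and the toy corpus is invented for illustration. The sketch assumes every topic word occurs in at least one document.

```python
import math

def umass_coherence(topic_words, documents):
    """UMass-style coherence: for every ordered word pair in the topic, add
    log((co-document frequency + 1) / document frequency of the earlier word).
    Higher (less negative) scores mean the topic's words co-occur more often.
    Assumes each topic word appears in at least one document."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    score = 0.0
    for i in range(1, len(topic_words)):
        for j in range(i):
            later, earlier = topic_words[i], topic_words[j]
            score += math.log((doc_freq(later, earlier) + 1) / doc_freq(earlier))
    return score

# Assumed toy corpus of tokenized documents.
documents = [
    ["battery", "charging", "vehicle"],
    ["battery", "vehicle"],
    ["topic", "model", "embedding"],
    ["embedding", "model"],
]

coherent = umass_coherence(["battery", "vehicle"], documents)
incoherent = umass_coherence(["battery", "model"], documents)
print(coherent, incoherent)
```

A topic whose top words appear together in the same documents scores higher than one mixing unrelated words, which is how coherence serves as a proxy for the quality of the topics an embedding produces.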