AITopics | Information Extraction

Information Extraction

Deep Context- and Relation-Aware Learning for Aspect-based Sentiment Analysis

Oh, Shinhyeok, Lee, Dongyub, Whang, Taesun, Park, IlNam, Seo, Gaeun, Kim, EungGyun, Kim, Harksoo

arXiv.org Artificial Intelligence | Jun-7-2021

Existing works for aspect-based sentiment analysis (ABSA) have adopted a unified approach, which allows the interactive relations among subtasks. However, we observe that these methods tend to predict polarities based on the literal meaning of aspect and opinion terms and mainly consider relations implicitly among subtasks at the word level. In addition, identifying multiple aspect-opinion pairs with their polarities is much more challenging. Therefore, a comprehensive understanding of contextual information w.r.t. the aspect and opinion is further required in ABSA. In this paper, we propose Deep Contextualized Relation-Aware Network (DCRAN), which allows interactive relations among subtasks with deep contextual information based on two modules (i.e., Aspect and Opinion Propagation and Explicit Self-Supervised Strategies). In particular, we design novel self-supervised strategies for ABSA, which have strengths in dealing with multiple aspects. Experimental results show that DCRAN significantly outperforms previous state-of-the-art methods by large margins on three widely used benchmarks.

proceedings, relation, sentiment analysis, (16 more...)

arXiv:2106.03806

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.73)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Shareholder activists demand reforms from Amazon, Google, and Facebook • Data Protection News

#artificialintelligence | Jun-4-2021, 17:15:11 GMT

Investors and activists are presenting Alphabet, Amazon, Facebook, and Twitter with a list of shareholder resolutions this week that call for investigations into alleged racial bias in Amazon's facial recognition software and other surveillance products, stronger safeguards against the spread of disinformation on Facebook, and the establishment of stronger worker and human rights protections at all four companies. Shareholder advocates and activist allies held a press conference on Monday detailing several resolutions being presented this week and next to the boards of Alphabet, Amazon, Facebook, and Twitter. While the advocates didn't expect the resolutions to pass -- some of the company boards have reportedly already advised shareholders to vote against them -- an Alphabet union representative said her union might organize walkouts if Alphabet doesn't adopt the worker protection and civil and human rights reforms being presented to its board next month.

amazon, data protection news, shareholder activist demand reform, (5 more...)

Industry: Information Technology > Security & Privacy (0.85)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.30)

Facebook to be investigated over whether it is unfairly using personal data to push dating and shopping tools

The Independent - Tech | Jun-4-2021, 12:08:18 GMT

Regulators have opened an investigation into Facebook amid concerns it is using its vast troves of personal data to push its own shopping and dating tools. The probe by the UK's competition regulator will examine whether it is abusing its dominant position in online advertising. It comes amid growing antitrust concerns about the way many technology companies – not just Facebook but others such as Apple – have been able to use their vast size and hold on the market to unfairly benefit themselves. The Competition and Markets Authority (CMA) will look into how the social network gathers and uses certain data and whether it may provide an unfair advantage over rivals in the online classified ads and online dating space. As well as Facebook's advertising services, Facebook Login, a feature that allows people to sign into other websites and apps, will also form part of the probe.

facebook, investigation, personal data, (10 more...)

Industry:

Information Technology > Services (1.00)
Education > Educational Setting > Online (0.58)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.40)

T-BERT -- Model for Sentiment Analysis of Micro-blogs Integrating Topic Model and BERT

Palani, Sarojadevi, Rajagopal, Prabhu, Pancholi, Sidharth

arXiv.org Artificial Intelligence | Jun-2-2021

Sentiment analysis (SA) has become an extensive research area in recent years impacting diverse fields including ecommerce, consumer business, and politics, driven by increasing adoption and usage of social media platforms. It is challenging to extract topics and sentiments from unsupervised short texts emerging in such contexts, as they may contain figurative words, strident data, and co-existence of many possible meanings for a single word or phrase, all contributing to obtaining incorrect topics. Most prior research is based on a specific theme/rhetoric/focused-content on a clean dataset. In the work reported here, the effectiveness of BERT (Bidirectional Encoder Representations from Transformers) in sentiment classification tasks from a raw live dataset taken from a popular microblogging platform is demonstrated. A novel T-BERT framework is proposed to show the enhanced performance obtainable by combining latent topics with contextual BERT embeddings. Numerical experiments were conducted on an ensemble with about 42000 datasets using the NimbleBox.ai platform with a hardware configuration consisting of Nvidia Tesla K80 (CUDA), 4-core CPU, and 15 GB RAM running on an isolated Google Cloud Platform instance. The empirical results show that adding topics to BERT improves model performance, with the proposed approach reaching an accuracy of 90.81% on sentiment classification.

bert, information, sentiment analysis, (12 more...)

arXiv:2106.01097

Country:

North America > United States (0.28)
Asia > Singapore (0.04)
Asia > Indonesia > Java > East Java > Surabaya (0.04)
Asia > India > NCT > Delhi (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Span Extraction Approach for Information Extraction on Visually-Rich Documents

Nguyen, Tuan-Anh D., Vu, Hieu M., Son, Nguyen Hong, Nguyen, Minh-Tien

arXiv.org Artificial Intelligence | Jun-2-2021

Information extraction (IE) from visually-rich documents (VRDs) has achieved SOTA performance recently thanks to the adaptation of Transformer-based language models, which demonstrates the great potential of pre-training methods. In this paper, we present a new approach to improve the capability of language model pre-training on VRDs. Firstly, we introduce a new IE model that is query-based and employs the span extraction formulation instead of the commonly used sequence labelling approach. Secondly, to further extend the span extraction formulation, we propose a new training task which focuses on modelling the relationships between semantic entities within a document. This task enables the spans to be extracted recursively and can be used both as a pre-training objective and as a downstream IE task. Evaluation on various datasets of popular business documents (invoices, receipts) shows that our proposed method can improve the performance of existing models significantly, while providing a mechanism to accumulate model knowledge from multiple downstream IE tasks.
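To illustrate the span extraction formulation mentioned in the abstract, here is a minimal sketch of how a span can be decoded from per-token start and end scores. It is not the authors' model: the function name `best_span`, the `max_len` cap, and the toy scores are assumptions made for illustration only.

```python
from typing import List, Tuple


def best_span(start_scores: List[float],
              end_scores: List[float],
              max_len: int = 8) -> Tuple[int, int]:
    """Return inclusive (start, end) token indices with the highest
    start_score + end_score, subject to start <= end and a length cap."""
    best = (0, 0)
    best_score = float("-inf")
    for i, s in enumerate(start_scores):
        # Only consider ends at or after the start, within max_len tokens.
        for j in range(i, min(i + max_len, len(end_scores))):
            score = s + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


# Toy receipt-like token sequence with hypothetical scores for the
# query "invoice number".
tokens = ["Invoice", "No", ":", "INV", "-", "0042", "Total", ":", "120.00"]
start = [0.1, 0.0, 0.0, 2.5, 0.1, 0.2, 0.3, 0.0, 0.4]
end = [0.0, 0.1, 0.0, 0.3, 0.2, 2.8, 0.1, 0.0, 0.5]

i, j = best_span(start, end)
answer = tokens[i:j + 1]
print(answer)
```

Unlike sequence labelling, which assigns a tag to every token, this formulation only needs two positions per query, which is what lets one query's answer feed into a further query.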

dataset, extraction, information extraction, (12 more...)

arXiv:2106.00978

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)

Genre: Research Report (0.51)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Text Mining (0.64)

How to Create and Deploy a Simple Sentiment Analysis App via API - KDnuggets

#artificialintelligence | Jun-1-2021, 17:06:45 GMT

Let's say you've built an NLP model for some specific task, whether it be text classification, question answering, translation, or what have you. You've tested it out locally and it performs well. You've had others test it out as well, and it continues to perform well. Now you want to roll it out to a larger audience, be that audience a team of developers you work with, a specific group of end users, or even the general public. You have decided that you want to do so using a REST API, as you find this to be your best option.

fastapi, pipeline, rest api, (8 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.49)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.49)

Validating GAN-BioBERT: A Methodology For Assessing Reporting Trends In Clinical Trials

Myszewski, Joshua J, Klossowski, Emily, Meyer, Patrick, Bevil, Kristin, Klesius, Lisa, Schroeder, Kristopher M

arXiv.org Machine Learning | Jun-1-2021

In the past decade, there has been much discussion about the issue of biased reporting in clinical research. Despite this attention, there have been limited tools developed for the systematic assessment of qualitative statements made in clinical research, with most studies assessing qualitative statements relying on the use of manual expert raters, which limits their size. Also, previous attempts to develop larger scale tools, such as those using natural language processing, were limited by both their accuracy and the number of categories used for the classification of their findings. With these limitations in mind, this study's goal was to develop a classification algorithm that was both suitably accurate and finely grained to be applied on a large scale for assessing the qualitative sentiment expressed in clinical trial abstracts. Additionally, this study seeks to compare the performance of the proposed algorithm, GAN-BioBERT, to previous studies as well as to expert manual rating of clinical trial abstracts. This study develops a three-class sentiment classification algorithm for clinical trial abstracts using a semi-supervised natural language processing model based on the Bidirectional Encoder Representations from Transformers (BERT) model, trained on a series of clinical trial abstracts annotated by a group of experts in academic medicine. Results: The algorithm was found to have a classification accuracy of 91.3%, with a macro F1-Score of 0.92, which is a significant improvement in accuracy when compared to previous methods and expert ratings, while also making the sentiment classification finer grained than previous studies. The proposed algorithm, GAN-BioBERT, is a suitable classification model for the large-scale assessment of qualitative statements in clinical trial literature, providing an accurate, reproducible tool for the large-scale study of clinical publication trends.
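For readers unfamiliar with the two metrics reported above, the following sketch shows how accuracy and macro F1 are computed for a three-class task. The labels and predictions are made-up toy values, not data from the study, and the helper names are hypothetical.

```python
from typing import List


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: List[str], y_pred: List[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1_scores.append(2 * precision * recall / (precision + recall))
        else:
            f1_scores.append(0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy three-class example (positive / neutral / negative).
y_true = ["pos", "pos", "neu", "neg", "neg", "neu"]
y_pred = ["pos", "neu", "neu", "neg", "neg", "neg"]

print(round(accuracy(y_true, y_pred), 4))
print(round(macro_f1(y_true, y_pred), 4))
```

Because macro F1 averages over classes rather than examples, it weights a rare class as heavily as a common one, which is why it is often reported alongside accuracy.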

algorithm, assessment, gan-biobert, (13 more...)

arXiv:2106.00665

Country: North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.72)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Correcting public opinion trends through Bayesian data assimilation

Hendrickx, Robin, Arcucci, Rossella, Lopez, Julio Amador Díaz, Guo, Yi-Ke, Kennedy, Mark

arXiv.org Artificial Intelligence | May-29-2021

Measuring public opinion is a key focus during democratic elections, enabling candidates to gauge their popularity and alter their campaign strategies accordingly. Traditional survey polling remains the most popular estimation technique, despite its cost and time intensity, measurement errors, lack of real-time capabilities and lagged representation of public opinion. In recent years, Twitter opinion mining has attempted to combat these issues. Despite achieving promising results, it experiences its own set of shortcomings such as an unrepresentative sample population and a lack of long term stability. This paper aims to merge data from both these techniques using Bayesian data assimilation to arrive at a more accurate estimate of true public opinion for the Brexit referendum. This paper demonstrates the effectiveness of the proposed approach using Twitter opinion data and survey data from trusted pollsters. Firstly, the possible existence of a time gap of 16 days between the two data sets is identified. This gap is subsequently incorporated into a proposed assimilation architecture. This method was found to adequately incorporate information from both sources and measure a strong upward trend in Leave support leading up to the Brexit referendum. The proposed technique provides useful estimates of true opinion, which is essential to future opinion measurement and forecasting research.
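As a rough illustration of the idea of merging two noisy opinion estimates, the sketch below applies a scalar Gaussian (Kalman-style) update in which a Twitter-derived estimate acts as the prior and a lagged poll acts as the observation. This is not the paper's assimilation architecture: the variances, the direction of the lag, the function names, and the sample numbers are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple


def gaussian_update(prior_mean: float, prior_var: float,
                    obs: float, obs_var: float) -> Tuple[float, float]:
    """Combine a Gaussian prior with a Gaussian observation of the same quantity."""
    gain = prior_var / (prior_var + obs_var)  # weight given to the observation
    mean = prior_mean + gain * (obs - prior_mean)
    var = (1.0 - gain) * prior_var
    return mean, var


def assimilate(twitter: List[float],
               polls: List[Optional[float]],
               lag: int,
               twitter_var: float = 0.04,
               poll_var: float = 0.01) -> List[Tuple[float, float]]:
    """For each day t, correct twitter[t] with the poll at day t + lag, if any.

    Assumption: polls trail Twitter by `lag` days; None marks a day without a poll.
    """
    estimates = []
    for t, prior in enumerate(twitter):
        k = t + lag
        obs = polls[k] if 0 <= k < len(polls) else None
        if obs is None:
            estimates.append((prior, twitter_var))  # nothing to assimilate
        else:
            estimates.append(gaussian_update(prior, twitter_var, obs, poll_var))
    return estimates


# Hypothetical daily "Leave" share estimates.
twitter_series = [0.50, 0.52]
poll_series = [None, 0.60, None]
result = assimilate(twitter_series, poll_series, lag=1)
print(result)
```

The lower-variance source pulls the combined estimate towards itself, and the posterior variance is always smaller than the prior variance whenever an observation is available.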

evolutionary algorithm, machine learning, natural language, (19 more...)

arXiv:2105.14276

Country:

North America > United States (0.68)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Africa > East Africa (0.04)

Genre:

Research Report (0.64)
Questionnaire & Opinion Survey (0.46)

Industry:

Government > Voting & Elections (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (0.56)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.46)

Sentiment analysis in tweets: an assessment study from classical to modern text representation models

Barreto, Sérgio, Moura, Ricardo, Carvalho, Jonnathan, Paes, Aline, Plastino, Alexandre

arXiv.org Artificial Intelligence | May-29-2021

With the growth of social media, such as Twitter, plenty of user-generated data emerges daily. The short texts published on Twitter -- the tweets -- have earned significant attention as a rich source of information to guide many decision-making processes. However, their inherent characteristics, such as the informal and noisy linguistic style, remain challenging to many natural language processing (NLP) tasks, including sentiment analysis. Sentiment classification is tackled mainly by machine learning-based classifiers. The literature has adopted word representations of distinct natures to transform tweets to vector-based inputs to feed sentiment classifiers. The representations range from simple count-based methods, such as bag-of-words, to more sophisticated ones, such as BERTweet, built upon the trendy BERT architecture. Nevertheless, most studies mainly focus on evaluating those models using only a small number of datasets. Despite the progress made in recent years in language modelling, there is still a gap regarding a robust evaluation of induced embeddings applied to sentiment analysis on tweets. Furthermore, while fine-tuning the model from downstream tasks is prominent nowadays, less attention has been given to adjustments based on the specific linguistic style of the data. In this context, this study carries out an assessment of existing language models in distinguishing the sentiment expressed in tweets by using a rich collection of 22 datasets from distinct domains and five classification algorithms. The evaluation includes static and contextualized representations. Contexts are assembled from Transformer-based autoencoder models that are also fine-tuned based on the masked language model task, using a plethora of strategies.
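The simplest representation named in the abstract, bag-of-words, can be sketched in a few lines. The tokenizer regex, the function names, and the example tweets below are illustrative assumptions, not the preprocessing used in the study.

```python
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and keep word-like tokens, including #hashtags and @mentions."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def bag_of_words(tweets: List[str]) -> Tuple[List[str], List[List[int]]]:
    """Return a sorted vocabulary and one count vector per tweet."""
    tokenized = [tokenize(t) for t in tweets]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = count
        vectors.append(vec)
    return vocab, vectors


tweets = ["great phone great battery", "battery died again"]
vocab, vectors = bag_of_words(tweets)
print(vocab)
print(vectors)
```

Each tweet becomes a fixed-length vector of word counts that a classifier can consume. Word order and context are discarded, which is the limitation that contextualized models such as BERTweet are meant to address.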

dataset, representation, tweet, (17 more...)

arXiv:2105.14373

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Maryland > Baltimore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(11 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine (0.67)
Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Introduction to NLP with Disaster Tweets

#artificialintelligence | May-28-2021, 10:25:09 GMT

Natural Language Processing, also known as NLP, is a subfield of computer science, specifically artificial intelligence, that focuses on understanding written and spoken text. It covers various tasks, including speech recognition, sentiment analysis and language generation, and it has been applied in several use cases such as machine translation, spam detection, virtual assistants and chatbots. The project covered in this article is a sentiment analysis project called Natural Language Processing with Disaster Tweets. Sentiment analysis is the process of extracting subjective qualities, such as emotion or attitude, from text. The objective of the project is to identify whether a specific tweet is about a real disaster or not. The project is ideal for beginners in NLP.

accuracy, disaster tweet, entry empty, (12 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)
