AITopics | Information Extraction

Information Extraction

Sparse Fuzzy Attention for Structured Sentiment Analysis

arXiv.org Artificial Intelligence, Sep-24-2021

Attention scorers have achieved success in parsing tasks like semantic and syntactic dependency parsing. However, in tasks recast as parsing, such as structured sentiment analysis, the "dependency edges" are very sparse, which hinders parser performance. We therefore propose a sparse and fuzzy attention scorer with pooling layers, which improves parser performance and sets a new state of the art on structured sentiment analysis. We further explore parsing-based modeling of structured sentiment analysis with second-order parsing and introduce a novel sparse second-order edge building procedure that leads to significant improvement in parsing performance.

computational linguistic, dependency, sentiment analysis, (15 more...)

2109.06719

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Shanghai > Shanghai (0.05)
Europe > Italy > Tuscany > Florence (0.05)
(10 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Zero-Shot Information Extraction as a Unified Text-to-Triple Translation

Wang, Chenguang, Liu, Xiao, Chen, Zui, Hong, Haoyun, Tang, Jie, Song, Dawn

arXiv.org Artificial Intelligence, Sep-23-2021

We cast a suite of information extraction tasks into a text-to-triple translation framework. Instead of solving each task relying on task-specific datasets and models, we formalize the task as a translation between task-specific input text and output triples. By taking the task-specific input, we enable a task-agnostic translation by leveraging the latent knowledge that a pre-trained language model has about the task. We further demonstrate that a simple pre-training task of predicting which relational information corresponds to which input text is an effective way to produce task-specific outputs. This enables the zero-shot transfer of our framework to downstream tasks. We study the zero-shot performance of this framework on open information extraction (OIE2016, NYT, WEB, PENN), relation classification (FewRel and TACRED), and factual probe (Google-RE and T-REx). The model transfers non-trivially to most tasks and is often competitive with a fully supervised method without the need for any task-specific training. For instance, we significantly outperform the F1 score of the supervised open information extraction without needing to use its training set.

dataset, extraction, information extraction, (15 more...)

2109.11171

Country:

North America > United States > Rhode Island (0.04)
North America > United States > New York (0.04)
North America > United States > Indiana > Hamilton County > Fishers (0.04)
(5 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Media (0.67)
Government (0.67)
Leisure & Entertainment > Sports > Horse Racing (0.46)

Technology:

Information Technology > Data Science > Data Mining > Text Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

What is sentiment analysis? Using NLP and ML to extract meaning

#artificialintelligence, Sep-18-2021, 15:30:35 GMT

Sentiment analysis is an analytical technique that uses statistics, natural language processing, and machine learning to determine the emotional meaning of communications. Companies use sentiment analysis to evaluate customer messages, call center interactions, online reviews, social media posts, and other content. Sentiment analysis can track changes in attitudes towards companies, products, or services, or individual features of those products or services. One of the most prominent examples of sentiment analysis on the Web today is the Hedonometer, a project of the University of Vermont's Computational Story Lab.
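
To make the idea of scoring the emotional meaning of a message concrete, here is a minimal lexicon-based sketch. It is only one simple approach and is not taken from the article: the word list `TOY_LEXICON`, its scores, the function names, and the `threshold` value are all hypothetical, and production systems typically rely on much larger lexicons or trained models.

```python
import re

# Hypothetical toy lexicon (an assumption for illustration):
# word -> valence score, positive above 0 and negative below 0.
TOY_LEXICON = {
    "great": 2.0,
    "love": 3.0,
    "helpful": 1.5,
    "slow": -1.0,
    "broken": -2.0,
    "terrible": -3.0,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Return the mean valence of the lexicon words found in text, or 0.0 if none."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def label(text, threshold=0.5):
    """Map the mean valence to a coarse label using a hypothetical threshold."""
    score = sentiment_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for message in ("I love this, great support", "The app is slow and broken"):
        print(message, "->", label(message))
```

Averaging word scores ignores negation and context ("not great" still counts as positive here), which is one reason statistical and machine learning models are used alongside lexicons.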

sentiment analysis, sutherland, training data, (12 more...)

Country: North America > United States > Vermont (0.25)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.72)
Information Technology > Services (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

A Comprehensive Overview of Recommender System and Sentiment Analysis

AL-Ghuribi, Sumaia Mohammed, Noah, Shahrul Azman Mohd

arXiv.org Artificial Intelligence, Sep-17-2021

Recommender systems have proven crucial in many fields and are widely used across domains. Most conventional recommender systems rely on the numeric rating a user gives to reflect their opinion of a consumed item; however, such ratings are unavailable in many domains. As a result, user-generated reviews are incorporated into the recommendation process as a new source of information to compensate for the missing ratings. Reviews contain rich and plentiful information about the whole item or a specific feature, which can be extracted using sentiment analysis. This paper gives a comprehensive overview to help researchers who aim to work with recommender systems and sentiment analysis. It covers the background of the recommender system concept, including the phases, approaches, and performance metrics used in recommender systems. It then discusses the sentiment analysis concept and highlights its main points, including levels and approaches, with a focus on aspect-based sentiment analysis.

recommendation, recommender system, sentiment analysis, (13 more...)

2109.08794

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > Malaysia (0.04)
South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
(4 more...)

Genre: Overview (1.00)

Industry:

Consumer Products & Services (1.00)
Media > Film (0.68)
Leisure & Entertainment (0.67)
Information Technology > Services > e-Commerce Services (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(5 more...)

SentiPrompt: Sentiment Knowledge Enhanced Prompt-Tuning for Aspect-Based Sentiment Analysis

Li, Chengxi, Gao, Feiyu, Bu, Jiajun, Xu, Lu, Chen, Xiang, Gu, Yu, Shao, Zirui, Zheng, Qi, Zhang, Ningyu, Wang, Yongpan, Yu, Zhi

arXiv.org Artificial Intelligence, Sep-16-2021

Aspect-based sentiment analysis (ABSA) is an emerging fine-grained sentiment analysis task that aims to extract aspects, classify corresponding sentiment polarities and find opinions as the causes of sentiment. The latest research tends to solve the ABSA task in a unified way with end-to-end frameworks. Yet, these frameworks are fine-tuned on downstream tasks without any task-adaptive modification. Specifically, they do not use task-related knowledge well or explicitly model relations between aspect and opinion terms, which keeps them from better performance. In this paper, we propose SentiPrompt, which uses sentiment knowledge enhanced prompts to tune the language model in the unified framework. We inject sentiment knowledge regarding aspects, opinions, and polarities into the prompt and explicitly model term relations by constructing consistency and polarity judgment templates from the ground truth triplets. Experimental results demonstrate that our approach can outperform strong baselines on Triplet Extraction, Pair Extraction, and Aspect Term Extraction with Sentiment Classification by a notable margin.

computational linguistic, dataset, knowledge, (15 more...)

2109.08306

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Zhejiang Province > Ningbo (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(5 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

An Ontology-Based Information Extraction System for Residential Land Use Suitability Analysis

Al-Ageili, Munira, Mouhoub, Malek

arXiv.org Artificial Intelligence, Sep-15-2021

We propose an Ontology-Based Information Extraction (OBIE) system to automate the extraction of the criteria and values applied in Land Use Suitability Analysis (LUSA) from bylaw and regulation documents related to the geographic area of interest. The results obtained by our proposed LUSA OBIE system (land use suitability criteria and their values) are presented as an ontology populated with instances of the extracted criteria and property values. This latter output ontology is incorporated into a Multi-Criteria Decision Making (MCDM) model applied for constructing suitability maps for different kinds of land uses. The resulting maps may be the final desired product or can be incorporated into the cellular automata urban modeling and simulation for predicting future urban growth. A case study has been conducted where the output from LUSA OBIE is applied to help produce a suitability map for the City of Regina, Saskatchewan, to assist in the identification of suitable areas for residential development. A set of Saskatchewan bylaw and regulation documents were downloaded and input to the LUSA OBIE system. We accessed the extracted information using both the populated LUSA ontology and the set of annotated documents. In this regard, the LUSA OBIE system was effective in producing a final suitability map.

annotation, information, ontology, (14 more...)

2109.07672

Country:

North America > Canada > Saskatchewan > Regina (0.34)
North America > United States > Maine (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Law > Real Estate Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)

Dialog speech sentiment classification for imbalanced datasets

Nicolaou, Sergis, Mavrides, Lambros, Tryfou, Georgina, Tolias, Kyriakos, Panousis, Konstantinos, Chatzis, Sotirios, Theodoridis, Sergios

arXiv.org Artificial Intelligence, Sep-15-2021

Speech is the most common way humans express their feelings, and sentiment analysis is the use of tools such as natural language processing and computational algorithms to identify the polarity of these feelings. Even though this field has seen tremendous advancements in the last two decades, effectively detecting underrepresented sentiments in different kinds of datasets remains challenging. In this paper, we use single- and bi-modal analysis of short dialog utterances and gain insights into the main factors that aid sentiment detection, particularly in the underrepresented classes, in datasets with and without an inherent sentiment component. Furthermore, we propose an architecture which uses a learning rate scheduler and different monitoring criteria and provides state-of-the-art results for the SWITCHBOARD imbalanced sentiment dataset.

classification, dataset, sentiment analysis, (14 more...)

2109.07228

Country:

Oceania > Australia (0.04)
Europe > Middle East > Cyprus > Limassol > Limassol (0.04)
Europe > Denmark > North Jutland > Aalborg (0.04)
Asia > India (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.79)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.79)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.66)

What are the attackers doing now? Automating cyber threat intelligence extraction from text on pace with the changing threat landscape: A survey

Rahman, Md Rayhanur, Mahdavi-Hezaveh, Rezvan, Williams, Laurie

arXiv.org Artificial Intelligence, Sep-14-2021

Cybersecurity researchers have contributed to the automated extraction of cyber threat intelligence (CTI) from textual sources, such as threat reports and online articles, where cyberattack strategies, procedures, and tools are described. The goal of this article is to help cybersecurity researchers understand the current techniques used for CTI extraction from text through a survey of relevant studies in the literature. We systematically collect "CTI extraction from text"-related studies from the literature and categorize the CTI extraction purposes. We propose a CTI extraction pipeline abstracted from these studies. We identify the data sources, techniques, and CTI sharing formats utilized in the context of the proposed pipeline. Our work finds ten types of extraction purposes, such as extraction of indicators of compromise, TTPs (tactics, techniques, and procedures of attack), and cybersecurity keywords. We also identify seven types of textual sources for CTI extraction; textual data obtained from hacker forums, threat reports, social media posts, and online news articles have been used by almost 90% of the studies. Natural language processing, along with both supervised and unsupervised machine learning techniques such as named entity recognition, topic modelling, dependency parsing, supervised classification, and clustering, is used for CTI extraction. We observe technical challenges in these studies related to obtaining clean, labelled data, which could assure replication, validation, and further extension of the studies. Because the studies we find focus on extracting CTI information from text, we advocate building upon the current CTI extraction work to help cybersecurity practitioners with proactive decision making, such as threat prioritization and automated threat modelling that utilize knowledge from past cybersecurity incidents.
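
As a small illustration of one extraction purpose named above, indicators of compromise, the sketch below pulls a few indicator types out of free text with regular expressions. It is not the pipeline proposed in the survey: the pattern set `IOC_PATTERNS`, the function name `extract_iocs`, and the sample sentence are assumptions made for illustration, and the surveyed studies also use NLP and machine learning techniques that plain patterns cannot replace.

```python
import re

# Hypothetical, deliberately small pattern set (an assumption, not from the survey).
IOC_PATTERNS = {
    # Dotted-quad IPv4 address with each octet limited to 0-255.
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    # 32 hexadecimal characters, the length of an MD5 digest.
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    # CVE identifier such as CVE-2021-34527.
    "cve": re.compile(r"\bCVE-\d{4}-\d{4,}\b"),
}


def extract_iocs(text):
    """Return a dict mapping each indicator type to its sorted unique matches."""
    return {
        name: sorted(set(pattern.findall(text)))
        for name, pattern in IOC_PATTERNS.items()
    }


if __name__ == "__main__":
    # Invented sample sentence in the style of a threat report.
    sample = (
        "The dropper contacted 192.168.10.5 and exploited CVE-2021-34527; "
        "payload hash d41d8cd98f00b204e9800998ecf8427e."
    )
    for kind, values in extract_iocs(sample).items():
        print(kind, values)
```

Pattern matching of this kind only covers indicators with a fixed surface form; extracting TTPs or attack descriptions needs the entity recognition and classification techniques the survey lists.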

information retrieval, machine learning, natural language, (17 more...)

doi: 10.1145/3571726

2109.06808

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.14)
North America > United States > Utah (0.04)
North America > United States > Virginia (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
(3 more...)

BenchIE: Open Information Extraction Evaluation Based on Facts, Not Tokens

Gashteovski, Kiril, Yu, Mingying, Kotnis, Bhushan, Lawrence, Carolin, Glavas, Goran, Niepert, Mathias

arXiv.org Artificial Intelligence, Sep-14-2021

Intrinsic evaluations of OIE systems are carried out either manually -- with human evaluators judging the correctness of extractions -- or automatically, on standardized benchmarks. The latter, while much more cost-effective, is less reliable, primarily because of the incompleteness of the existing OIE benchmarks: the ground truth extractions do not include all acceptable variants of the same fact, leading to unreliable assessment of models' performance. Moreover, the existing OIE benchmarks are available for English only. In this work, we introduce BenchIE: a benchmark and evaluation framework for comprehensive evaluation of OIE systems for English, Chinese and German. In contrast to existing OIE benchmarks, BenchIE takes into account informational equivalence of extractions: our gold standard consists of fact synsets, clusters in which we exhaustively list all surface forms of the same fact. We benchmark several state-of-the-art OIE systems using BenchIE and demonstrate that these systems are significantly less effective than indicated by existing OIE benchmarks. We make BenchIE (data and evaluation code) publicly available.
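
The fact-synset idea can be sketched as follows: a system extraction counts as correct if it matches any surface form listed in any synset, and a fact counts as found if at least one of its surface forms was extracted. This is a rough sketch under assumptions, not the released BenchIE evaluation code: exact tuple matching, the way duplicates are handled, the function name `score_extractions`, and the example triples are all hypothetical.

```python
def score_extractions(extractions, fact_synsets):
    """Score (subject, relation, object) triples against fact synsets.

    extractions:  iterable of triples produced by a system.
    fact_synsets: list of sets, each holding every accepted surface form
                  of one fact.
    Returns (precision, recall, f1).
    """
    # Drop exact duplicate extractions while preserving order.
    extractions = list(dict.fromkeys(extractions))

    # An extraction is correct if it appears in any synset.
    matched = [e for e in extractions if any(e in s for s in fact_synsets)]
    # A fact is covered if any of its surface forms was extracted.
    covered = [s for s in fact_synsets if any(e in s for e in extractions)]

    precision = len(matched) / len(extractions) if extractions else 0.0
    recall = len(covered) / len(fact_synsets) if fact_synsets else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Invented gold standard for "Marie Curie, born in Warsaw, won two Nobel Prizes."
    gold = [
        {("Marie Curie", "born in", "Warsaw"),
         ("Marie Curie", "was born in", "Warsaw")},
        {("Marie Curie", "won", "two Nobel Prizes"),
         ("Marie Curie", "won", "Nobel Prizes")},
    ]
    system = [
        ("Marie Curie", "was born in", "Warsaw"),
        ("Marie Curie", "born in", "Warsaw"),
        ("Marie Curie", "won", "Warsaw"),
    ]
    print(score_extractions(system, gold))
```

Because recall is counted per fact rather than per extraction, emitting several paraphrases of the same fact does not raise it, which is the contrast with token-overlap scoring that the abstract draws.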

data mining, extraction, natural language, (18 more...)

2109.0685

Country:

Oceania > Australia (0.05)
North America > United States > Illinois > Cook County > Chicago (0.05)
Africa > Kenya (0.05)
(5 more...)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area (0.48)
Government (0.47)
Leisure & Entertainment > Sports > Basketball (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.67)
Information Technology > Data Science > Data Mining > Text Mining (0.43)

Not All Negatives are Equal: Label-Aware Contrastive Loss for Fine-grained Text Classification

Suresh, Varsha, Ong, Desmond C.

arXiv.org Artificial Intelligence, Sep-12-2021

Fine-grained classification involves dealing with datasets that have a larger number of classes with subtle differences between them. Guiding the model to focus on the dimensions that differentiate these commonly confusable classes is key to improving performance on fine-grained tasks. In this work, we analyse the contrastive fine-tuning of pre-trained language models on two fine-grained text classification tasks, emotion classification and sentiment analysis. We adaptively embed class relationships into a contrastive objective function to weigh positives and negatives differently, in particular weighting closely confusable negatives more than less similar negative examples. We find that Label-aware Contrastive Loss outperforms previous contrastive methods in the presence of a larger number of classes and/or more confusable classes, and helps models produce output distributions that are more differentiated.
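
The effect of weighting confusable negatives more heavily can be sketched with a supervised contrastive loss whose terms are scaled by a class-relationship weight. This is a simplification and not the authors' formulation: it assumes a fixed, hand-written table `class_weight` where the paper describes class relationships that are embedded adaptively, and the function name, similarity values, and `temperature` default are hypothetical.

```python
import math


def weighted_contrastive_loss(sims, labels, anchor, class_weight, temperature=0.1):
    """Contrastive loss for one anchor with class-dependent weights.

    sims:         sims[i][j] is the similarity between examples i and j.
    labels:       class index of each example.
    anchor:       index of the anchor example.
    class_weight: class_weight[a][b] > 0 is the weight given to examples of
                  class b when the anchor has class a (assumed fixed here);
                  larger values mean class b is more confusable with class a.
    """
    y = labels[anchor]
    others = [k for k in range(len(labels)) if k != anchor]
    positives = [k for k in others if labels[k] == y]
    if not positives:
        return 0.0

    # Weighted sum over all other examples; confusable classes contribute more.
    denom = sum(
        class_weight[y][labels[k]] * math.exp(sims[anchor][k] / temperature)
        for k in others
    )
    loss = 0.0
    for p in positives:
        num = class_weight[y][y] * math.exp(sims[anchor][p] / temperature)
        loss -= math.log(num / denom)
    return loss / len(positives)


if __name__ == "__main__":
    sims = [[1.0, 0.8, 0.6], [0.8, 1.0, 0.5], [0.6, 0.5, 1.0]]
    labels = [0, 0, 1]
    confusable = [[1.0, 0.9], [0.9, 1.0]]
    distinct = [[1.0, 0.1], [0.1, 1.0]]
    print(weighted_contrastive_loss(sims, labels, 0, confusable))
    print(weighted_contrastive_loss(sims, labels, 0, distinct))
```

With the same similarities, the loss is larger when the negative's class is marked as more confusable, so the gradient pushes harder against exactly those negatives.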

classification, computational linguistic, proceedings, (14 more...)

2109.05427

Country:

Asia > Singapore (0.04)
Asia > China > Hong Kong (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(6 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.71)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
