AITopics | Information Extraction

Information Extraction

Combining Deep Neural Reranking and Unsupervised Extraction for Multi-Query Focused Summarization

Seeberger, Philipp, Riedhammer, Korbinian

arXiv.org Artificial Intelligence, Feb-2-2023

The CrisisFACTS Track aims to tackle challenges such as multi-stream fact-finding in the domain of event tracking; participants' systems extract important facts from several disaster-related events while incorporating the temporal order. We propose a combination of retrieval, reranking, and the well-known Integer Linear Programming (ILP) and Maximal Marginal Relevance (MMR) frameworks. In the former two modules, we explore various methods including an entity-based baseline, pre-trained and fine-tuned Question Answering systems, and ColBERT. We then use the latter module as an extractive summarization component by taking diversity and novelty criteria into account. The automatic scoring runs show strong results across the evaluation setups but also reveal shortcomings and challenges.
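The diversity and novelty criteria mentioned in the abstract above are what Maximal Marginal Relevance trades off against relevance. The following is a minimal sketch of generic greedy MMR selection, not the authors' implementation: the vector representation, the cosine similarity, the weight `lam`, and the function names are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mmr_select(query_vec, cand_vecs, k, lam=0.7):
    """Greedily pick up to k candidate indices by Maximal Marginal Relevance.

    Each step picks the candidate maximizing
        lam * sim(candidate, query) - (1 - lam) * max sim(candidate, already selected)
    so lam=1.0 ranks purely by relevance and smaller lam favors novelty.
    """
    selected = []
    remaining = list(range(len(cand_vecs)))

    def mmr_score(i):
        relevance = cosine(query_vec, cand_vecs[i])
        # Redundancy is the closest match among items already chosen.
        redundancy = max(
            (cosine(cand_vecs[i], cand_vecs[j]) for j in selected),
            default=0.0,
        )
        return lam * relevance - (1.0 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With a lower `lam`, a near-duplicate of an already selected sentence is passed over in favor of a less relevant but more novel one; in a real pipeline the vectors would come from whatever sentence encoder the system uses.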

machine learning, natural language, question answering, (13 more...)

arXiv.org Artificial Intelligence

2302.01148

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.14)
North America > United States > Washington > King County > Seattle (0.04)
(12 more...)

Genre: Research Report (0.64)

Industry: Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.50)

Top 10 Applications of Sentiment Analysis in Business

#artificialintelligence, Feb-1-2023, 08:31:01 GMT

We are all aware of the Internet's explosive expansion as a primary source of information and a platform for expressing opinions. It has become essential to gather and analyze the ever-expanding data that results. While manual analysis of data was possible in the past and even served us well, the same cannot be said of the digital era. Suppose a large chunk of data had to be analyzed manually. Can you do the math on the time and resources that would take?

artificial intelligence, natural language, sentiment analysis, (14 more...)

#artificialintelligence

Industry:

Banking & Finance (1.00)
Information Technology (0.95)
Government > Voting & Elections (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.76)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.61)

Automated Sentiment and Hate Speech Analysis of Facebook Data by Employing Multilingual Transformer Models

Manuvie, Ritumbra, Chatterjee, Saikat

arXiv.org Artificial Intelligence, Jan-31-2023

In recent years, there has been a heightened consensus - both within academia and in the public discourse - that Social Media Platforms (SMPs) amplify the spread of hateful and negative sentiment content. Researchers have identified how hateful content, political propaganda, and targeted messaging contributed to real-world harms including insurrections against democratically elected governments, genocide, and breakdown of social cohesion due to heightened negative discourse towards certain communities in parts of the world. To counter these issues, SMPs have created semi-automated systems that can help identify toxic speech. In this paper we analyse the statistical distribution of hateful and negative sentiment contents within a representative Facebook dataset (n = 604,703) scraped from 648 public Facebook pages which identify themselves as proponents (and followers) of far-right Hindutva actors. These pages were identified manually using keyword searches on Facebook and on CrowdTangle and classified as far-right Hindutva pages based on page names, page descriptions, and discourses shared on these pages. We employ state-of-the-art, open-source XLM-T multilingual transformer-based language models to perform sentiment and hate speech analysis of the textual contents shared on these pages over a period of 5.5 years. The results show the statistical distributions of the predicted sentiment and hate speech labels, the top actors, and the top page categories. We further discuss the benchmark performances and limitations of these pre-trained language models.

information retrieval, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2301.13668

Country:

Asia > India (0.05)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
Europe > Netherlands > South Holland > The Hague (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Media (1.00)
Information Technology > Services (1.00)
Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.54)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.38)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

EDSA-Ensemble: an Event Detection Sentiment Analysis Ensemble Architecture

Petrescu, Alexandru, Truică, Ciprian-Octavian, Apostol, Elena-Simona, Paschke, Adrian

arXiv.org Artificial Intelligence, Jan-30-2023

As social media platforms grow each day, so does the need to analyze and understand certain aspects of them, such as the impact of important or spiking topics over the network [49]. Event Detection techniques are used to automatically identify important or spiking topics by analysing social media data. In this paper, we use the angle of the positive emotion generated by these topics for the users and the magnitude, both reach and time span, in order to better understand what is happening on social media platforms, mainly Twitter. Sentiment Analysis is a field in Natural Language Processing that analyzes user opinions and emotions from written language [38, 66], while Event Detection deals with analyzing information diffusion in graph networks [24]. Although there is a large volume of work on Event Detection using social media data and on Sentiment Analysis of this type of content, the current literature has a shortage of approaches that combine the two domains. Multiple communities are involved in mining, gathering, and giving some meaning to the vast amount of content generated daily by the users of those platforms, namely the Network Analysis and Natural Language Processing communities. The two communities use different types of approaches since they have different purposes: for the Network Analysis community, the main purpose is developing methods to deal with the spread and mitigation of harmful content using Event Detection. Event Detection is used to detect the impact and spread of topics on Social Networks using multiple types of approaches such as sliding windows, topic detection, etc.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2301.12805

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
Europe > Germany > Berlin (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(6 more...)

Improving the Modality Representation with Multi-View Contrastive Learning for Multimodal Sentiment Analysis

Liu, Peipei, Zheng, Xin, Li, Hong, Liu, Jie, Ren, Yimo, Zhu, Hongsong, Sun, Limin

arXiv.org Artificial Intelligence, Jan-28-2023

Modality representation learning is an important problem for multimodal sentiment analysis (MSA), since highly distinguishable representations can help improve analysis performance. Previous MSA work has usually focused on multimodal fusion strategies, and in-depth study of modality representation learning has received less attention. Recently, contrastive learning has been shown to be effective at endowing the learned representation with stronger discriminative ability. Inspired by this, we explore approaches for improving modality representation with contrastive learning in this study. To this end, we devise a three-stage framework with multi-view contrastive learning to refine representations for the specific objectives. At the first stage, to improve unimodal representations, we employ supervised contrastive learning to pull samples within the same class together while the other samples are pushed apart. At the second stage, a self-supervised contrastive learning scheme is designed to improve the distilled unimodal representations after cross-modal interaction. Finally, we again leverage supervised contrastive learning to enhance the fused multimodal representation. After all the contrastive training stages, we perform the classification task on frozen representations. We conduct experiments on three open datasets, and the results show the advantages of our model.
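The "pull same-class samples together, push the others apart" idea from the first stage can be illustrated with a commonly used form of the supervised contrastive loss. This is a minimal pure-Python sketch under stated assumptions, not the objective used in the paper: the temperature value, the L2 normalization, the averaging over anchors, and the function names are illustrative choices, and embeddings are assumed to be non-zero vectors.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive.

    For anchor i with positives P(i) (other samples sharing its label):
        loss_i = -(1 / |P(i)|) * sum over p in P(i) of
                 log( exp(z_i . z_p / T) / sum over a != i of exp(z_i . z_a / T) )
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class partner contributes nothing
        # Temperature-scaled similarities to every other sample in the batch.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0
```

The loss is small when samples sharing a label already sit close together and large when same-label samples are far apart, which is the gradient signal that pulls classes into clusters during training.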

artificial intelligence, machine learning, natural language, (13 more...)

arXiv.org Artificial Intelligence

2210.15824

Country:

North America > United States > New York > New York County > New York City (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.73)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)

Presence of informal language, such as emoticons, hashtags, and slang, impact the performance of sentiment analysis models on social media text?

Ganie, Aadil Gani

arXiv.org Artificial Intelligence, Jan-28-2023

This study aimed to investigate the influence of informal language, such as emoticons and slang, on the performance of sentiment analysis models applied to social media text. A convolutional neural network (CNN) model was developed and trained on three datasets: a sarcasm dataset, a sentiment dataset, and an emoticon dataset. The model architecture was held constant for all experiments, and the model was trained on 80% of the data and tested on 20%. The results revealed that the model achieved an accuracy of 96.47% on the sarcasm dataset, with the lowest accuracy for class 1. On the sentiment dataset, the model achieved an accuracy of 95.28%. The amalgamation of sarcasm and sentiment datasets improved the accuracy of the model to 95.1%, and the addition of the emoticon dataset had a slight positive impact, raising the accuracy to 95.37%. The study suggests that the presence of informal language has a limited impact on the performance of sentiment analysis models applied to social media text. However, including emoticon data in the model can enhance the accuracy slightly.
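The 80%/20% train/test protocol described above can be sketched as follows. This is only an illustration of a shuffled holdout split: the abstract does not say how the split was drawn, so the shuffling, the fixed seed, and the function name are assumptions.

```python
import random


def holdout_split(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and split it into (train, test) partitions."""
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Example: 100 labelled texts, here represented only by their indices.
train, test = holdout_split(range(100))
```

Fixing the seed matters when several datasets or model variants are compared, since every run is then evaluated on the same held-out examples.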

machine learning, natural language, sentiment analysis model, (15 more...)

arXiv.org Artificial Intelligence

2301.12303

Country: Europe > Hungary > Borsod-Abaúj-Zemplén County > Miskolc (0.05)

Genre: Research Report > New Finding (0.69)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.51)
Information Technology > Services (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Down the Rabbit Hole: Detecting Online Extremism, Radicalisation, and Politicised Hate Speech

Govers, Jarod, Feldman, Philip, Dant, Aaron, Patros, Panos

arXiv.org Artificial Intelligence, Jan-27-2023

Social media is a modern person's digital voice to project and engage with new ideas and mobilise communities - a power shared with extremists. Given the societal risks of unvetted content-moderating algorithms for Extremism, Radicalisation, and Hate speech (ERH) detection, responsible software engineering must understand the who, what, when, where, and why such models are necessary to protect user safety and free expression. Hence, we propose and examine the unique research field of ERH context mining to unify disjoint studies. Specifically, we evaluate the start-to-finish design process from socio-technical definition-building and dataset collection strategies to technical algorithm design and performance. Our 2015-2021 51-study Systematic Literature Review (SLR) provides the first cross-examination of textual, network, and visual approaches to detecting extremist affiliation, hateful content, and radicalisation towards groups and movements. We identify consensus-driven ERH definitions and propose solutions to existing ideological and geographic biases, particularly due to the lack of research in Oceania/Australasia. Our hybridised investigation on Natural Language Processing, Community Detection, and visual-text models demonstrates the dominating performance of textual transformer-based algorithms. We conclude with vital recommendations for ERH context mining researchers and propose an uptake roadmap with guidelines for researchers, industries, and governments to enable a safer cyberspace.

data mining, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2301.11579

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > New York > New York County > New York City (0.14)
(31 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)

Industry:

Media > News (1.00)
Law Enforcement & Public Safety > Terrorism (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
(7 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(5 more...)

Jointly Identifying and Fixing Inconsistent Readings from Information Extraction Systems

Padia, Ankur, Ferraro, Francis, Finin, Tim

arXiv.org Artificial Intelligence, Jan-26-2023

KGCleaner is a framework to identify and correct errors in data produced and delivered by an information extraction system. These tasks have been understudied, and KGCleaner is the first to address both. We introduce a multi-task model that jointly learns to predict whether an extracted relation is credible and to repair it if not. We evaluate our approach and other models as instances of our framework on two collections: a Wikidata corpus of nearly 700K facts and 5M fact-relevant sentences, and a collection of 30K facts from the 2015 TAC Knowledge Base Population task. For credibility classification, a parameter-efficient, simple shallow neural network can achieve an absolute performance gain of 30 $F_1$ points on Wikidata and comparable performance on TAC. For the repair task, a significant performance gain (more than twofold) can be obtained depending on the nature of the dataset and the models.

machine learning, natural language, provenance sentence, (20 more...)

arXiv.org Artificial Intelligence

1808.04816

Country:

Africa > Kenya (0.14)
North America > United States > Maryland > Baltimore County (0.14)
North America > United States > Maryland > Baltimore (0.14)
(5 more...)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.94)
Health & Medicine (0.68)
Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Review of Natural Language Processing in Pharmacology

Trajanov, Dimitar, Trajkovski, Vangel, Dimitrieva, Makedonka, Dobreva, Jovana, Jovanovik, Milos, Klemen, Matej, Žagar, Aleš, Robnik-Šikonja, Marko

arXiv.org Artificial Intelligence, Jan-26-2023

Natural language processing (NLP) is an area of artificial intelligence that applies information technologies to process the human language, understand it to a certain degree, and use it in various applications. This area has rapidly developed in the last few years and now employs modern variants of deep neural networks to extract relevant patterns from large text corpora. The main objective of this work is to survey the recent use of NLP in the field of pharmacology. As our work shows, NLP is a highly relevant information extraction and processing approach for pharmacology. It has been used extensively, from intelligent searches through thousands of medical documents to finding traces of adversarial drug interactions in social media. We split our coverage into five categories to survey modern NLP methodology, commonly addressed tasks, relevant textual data, knowledge bases, and useful programming libraries. We split each of the five categories into appropriate subcategories, describe their main properties and ideas, and summarize them in a tabular form. The resulting survey presents a comprehensive overview of the area, useful to practitioners and interested observers.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2208.10228

Country:

Europe > North Macedonia > Skopje Statistical Region > Skopje Municipality > Skopje (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(12 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.93)
Research Report > Experimental Study (0.92)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(3 more...)

VAuLT: Augmenting the Vision-and-Language Transformer for Sentiment Classification on Social Media

Chochlakis, Georgios, Srinivasan, Tejas, Thomason, Jesse, Narayanan, Shrikanth

arXiv.org Artificial Intelligence, Jan-25-2023

We propose the Vision-and-Augmented-Language Transformer (VAuLT). VAuLT is an extension of the popular Vision-and-Language Transformer (ViLT), and improves performance on vision-and-language (VL) tasks that involve more complex text inputs than image captions while having minimal impact on training and inference efficiency. ViLT, importantly, enables efficient training and inference in VL tasks, achieved by encoding images using a linear projection of patches instead of an object detector. However, it is pretrained on captioning datasets, where the language input is simple, literal, and descriptive, therefore lacking linguistic diversity. So, when working with multimedia data in the wild, such as multimodal social media data, there is a notable shift from captioning language data, as well as diversity of tasks. We indeed find evidence that the language capacity of ViLT is lacking. The key insight and novelty of VAuLT is to propagate the output representations of a large language model (LM) like BERT to the language input of ViLT. We show that joint training of the LM and ViLT can yield relative improvements up to 20% over ViLT and achieve state-of-the-art or comparable performance on VL tasks involving richer language inputs and affective constructs, such as for Target-Oriented Sentiment Classification in TWITTER-2015 and TWITTER-2017, and Sentiment Classification in MVSA-Single and MVSA-Multiple. Our code is available at https://github.com/gchochla/VAuLT.

machine learning, natural language, text classification, (18 more...)

arXiv.org Artificial Intelligence

2208.09021

Country: North America > United States > California > Los Angeles County > Los Angeles (0.29)

Genre: Research Report (0.82)

Industry:

Information Technology (0.46)
Government (0.46)
Media (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)
(2 more...)
