AITopics | Information Extraction

Information Extraction

Unleashing the Power of Emojis in Texts via Self-supervised Graph Pre-Training

Zhang, Zhou, Tan, Dongzeng, Wang, Jiaan, Chen, Yilong, Xu, Jiarong

arXiv.org Artificial Intelligence Sep-25-2024

Emojis have gained immense popularity on social platforms, serving as a common means to supplement or replace text. However, existing data mining approaches generally either ignore emojis completely or simply treat them as ordinary Unicode characters, which may limit the model's ability to grasp the rich semantic information in emojis and the interaction between emojis and texts. Thus, it is necessary to unleash the power of emojis in social media data mining. To this end, we first construct a heterogeneous graph consisting of three types of nodes, i.e., post, word, and emoji nodes, to improve the representation of different elements in posts. The edges are also well-defined to model how these three elements interact with each other. To facilitate the sharing of information among post, word, and emoji nodes, we propose a graph pre-training framework for text and emoji co-modeling, which contains two graph pre-training tasks: node-level graph contrastive learning and edge-level link reconstruction learning. Extensive experiments on the Xiaohongshu and Twitter datasets with two types of downstream tasks demonstrate that our approach yields significant improvements over previous strong baseline methods.
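As a rough illustration of the graph construction the abstract describes, the sketch below builds a tiny heterogeneous graph with post, word, and emoji nodes. The abstract does not specify the edge definitions, so the three edge types used here (post-word, post-emoji, and word-emoji co-occurrence within a post), the whitespace tokenization, and the small `EMOJIS` inventory are all assumptions made for illustration; none of the names come from the paper.

```python
from collections import defaultdict

# Hypothetical emoji inventory for the demo (fire, crying face).
# A real pipeline would rely on a complete emoji lexicon.
FIRE = "\U0001F525"
CRY = "\U0001F622"
EMOJIS = {FIRE, CRY}


def build_hetero_graph(posts):
    """Build a post/word/emoji graph from whitespace-tokenized posts.

    Returns (nodes, edges) where
      nodes maps a node type to a set of node ids, and
      edges maps a (source type, target type) pair to a set of (src, dst) pairs.
    """
    nodes = {"post": set(), "word": set(), "emoji": set()}
    edges = defaultdict(set)
    for post_id, text in enumerate(posts):
        nodes["post"].add(post_id)
        tokens = text.split()
        words = [t.lower() for t in tokens if t not in EMOJIS]
        emojis = [t for t in tokens if t in EMOJIS]
        for w in words:
            nodes["word"].add(w)
            edges[("post", "word")].add((post_id, w))
        for e in emojis:
            nodes["emoji"].add(e)
            edges[("post", "emoji")].add((post_id, e))
            # Assumed edge type: a word and an emoji co-occurring in one post.
            for w in words:
                edges[("word", "emoji")].add((w, e))
    return nodes, dict(edges)


posts = ["great day " + FIRE, "sad day " + CRY]
nodes, edges = build_hetero_graph(posts)
print(sorted(nodes["word"]))
print(len(edges[("word", "emoji")]))
```

The typed edge dictionary keeps each relation separate, which is the shape that heterogeneous graph libraries generally expect; the pre-training tasks themselves (contrastive learning and link reconstruction) are not sketched here.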

emoji, node, representation, (12 more...)

arXiv.org Artificial Intelligence

2409.14552

Country:

Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
(2 more...)

Enhancing Aspect-based Sentiment Analysis in Tourism Using Large Language Models and Positional Information

Xu, Chun, Wang, Mengmeng, Ren, Yan, Zhu, Shaolin

arXiv.org Artificial Intelligence Sep-23-2024

Aspect-Based Sentiment Analysis (ABSA) in tourism plays a significant role in understanding tourists' evaluations of specific aspects of attractions, which is crucial for driving innovation and development in the tourism industry. However, traditional pipeline models are afflicted by issues such as error propagation and incomplete extraction of sentiment elements. To alleviate these issues, this paper proposes an aspect-based sentiment analysis model, ACOS_LLM, for Aspect-Category-Opinion-Sentiment Quadruple Extraction (ACOSQE). The model comprises two key stages: auxiliary knowledge generation and ACOSQE. First, AdaLoRA is used to fine-tune large language models for generating high-quality auxiliary knowledge. To enhance model efficiency, SparseGPT is utilized to compress the fine-tuned model to 50% sparsity. Subsequently, positional information and sequence modeling are employed to achieve the ACOSQE task, with auxiliary knowledge and the original text as inputs. Experiments are conducted on both self-created tourism datasets and the publicly available datasets Rest15 and Rest16. Results demonstrate the model's superior performance, with an F1 improvement of 7.49% compared to other models on the tourism dataset. Additionally, there is an F1 improvement of 0.05% and 1.06% on the Rest15 and Rest16 datasets, respectively.
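To make the quadruple format and the F1 figures above more concrete, the following sketch represents an aspect-category-opinion-sentiment quadruple as a named tuple and computes micro-averaged precision, recall, and F1. The exact-match criterion (a predicted quadruple counts only if all four elements match a gold quadruple) is an assumption; the abstract does not state the matching rule. The `Quad` type, the `quad_f1` function, and the category labels are hypothetical.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    aspect: str
    category: str
    opinion: str
    sentiment: str


def quad_f1(gold, pred):
    """Micro-averaged exact-match precision, recall, and F1.

    gold and pred are parallel lists; each item holds the quadruples
    for one sentence.
    """
    tp = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        g, p = set(g), set(p)
        tp += len(g & p)      # quadruples matching on all four fields
        n_pred += len(p)
        n_gold += len(g)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: the second prediction has the wrong opinion term.
gold = [[Quad("view", "scenery#general", "stunning", "positive"),
         Quad("ticket", "price#general", "expensive", "negative")]]
pred = [[Quad("view", "scenery#general", "stunning", "positive"),
         Quad("ticket", "price#general", "cheap", "negative")]]
print(quad_f1(gold, pred))  # (0.5, 0.5, 0.5)
```

Under exact matching a single wrong element discards the whole quadruple, which is why incomplete extraction in a pipeline model can depress F1 sharply.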

extraction, information, sentiment analysis, (13 more...)

arXiv.org Artificial Intelligence

2409.14997

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)
Asia > China > Liaoning Province > Dalian (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Opinion Mining on Offshore Wind Energy for Environmental Engineering

Bittencourt, Isabele, Varde, Aparna S., Lal, Pankaj

arXiv.org Artificial Intelligence Sep-21-2024

In this paper, we conduct sentiment analysis on social media data to study mass opinion about offshore wind energy. We adapt three machine learning models, namely TextBlob, VADER, and SentiWordNet, because each provides different functions. TextBlob provides subjectivity analysis as well as polarity classification. VADER offers cumulative sentiment scores. SentiWordNet considers sentiments with reference to context and performs classification accordingly. Techniques in NLP are harnessed to gather meaning from the textual data in social media. Data visualization tools are suitably deployed to display the overall results. This work is closely aligned with citizen science and smart governance, involving mass opinion to guide decision support. It exemplifies the role of machine learning and NLP in this setting.

artificial intelligence, natural language, offshore wind energy, (16 more...)

arXiv.org Artificial Intelligence

2409.14292

Country:

Europe > Germany (0.05)
North America > United States > New Jersey > Atlantic County > Atlantic City (0.04)
Europe > United Kingdom (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Industry: Energy > Renewable > Wind (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Cross-Target Stance Detection: A Survey of Techniques, Datasets, and Challenges

Khiabani, Parisa Jamadi, Zubiaga, Arkaitz

arXiv.org Artificial Intelligence Sep-20-2024

Stance detection is the task of determining the viewpoint expressed in a text towards a given target. A specific direction within the task focuses on cross-target stance detection, where a model trained on samples pertaining to certain targets is then applied to a new, unseen target. With the increasing need to analyze and mine viewpoints and opinions online, the task has recently seen a significant surge in interest. This review paper examines the advancements in cross-target stance detection over the last decade, highlighting the evolution from basic statistical methods to contemporary neural and LLM-based models. These advancements have led to notable improvements in accuracy and adaptability. Innovative approaches include the use of topic-grouped attention and adversarial learning for zero-shot detection, as well as fine-tuning techniques that enhance model robustness. Additionally, prompt-tuning methods and the integration of external knowledge have further refined model performance. A comprehensive overview of the datasets used for evaluating these models is also provided, offering valuable insights into the progress and challenges in the field. We conclude by highlighting emerging directions of research and by suggesting avenues for future work in the task.

dataset, detection, stance detection, (12 more...)

arXiv.org Artificial Intelligence

2409.13594

Country:

North America > United States (1.00)
Europe > France > Bourgogne-Franche-Comté > Doubs > Besançon (0.04)
Asia > Nepal (0.04)
(4 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Health & Medicine > Health Care Providers & Services (0.92)
Banking & Finance > Insurance (0.92)
(5 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
(3 more...)

Graph Neural Network Framework for Sentiment Analysis Using Syntactic Feature

Wu, Linxiao, Luo, Yuanshuai, Zhu, Binrong, Liu, Guiran, Wang, Rui, Yu, Qian

arXiv.org Artificial Intelligence Sep-20-2024

Amid the rapid evolution of social media platforms and e-commerce ecosystems, opinion mining has become a pivotal research area within natural language processing. A specialized segment of this field focuses on extracting nuanced evaluations tied to particular elements within textual contexts. This research advances a composite framework that incorporates the positional cues of topical descriptors. The proposed system converts syntactic structures into a matrix format, leveraging convolutions and attention mechanisms within a graph to distill salient characteristics. Incorporating the positional relevance of descriptors relative to lexical items enhances the sequential integrity of the input. Experiments show that this integrated graph-centric scheme markedly improves the effectiveness of evaluative categorization and demonstrates superior performance.

dataset, neural network, sentiment analysis, (10 more...)

arXiv.org Artificial Intelligence

2409.14

Country:

North America > United States > California > San Francisco County > San Francisco (0.05)
North America > United States > New York (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report (0.82)

Industry: Information Technology (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.75)

Lexicon-Based Sentiment Analysis on Text Polarities with Evaluation of Classification Models

Raees, Muhammad, Fazilat, Samina

arXiv.org Artificial Intelligence Sep-19-2024

Sentiment analysis has diverse potential applications on digital platforms. It extracts polarity to understand the intensity and subjectivity in text. This work uses a lexicon-based method to perform sentiment analysis and presents an evaluation of classification models trained on textual data. Lexicon-based methods identify the intensity of emotion and subjectivity at the word level. The categorization identifies the informative words inside a text and specifies a quantitative ranking of the polarity of words. This work addresses a multi-class problem in which text is labeled as positive, negative, or neutral. A Twitter sentiment dataset containing 1.6 million unprocessed tweets is used with lexicon-based methods such as TextBlob and VADER to introduce a neutrality measure on text. The analysis of lexicons shows how word count and intensity classify the text. A comparative analysis of machine learning models, namely Naive Bayes, Support Vector Machines, Multinomial Logistic Regression, Random Forest, and Extreme Gradient Boosting (XGBoost), is performed across multiple performance metrics. The best results are achieved by Random Forest, with an accuracy score of 81%. Additionally, sentiment analysis is applied to a personality judgment case against a Twitter profile based on online activity.
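The three-class labeling described above can be sketched with a toy word-level lexicon and a neutral band around zero. This is not the TextBlob or VADER implementation: the `LEXICON` entries, the averaging over all tokens, and the `threshold` value are assumptions chosen only to show how a neutrality measure can fall out of word-level polarity scores.

```python
# Hypothetical word-level polarity lexicon (positive values = positive sentiment).
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}


def polarity(text):
    """Average lexicon score over all tokens; words not in the lexicon count as 0."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(LEXICON.get(t, 0.0) for t in tokens) / len(tokens)


def classify(text, threshold=0.1):
    """Map a polarity score to positive, negative, or neutral.

    Scores inside [-threshold, threshold] are treated as neutral.
    """
    score = polarity(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


for tweet in ["Great movie!", "Terrible plot.", "The movie starts at nine."]:
    print(tweet, "->", classify(tweet))
```

Dividing by the total token count means that both word count and word intensity influence the label, which mirrors the abstract's remark that these two factors drive the classification.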

classification model, lexicon-based sentiment analysis, text polarity, (1 more...)

arXiv.org Artificial Intelligence

2409.1284

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)

Mpox Narrative on Instagram: A Labeled Multilingual Dataset of Instagram Posts on Mpox for Sentiment, Hate Speech, and Anxiety Analysis

Thakur, Nirmalya

arXiv.org Artificial Intelligence Sep-18-2024

The world is currently experiencing an outbreak of mpox, which has been declared a Public Health Emergency of International Concern by the WHO. No prior work related to social media mining has focused on the development of a dataset of Instagram posts about the mpox outbreak. The work presented in this paper aims to address this research gap and makes two scientific contributions to this field. First, it presents a multilingual dataset of 60,127 Instagram posts about mpox, published between July 23, 2022, and September 5, 2024. The dataset, available at https://dx.doi.org/10.21227/7fvc-y093, contains Instagram posts about mpox in 52 languages. For each of these posts, the Post ID, Post Description, Date of publication, language, and translated version of the post (translation to English was performed using the Google Translate API) are presented as separate attributes in the dataset. After developing this dataset, sentiment analysis, hate speech detection, and anxiety or stress detection were performed. This process included classifying each post into (i) one of the sentiment classes, i.e., fear, surprise, joy, sadness, anger, disgust, or neutral, (ii) hate or not hate, and (iii) anxiety/stress detected or no anxiety/stress detected. These results are presented as separate attributes in the dataset. Second, this paper presents the results of performing sentiment analysis, hate speech analysis, and anxiety or stress analysis. The shares of the sentiment classes fear, surprise, joy, sadness, anger, disgust, and neutral were observed to be 27.95%, 2.57%, 8.69%, 5.94%, 2.69%, 1.53%, and 50.64%, respectively. In terms of hate speech detection, 95.75% of the posts did not contain hate and the remaining 4.25% of the posts contained hate. Finally, 72.05% of the posts did not indicate any anxiety/stress, and the remaining 27.95% of the posts represented some form of anxiety/stress.

dataset, detection, outbreak, (12 more...)

arXiv.org Artificial Intelligence

2409.05292

Country:

Africa > Democratic Republic of the Congo (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)
South America > Brazil (0.04)
(15 more...)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.89)

RUIE: Retrieval-based Unified Information Extraction using Large Language Model

Liao, Xincheng, Duan, Junwen, Huang, Yixi, Wang, Jianxin

arXiv.org Artificial Intelligence Sep-17-2024

Unified information extraction (UIE) aims to complete all information extraction tasks using a single model or framework. While previous work has primarily focused on instruction-tuning large language models (LLMs) with constructed datasets, these methods require significant computational resources and struggle to generalize to unseen tasks. To address these limitations, we propose RUIE (Retrieval-based Unified Information Extraction), a framework that leverages in-context learning to enable rapid generalization while reducing computational costs. The key challenge in RUIE is selecting the most beneficial demonstrations for LLMs to effectively handle diverse IE tasks. To achieve this, we integrate LLM preferences for ranking candidate demonstrations and design a keyword-enhanced reward model to capture fine-grained relationships between queries and demonstrations. We then train a bi-encoder retriever for UIE through contrastive learning and knowledge distillation. To the best of our knowledge, RUIE is the first trainable retrieval framework for UIE. Experimental results on 8 held-out datasets demonstrate RUIE's effectiveness in generalizing to unseen tasks, with average F1-score improvements of 19.22 and 3.13 compared to instruction-tuning methods and other retrievers, respectively. Further analysis confirms RUIE's adaptability to LLMs of varying sizes and the importance of its key components.
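The retrieval step behind in-context learning can be sketched as follows: score each candidate demonstration against the query, keep the top k, and place them ahead of the query in the prompt. This is only a stand-in, not RUIE's method. It uses bag-of-words cosine similarity where the paper trains a bi-encoder retriever, and the function names, the demonstration format, and the prompt template are all hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; a trained encoder would replace this."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, pool, k=2):
    """Return the k demonstrations whose inputs are most similar to the query."""
    q = embed(query)
    ranked = sorted(pool, key=lambda d: cosine(q, embed(d["input"])), reverse=True)
    return ranked[:k]


def build_prompt(query, demos):
    """Concatenate retrieved demonstrations, then the query with an open output slot."""
    blocks = [f"Input: {d['input']}\nOutput: {d['output']}" for d in demos]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


# Made-up demonstration pool for a relation extraction style task.
pool = [
    {"input": "Paris is the capital of France", "output": "(Paris, capital_of, France)"},
    {"input": "The stock fell sharply", "output": "(stock, event, fall)"},
    {"input": "Berlin is the capital of Germany", "output": "(Berlin, capital_of, Germany)"},
]
query = "Rome is the capital of Italy"
print(build_prompt(query, retrieve(query, pool, k=2)))
```

Because no model weights are updated, adding a new task only requires adding demonstrations to the pool, which is the source of the reduced computational cost the abstract mentions.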

computational linguistic, dataset, llm, (12 more...)

arXiv.org Artificial Intelligence

2409.11673

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York > New York County > New York City (0.04)
(11 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Cross-Lingual News Event Correlation for Stock Market Trend Prediction

Arshad, Sahar, Azhar, Nikhar, Sajid, Sana, Latif, Seemab, Latif, Rabia

arXiv.org Artificial Intelligence Sep-16-2024

In the modern economic landscape, integrating financial services with Financial Technology (FinTech) has become essential, particularly in stock trend analysis. This study addresses the gap in comprehending financial dynamics across diverse global economies by creating a structured financial dataset and proposing a cross-lingual Natural Language-based Financial Forecasting (NLFF) pipeline for comprehensive financial analysis. Utilizing sentiment analysis, Named Entity Recognition (NER), and semantic textual similarity, we conducted an analytical examination of news articles to extract, map, and visualize financial event timelines, uncovering the correlation between news events and stock market trends. Our method demonstrated a meaningful correlation between stock price movements and cross-linguistic news sentiments, validated by processing two-year cross-lingual news data on two prominent sectors of the Pakistan Stock Exchange. This study offers significant insights into key events, ensuring a substantial decision margin for investors through effective visualization and providing optimal investment opportunities.
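One simple way to quantify the kind of relationship described above is a Pearson correlation between a daily news sentiment series and a daily price change series. The abstract does not say which correlation measure the authors use, so Pearson is an assumption here, and the two short series below are made-up numbers for illustration, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up daily values: mean news sentiment and same-day price change (%).
sentiment = [0.2, -0.1, 0.4, -0.3, 0.1]
price_change = [0.5, -0.2, 0.9, -0.6, 0.1]
print(round(pearson(sentiment, price_change), 3))
```

A coefficient near 1 or -1 indicates a strong linear relationship; it does not by itself establish that news sentiment drives price movements.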

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2410.00024

Country:

North America > United States (0.14)
Asia > Pakistan > Islamabad Capital Territory > Islamabad (0.04)
Europe > Switzerland (0.04)
(7 more...)

Genre: Research Report > New Finding (0.93)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Comprehensive Study on Sentiment Analysis: From Rule-based to modern LLM based system

Gupta, Shailja, Ranjan, Rajesh, Singh, Surya Narayan

arXiv.org Artificial Intelligence Sep-16-2024

This paper provides a comprehensive survey of sentiment analysis within the context of artificial intelligence (AI) and large language models (LLMs). Sentiment analysis, a critical aspect of natural language processing (NLP), has evolved significantly from traditional rule-based methods to advanced deep learning techniques. This study examines the historical development of sentiment analysis, highlighting the transition from lexicon-based and pattern-based approaches to more sophisticated machine learning and deep learning models. Key challenges are discussed, including handling bilingual texts, detecting sarcasm, and addressing biases. The paper reviews state-of-the-art approaches, identifies emerging trends, and outlines future research directions to advance the field. By synthesizing current methodologies and exploring future opportunities, this survey aims to provide a thorough understanding of sentiment analysis in the AI and LLM context.

expression, proceedings, sentiment analysis, (11 more...)

arXiv.org Artificial Intelligence

2409.09989

Country:

North America > United States > New Jersey (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India > Telangana > Hyderabad (0.04)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.48)
Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)