AITopics | Information Extraction

Collaborating Authors

Information Extraction

News Overviews Instructional Materials AI-Alerts Classics

Multimodal Sentiment Analysis Based on Causal Reasoning

Chen, Fuhai, Huang, Pengpeng, Ge, Xuri, Huang, Jie, Bao, Zishuo

arXiv.org Artificial IntelligenceDec-10-2024

With the rapid development of multimedia, the shift from unimodal textual sentiment analysis to multimodal image-text sentiment analysis has obtained academic and industrial attention in recent years. However, multimodal sentiment analysis is affected by unimodal data bias, e.g., text sentiment is misleading due to explicit sentiment semantic, leading to low accuracy in the final sentiment classification. In this paper, we propose a novel C ounterFactual Multimodal Sentiment A nalysis framework (CF-MSA) using causal counterfactual inference to construct multimodal sentiment causal inference. CF-MSA mitigates the direct effect from unimodal bias and ensures heterogeneity across modalities by differentiating the treatment variables between modalities. In addition, considering the information complementarity and bias differences between modalities, we propose a new optimisation objective to effectively integrate different modalities and reduce the inherent bias from each modality. Experimental results on two public datasets, MVSA-Single and MVSA-Multiple, demonstrate that the proposed CF-MSA has superior debiasing capability and achieves new state-of-the-art performances. We will release the code and datasets to facilitate future research. Sentiment analysis has been the fundamental research in the field of artificial intelligence.

multimodal sentiment analysis, prediction, sentiment analysis, (17 more...)

arXiv.org Artificial Intelligence

2412.07292

Country:

North America > Canada > Quebec > Montreal (0.04)
Asia > Singapore > Central Region > Singapore (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Bridging the Gap for Test-Time Multimodal Sentiment Analysis

Guo, Zirun, Jin, Tao, Xu, Wenlong, Lin, Wang, Wu, Yangyang

arXiv.org Artificial IntelligenceDec-9-2024

Multimodal sentiment analysis (MSA) is an emerging research topic that aims to understand and recognize human sentiment or emotions through multiple modalities. However, in real-world dynamic scenarios, the distribution of target data is always changing and different from the source data used to train the model, which leads to performance degradation. Common adaptation methods usually need source data, which could pose privacy issues or storage overheads. Therefore, test-time adaptation (TTA) methods are introduced to improve the performance of the model at inference time. Existing TTA methods are always based on probabilistic models and unimodal learning, and thus can not be applied to MSA which is often considered as a multimodal regression task. In this paper, we propose two strategies: Contrastive Adaptation and Stable Pseudo-label generation (CASP) for test-time adaptation for multimodal sentiment analysis. The two strategies deal with the distribution shifts for MSA by enforcing consistency and minimizing empirical risk, respectively. Extensive experiments show that CASP brings significant and consistent improvements to the performance of the model across various distribution shift settings and with different backbones, demonstrating its effectiveness and versatility. Our codes are available at https://github.com/zrguo/CASP.

artificial intelligence, natural language, pseudo label, (16 more...)

arXiv.org Artificial Intelligence

2412.07121

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.83)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.83)

Add feedback

Analysing Public Transport User Sentiment on Low Resource Multilingual Data

Myoya, Rozina L., Marivate, Vukosi, Abdulmumin, Idris

arXiv.org Artificial IntelligenceDec-9-2024

Public transport systems in many Sub-Saharan countries often receive less attention compared to other sectors, underscoring the need for innovative solutions to improve the Quality of Service (QoS) and overall user experience. This study explored commuter opinion mining to understand sentiments toward existing public transport systems in Kenya, Tanzania, and South Africa. We used a qualitative research design, analysing data from X (formerly Twitter) to assess sentiments across rail, mini-bus taxis, and buses. By leveraging Multilingual Opinion Mining techniques, we addressed the linguistic diversity and code-switching present in our dataset, thus demonstrating the application of Natural Language Processing (NLP) in extracting insights from under-resourced languages. We employed PLMs such as AfriBERTa, AfroXLMR, AfroLM, and PuoBERTa to conduct the sentiment analysis. The results revealed predominantly negative sentiments in South Africa and Kenya, while the Tanzanian dataset showed mainly positive sentiments due to the advertising nature of the tweets. Furthermore, feature extraction using the Word2Vec model and K-Means clustering illuminated semantic relationships and primary themes found within the different datasets. By prioritising the analysis of user experiences and sentiments, this research paves the way for developing more responsive, user-centered public transport systems in Sub-Saharan countries, contributing to the broader goal of improving urban mobility and sustainability.

artificial intelligence, dataset, natural language, (16 more...)

arXiv.org Artificial Intelligence

2412.06951

Country:

Africa > Tanzania > Dar es Salaam Region > Dar es Salaam (0.05)
Africa > South Africa > Gauteng > Pretoria (0.05)
Africa > Kenya > Nairobi City County > Nairobi (0.05)
(5 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Rail (1.00)
Transportation > Ground > Road (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)

Add feedback

A Cross-Validation Study of Turkish Sentiment Analysis Datasets and Tools

Çakıcı, Şevval, Karaduman, Dilara, Çırlan, Mehmet Akif, Hürriyetoğlu, Ali

arXiv.org Artificial IntelligenceDec-8-2024

In recent years, sentiment analysis has gained increasing significance, prompting researchers to explore datasets in various languages, including Turkish. However, the limited availability of Turkish datasets has led to their multifaceted usage in different studies, yielding diverse outcomes. To overcome this challenge, a rigorous review was conducted of research articles published between 2012 and 2022. 31 studies were listed, and 23 Turkish datasets obtained from publicly available sources and email requests used in these studies were collected. We labeled these 31 studies using a taxonomy. We provide a map of sentiment analysis datasets according to this taxonomy in Turkish over 10 years. Moreover, we run state-of-the-art sentiment analysis tools on these datasets and analyzed performance across popular Turkish sentiment datasets. We observed that the performance of the sentiment analysis tools significantly depends on the characteristics of the target text. Our study fosters a more nuanced understanding of sentiment analysis in the Turkish language.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2412.05964

Country:

Europe > United Kingdom > England > Staffordshire (0.04)
Asia > Middle East > Republic of Türkiye > Tokat Province > Tokat (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Services (1.00)
Health & Medicine (0.70)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

BERTCaps: BERT Capsule for Persian Multi-Domain Sentiment Analysis

Memari, Mohammadali, Nejad, Soghra Mikaeyl, Rabiei, Amir Parsa, Eisaei, Mehrshad, Hesaraki, Saba

arXiv.org Artificial IntelligenceDec-7-2024

Multidomain sentiment analysis involves estimating the polarity of an unstructured text by exploiting domain specific information. One of the main issues common to the approaches discussed in the literature is their poor applicability to domains that differ from those used to construct opinion models.This paper aims to present a new method for Persian multidomain SA analysis using deep learning approaches. The proposed BERTCapsules approach consists of a combination of BERT and Capsule models. In this approach, BERT was used for Instance representation, and Capsule Structure was used to learn the extracted graphs. Digikala dataset, including ten domains with both positive and negative polarity, was used to evaluate this approach. The evaluation of the BERTCaps model achieved an accuracy of 0.9712 in sentiment classification binary classification and 0.8509 in domain classification .

machine learning, natural language, sentiment analysis, (19 more...)

arXiv.org Artificial Intelligence

2412.05591

Country:

North America > United States > Indiana > Boone County > Lebanon (0.04)
Europe > Estonia > Tartu County > Tartu (0.04)
Asia > Middle East > Lebanon (0.04)

Genre:

Research Report > Experimental Study (0.46)
Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multimodal Fusion Balancing Through Game-Theoretic Regularization

Kontras, Konstantinos, Strypsteen, Thomas, Chatzichristos, Christos, Liang, Paul Pu, Blaschko, Matthew, De Vos, Maarten

arXiv.org Artificial IntelligenceDec-7-2024

Multimodal learning can complete the picture of information extraction by uncovering key dependencies between data sources. However, current systems fail to fully leverage multiple modalities for optimal performance. This has been attributed to modality competition, where modalities strive for training resources, leaving some underoptimized. We show that current balancing methods struggle to train multimodal models that surpass even simple baselines, such as ensembles. This raises the question: how can we ensure that all modalities in multimodal training are sufficiently trained, and that learning from new modalities consistently improves performance? This paper proposes the Multimodal Competition Regularizer (MCR), a new loss component inspired by mutual information (MI) decomposition designed to prevent the adverse effects of competition in multimodal training. Our key contributions are: 1) Introducing game-theoretic principles in multimodal learning, where each modality acts as a player competing to maximize its influence on the final outcome, enabling automatic balancing of the MI terms. 2) Refining lower and upper bounds for each MI term to enhance the extraction of task-relevant unique and shared information across modalities. 3) Suggesting latent space permutations for conditional MI estimation, significantly improving computational efficiency. MCR outperforms all previously suggested training strategies and is the first to consistently improve multimodal learning beyond the ensemble baseline, clearly demonstrating that combining modalities leads to significant performance gains on both synthetic and large real-world datasets.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2411.07335

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.47)
Health & Medicine (0.34)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Add feedback

Uniform Discretized Integrated Gradients: An effective attribution based method for explaining large language models

Roy, Swarnava Sinha, Kundu, Ayan

arXiv.org Artificial IntelligenceDec-5-2024

Integrated Gradients is a well-known technique for explaining deep learning models. It calculates feature importance scores by employing a gradient based approach computing gradients of the model output with respect to input features and accumulating them along a linear path. While this works well for continuous features spaces, it may not be the most optimal way to deal with discrete spaces like word embeddings. For interpreting LLMs (Large Language Models), there exists a need for a non-linear path where intermediate points, whose gradients are to be computed, lie close to actual words in the embedding space. In this paper, we propose a method called Uniform Discretized Integrated Gradients (UDIG) based on a new interpolation strategy where we choose a favorable nonlinear path for computing attribution scores suitable for predictive language models. We evaluate our method on two types of NLP tasks- Sentiment Classification and Question Answering against three metrics viz Log odds, Comprehensiveness and Sufficiency. For sentiment classification, we have used the SST2, IMDb and Rotten Tomatoes datasets for benchmarking and for Question Answering, we have used the fine-tuned BERT model on SQuAD dataset. Our approach outperforms the existing methods in almost all the metrics.

attribution score, integrated gradient, udig, (13 more...)

arXiv.org Artificial Intelligence

2412.03886

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.56)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Add feedback

Multimodal Sentiment Analysis Based on BERT and ResNet

Ren, JiaLe

arXiv.org Artificial IntelligenceDec-4-2024

With the rapid development of the Internet and social media, multi-modal data (text and image) is increasingly important in sentiment analysis tasks. However, the existing methods are difficult to effectively fuse text and image features, which limits the accuracy of analysis. To solve this problem, a multimodal sentiment analysis framework combining BERT and ResNet was proposed. BERT has shown strong text representation ability in natural language processing, and ResNet has excellent image feature extraction performance in the field of computer vision. Firstly, BERT is used to extract the text feature vector, and ResNet is used to extract the image feature representation. Then, a variety of feature fusion strategies are explored, and finally the fusion model based on attention mechanism is selected to make full use of the complementary information between text and image. Experimental results on the public dataset MAVA-single show that compared with the single-modal models that only use BERT or ResNet, the proposed multi-modal model improves the accuracy and F1 score, reaching the best accuracy of 74.5%. This study not only provides new ideas and methods for multimodal sentiment analysis, but also demonstrates the application potential of BERT and ResNet in cross-domain fusion. In the future, more advanced feature fusion techniques and optimization strategies will be explored to further improve the accuracy and generalization ability of multimodal sentiment analysis.

machine learning, natural language, sentiment analysis, (20 more...)

arXiv.org Artificial Intelligence

2412.03625

Country: Asia > China > Yunnan Province > Kunming (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

YT-30M: A multi-lingual multi-category dataset of YouTube comments

Dutta, Hridoy Sankar

arXiv.org Artificial IntelligenceDec-4-2024

This paper introduces two large-scale multilingual comment datasets, YT-30M (and YT-100K) from YouTube. The analysis in this paper is performed on a smaller sample (YT-100K) of YT-30M. Both the datasets: YT-30M (full) and YT-100K (randomly selected 100K sample from YT-30M) are publicly released for further research. YT-30M (YT-100K) contains 32236173 (108694) comments posted by YouTube channel that belong to YouTube categories. Each comment is associated with a video ID, comment ID, commentor name, commentor channel ID, comment text, upvotes, original channel ID and category of the YouTube channel (e.g., 'News & Politics', 'Science & Technology', etc.).

category, dataset, yt-30m, (13 more...)

arXiv.org Artificial Intelligence

2412.03465

Genre: Research Report (0.40)

Industry: Information Technology > Services (0.31)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.48)

Add feedback

Data Uncertainty-Aware Learning for Multimodal Aspect-based Sentiment Analysis

Yang, Hao, Zhang, Zhenyu, Zhao, Yanyan, Qin, Bing

arXiv.org Artificial IntelligenceDec-2-2024

As a fine-grained task, multimodal aspect-based sentiment analysis (MABSA) mainly focuses on identifying aspect-level sentiment information in the text-image pair. However, we observe that it is difficult to recognize the sentiment of aspects in low-quality samples, such as those with low-resolution images that tend to contain noise. And in the real world, the quality of data usually varies for different samples, such noise is called data uncertainty. But previous works for the MABSA task treat different quality samples with the same importance and ignored the influence of data uncertainty. In this paper, we propose a novel data uncertainty-aware multimodal aspect-based sentiment analysis approach, UA-MABSA, which weighted the loss of different samples by the data quality and difficulty. UA-MABSA adopts a novel quality assessment strategy that takes into account both the image quality and the aspect-based cross-modal relevance, thus enabling the model to pay more attention to high-quality and challenging samples. Extensive experiments show that our method achieves state-of-the-art (SOTA) performance on the Twitter-2015 dataset. Further analysis demonstrates the effectiveness of the quality assessment strategy.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.01249

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.96)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.96)
(2 more...)

Add feedback