AITopics | Information Extraction

Information Extraction

MIDG: Mixture of Invariant Experts with knowledge injection for Domain Generalization in Multimodal Sentiment Analysis

arXiv.org Artificial Intelligence, Dec-9-2025

Existing methods in domain generalization for Multimodal Sentiment Analysis (MSA) often overlook inter-modal synergies during invariant feature extraction, which prevents the accurate capture of the rich semantic information within multimodal data. Additionally, while knowledge injection techniques have been explored in MSA, they often suffer from fragmented cross-modal knowledge, overlooking specific representations that exist beyond the confines of a single modality. To address these limitations, we propose a novel MSA framework designed for domain generalization. Firstly, the framework incorporates a Mixture of Invariant Experts model to extract domain-invariant features, thereby enhancing the model's capacity to learn synergistic relationships between modalities. Secondly, we design a Cross-Modal Adapter to augment the semantic richness of multimodal representations through cross-modal knowledge injection. Extensive domain experiments conducted on three datasets demonstrate that the proposed MIDG achieves superior performance.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2512.0743

Country:

North America (0.28)
Oceania > Australia (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.74)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

CMV-Fuse: Cross Modal-View Fusion of AMR, Syntax, and Knowledge Representations for Aspect Based Sentiment Analysis

Sudheendra, Smitha Muthya, Cherukuri, Mani Deep, Srivastava, Jaideep

arXiv.org Artificial Intelligence, Dec-9-2025

Natural language understanding inherently depends on integrating multiple complementary perspectives spanning from surface syntax to deep semantics and world knowledge. However, current Aspect-Based Sentiment Analysis (ABSA) systems typically exploit isolated linguistic views, thereby overlooking the intricate interplay between structural representations that humans naturally leverage. We propose CMV-Fuse, a Cross-Modal View fusion framework that emulates human language processing by systematically combining multiple linguistic perspectives. Our approach orchestrates four linguistic perspectives: Abstract Meaning Representations, constituency parsing, dependency syntax, and semantic attention, enhanced with external knowledge integration. Through hierarchical gated attention fusion across local syntactic, intermediate semantic, and global knowledge levels, CMV-Fuse captures both fine-grained structural patterns and broad contextual understanding. A novel structure-aware multi-view contrastive learning mechanism ensures consistency across complementary representations while maintaining computational efficiency. Extensive experiments demonstrate substantial improvements over strong baselines on standard benchmarks, with analysis revealing how each linguistic view contributes to more robust sentiment analysis.

artificial intelligence, natural language, text processing, (16 more...)

arXiv.org Artificial Intelligence

2512.06679

Country:

Asia (0.68)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)

DashFusion: Dual-stream Alignment with Hierarchical Bottleneck Fusion for Multimodal Sentiment Analysis

Wen, Yuhua, Li, Qifei, Zhou, Yingying, Gao, Yingming, Wen, Zhengqi, Tao, Jianhua, Li, Ya

arXiv.org Artificial Intelligence, Dec-8-2025

Multimodal sentiment analysis (MSA) integrates various modalities, such as text, image, and audio, to provide a more comprehensive understanding of sentiment. However, effective MSA is challenged by alignment and fusion issues. Alignment requires synchronizing both temporal and semantic information across modalities, while fusion involves integrating these aligned features into a unified representation. Existing methods often address alignment or fusion in isolation, leading to limitations in performance and efficiency. To tackle these issues, we propose a novel framework called Dual-stream Alignment with Hierarchical Bottleneck Fusion (DashFusion). Firstly, a dual-stream alignment module synchronizes multimodal features through temporal and semantic alignment. Temporal alignment employs cross-modal attention to establish frame-level correspondences among multimodal sequences. Semantic alignment ensures consistency across the feature space through contrastive learning. Secondly, supervised contrastive learning leverages label information to refine the modality features. Finally, hierarchical bottleneck fusion progressively integrates multimodal information through compressed bottleneck tokens, which achieves a balance between performance and computational efficiency. We evaluate DashFusion on three datasets: CMU-MOSI, CMU-MOSEI, and CH-SIMS. Experimental results demonstrate that DashFusion achieves state-of-the-art performance across various metrics, and ablation studies confirm the effectiveness of our alignment and fusion techniques. The code for our experiments is available at https://github.com/ultramarineX/DashFusion.
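
The temporal alignment step described above relies on cross-modal attention. The following is a minimal sketch of generic scaled dot-product cross-attention in plain Python, not the DashFusion implementation; the function names, the toy feature values, and the choice to use the audio frames as both keys and values are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Each query frame (one modality) attends over all frames of another modality.

    Returns the attended features and the attention weights, which can be
    read as soft frame-level correspondences between the two sequences.
    """
    dim = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Scaled dot product between this query and every key frame.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value frames.
        out = [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy example (hypothetical features): 2 text tokens attend over 3 audio frames.
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aligned, attn = cross_attention(text, audio, audio)
```

Each row of `attn` sums to one, so the output sequence has the length of the query modality while drawing its content from the other modality.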

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.05515

Country: Asia > China (0.15)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.88)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.73)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)
(2 more...)

Public Sentiment Analysis of Traffic Management Policies in Knoxville: A Social Media Driven Study

Saha, Shampa, Roy, Shovan

arXiv.org Artificial Intelligence, Dec-8-2025

This study presents a comprehensive analysis of public sentiment toward traffic management policies in Knoxville, Tennessee, utilizing social media data from Twitter and Reddit platforms. We collected and analyzed 7906 posts spanning January 2022 to December 2023, employing Valence Aware Dictionary and sEntiment Reasoner (VADER) for sentiment analysis and Latent Dirichlet Allocation (LDA) for topic modeling. Our findings reveal predominantly negative sentiment, with significant variations across platforms and topics. Twitter exhibited more negative sentiment compared to Reddit. Topic modeling identified six distinct themes, with construction-related topics showing the most negative sentiment while general traffic discussions were more positive. Spatiotemporal analysis revealed geographic and temporal patterns in sentiment expression. The research demonstrates social media's potential as a real-time public sentiment monitoring tool for transportation planning and policy evaluation.
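
For readers unfamiliar with VADER, the sketch below shows how its compound score is normalized and how a score is commonly mapped to a label. The normalization (a summed valence divided by the square root of its square plus a constant of 15) and the cutoffs of plus or minus 0.05 follow the conventions of the VADER reference implementation; the abstract does not state which thresholds the study used, so the `label` function and the example valence sum are assumptions.

```python
import math


def vader_compound(valence_sum, alpha=15.0):
    """Map an unbounded summed valence to the open interval (-1, 1)."""
    return valence_sum / math.sqrt(valence_sum * valence_sum + alpha)


def label(compound, threshold=0.05):
    """Conventional three-way labelling; the study's own cutoffs are not given."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical post whose lexicon valences sum to -3.0.
score = vader_compound(-3.0)
sentiment = label(score)
```

In practice the summed valence comes from the VADER lexicon and its heuristics for negation, punctuation, and capitalization, which this sketch leaves out.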

artificial intelligence, natural language, sentiment, (15 more...)

arXiv.org Artificial Intelligence

2512.03103

Country: North America > United States > Tennessee > Knox County > Knoxville (0.34)

Genre: Research Report > New Finding (0.67)

Industry:

Government (1.00)
Transportation > Infrastructure & Services (0.95)
Transportation > Ground > Road (0.94)
Law > Statutes (0.69)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Multi-Modal Opinion Integration for Financial Sentiment Analysis using Cross-Modal Attention

Liu, Yujing, Yang, Chen

arXiv.org Artificial Intelligence, Dec-4-2025

In recent years, financial sentiment analysis of public opinion has become increasingly important for market forecasting and risk assessment. However, existing methods often struggle to effectively integrate diverse opinion modalities and capture fine-grained interactions across them. This paper proposes an end-to-end deep learning framework that integrates two distinct modalities of financial opinions: recency modality (timely opinions) and popularity modality (trending opinions), through a novel cross-modal attention mechanism specifically designed for financial sentiment analysis. While both modalities consist of textual data, they represent fundamentally different information channels: recency-driven market updates versus popularity-driven collective sentiment. Our model first uses BERT (Chinese-wwm-ext) for feature embedding and then employs our proposed Financial Multi-Head Cross-Attention (FMHCA) structure to facilitate information exchange between these distinct opinion modalities. The processed features are optimized through a transformer layer and fused using multimodal factored bilinear pooling for classification into negative, neutral, and positive sentiment. Extensive experiments on a comprehensive dataset covering 837 companies demonstrate that our approach achieves an accuracy of 83.5%, significantly outperforming baselines including BERT+Transformer by 21 percent. These results highlight the potential of our framework to support more accurate financial decision-making and risk management.
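
The fusion step named above, multimodal factored bilinear pooling, can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's code: the projection matrices `U` and `V`, the factor count `k`, and the toy vectors are hypothetical, and the power and L2 normalization steps follow the common formulation of the technique rather than anything stated in the abstract.

```python
import math


def project(W, x):
    """Compute W^T x, where W has len(x) rows and d columns."""
    return [sum(W[i][j] * x[i] for i in range(len(x))) for j in range(len(W[0]))]


def factored_bilinear_pooling(x, y, U, V, k):
    """Fuse two feature vectors with factored bilinear pooling.

    Both inputs are projected to k * o dimensions, multiplied elementwise,
    sum-pooled over non-overlapping windows of size k, then power- and
    L2-normalized.
    """
    px, py = project(U, x), project(V, y)
    joint = [a * b for a, b in zip(px, py)]
    pooled = [sum(joint[i:i + k]) for i in range(0, len(joint), k)]
    powered = [math.copysign(math.sqrt(abs(z)), z) for z in pooled]
    norm = math.sqrt(sum(z * z for z in powered)) or 1.0
    return [z / norm for z in powered]


# Toy inputs (hypothetical): a "recency" vector and a "popularity" vector.
recency = [1.0, 2.0]
popularity = [3.0, 1.0]
U = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
V = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
fused = factored_bilinear_pooling(recency, popularity, U, V, k=2)
```

In a trained model `U` and `V` are learned parameters, and the fused vector would feed the three-way sentiment classifier.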

machine learning, natural language, sentiment analysis, (17 more...)

arXiv.org Artificial Intelligence

2512.03464

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.98)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.98)

Robust Multimodal Sentiment Analysis of Image-Text Pairs by Distribution-Based Feature Recovery and Fusion

Wu, Daiqing, Yang, Dongbao, Zhou, Yu, Ma, Can

arXiv.org Artificial Intelligence, Dec-4-2025

As posts on social media increase rapidly, analyzing the sentiments embedded in image-text pairs has become a popular research topic in recent years. Although existing works achieve impressive results in simultaneously harnessing image and text information, they do not account for possible low-quality and missing modalities. In real-world applications, these issues might frequently occur, leading to an urgent need for models capable of predicting sentiment robustly. Therefore, we propose a Distribution-based feature Recovery and Fusion (DRF) method for robust multimodal sentiment analysis of image-text pairs. Specifically, we maintain a feature queue for each modality to approximate its feature distribution, through which we can simultaneously handle low-quality and missing modalities in a unified framework. For low-quality modalities, we reduce their contributions to the fusion by quantitatively estimating modality qualities based on the distributions. For missing modalities, we build inter-modal mapping relationships supervised by samples and distributions, thereby recovering the missing modalities from available ones. In experiments, two disruption strategies that corrupt and discard some modalities in samples are adopted to mimic the low-quality and missing modalities in various real-world scenarios. Through comprehensive experiments on three publicly available image-text datasets, we demonstrate the universal improvements of DRF compared to SOTA methods under both strategies, validating its effectiveness in robust multimodal sentiment analysis.
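
The idea of a per-modality feature queue that approximates a distribution and down-weights low-quality inputs can be sketched as follows. The abstract does not say how quality is estimated, so the diagonal-Gaussian score below, the class name `FeatureQueue`, the helper `fusion_weights`, and all feature values are assumptions for illustration only, not the DRF method itself.

```python
import math
from collections import deque


class FeatureQueue:
    """Fixed-size FIFO of recent features for one modality (hypothetical helper)."""

    def __init__(self, maxlen=1024):
        self.items = deque(maxlen=maxlen)  # oldest features drop out automatically

    def push(self, feature):
        self.items.append(list(feature))

    def mean(self):
        n, dim = len(self.items), len(self.items[0])
        return [sum(f[i] for f in self.items) / n for i in range(dim)]

    def std(self, eps=1e-6):
        mu, n = self.mean(), len(self.items)
        return [math.sqrt(sum((f[i] - mu[i]) ** 2 for f in self.items) / n) + eps
                for i in range(len(mu))]

    def quality(self, feature):
        """Assumed score in (0, 1]: 1 at the queue mean, decaying with distance."""
        mu, sd = self.mean(), self.std()
        z2 = sum(((x - m) / s) ** 2 for x, m, s in zip(feature, mu, sd)) / len(mu)
        return math.exp(-0.5 * z2)


def fusion_weights(qualities):
    """Normalize quality scores so that they sum to one."""
    total = sum(qualities)
    if total == 0:
        return [1.0 / len(qualities)] * len(qualities)
    return [q / total for q in qualities]


# Toy usage: a clean text feature and a corrupted image feature.
text_q, image_q = FeatureQueue(), FeatureQueue()
for f in ([0.0, 0.0], [2.0, 2.0]):
    text_q.push(f)
    image_q.push(f)
weights = fusion_weights([text_q.quality([1.0, 1.0]), image_q.quality([5.0, 5.0])])
```

With these toy values the corrupted image feature sits far from its queue's mean, so nearly all of the fusion weight goes to the text feature.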

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3664647.3680653

2511.18751

Country:

Europe (1.00)
Asia (1.00)
North America > United States > New York (0.28)
North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

TriLex: A Framework for Multilingual Sentiment Analysis in Low-Resource South African Languages

Nkongolo, Mike, Vorster, Hilton, Warren, Josh, Naick, Trevor, Vanmali, Deandre, Mashapha, Masana, Brand, Luke, Fernandes, Alyssa, Calitz, Janco, Makhoba, Sibusiso

arXiv.org Artificial Intelligence, Dec-3-2025

Low-resource African languages remain underrepresented in sentiment analysis research, resulting in limited lexical resources and reduced model performance in multilingual applications. This gap restricts equitable access to Natural Language Processing (NLP) technologies and hinders downstream tasks such as public-health monitoring, digital governance, and financial inclusion. To address this challenge, this paper introduces TriLex, a three-stage retrieval-augmented framework that integrates corpus-based extraction, cross-lingual mapping, and Retrieval-Augmented Generation (RAG) driven lexicon refinement for scalable sentiment lexicon expansion in low-resource languages. Using an expanded lexicon, we evaluate two leading African language models (AfroXLMR and AfriBERTa) across multiple case studies. Results show that AfroXLMR consistently achieves the strongest performance, with F1-scores exceeding 80% for isiXhosa and isiZulu, aligning with previously reported ranges (71-75%), and demonstrating high multilingual stability with narrow confidence intervals. AfriBERTa, despite lacking pre-training on the target languages, attains moderate but reliable F1-scores around 64%, confirming its effectiveness under constrained computational settings. Comparative analysis shows that both models outperform traditional machine learning baselines, while ensemble evaluation combining AfroXLMR variants indicates complementary improvements in precision and overall stability. These findings confirm that the TriLex framework, together with AfroXLMR and AfriBERTa, provides a robust and scalable approach for sentiment lexicon development and multilingual sentiment analysis in low-resource South African languages.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2512.02799

Country: Africa (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Public Health (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

PSA-MF: Personality-Sentiment Aligned Multi-Level Fusion for Multimodal Sentiment Analysis

Xie, Heng, Zhu, Kang, Wen, Zhengqi, Tao, Jianhua, Liu, Xuefei, Fu, Ruibo, Li, Changsheng

arXiv.org Artificial Intelligence, Dec-2-2025

Multimodal sentiment analysis (MSA) is a research field that recognizes human sentiments by combining textual, visual, and audio modalities. The main challenge lies in integrating sentiment-related information from different modalities, which typically arises during the unimodal feature extraction phase and the multimodal feature fusion phase. Existing methods extract only shallow information from unimodal features during the extraction phase, neglecting sentimental differences across different personalities. During the fusion phase, they directly merge the feature information from each modality without considering differences at the feature level. This ultimately affects the model's recognition performance. To address this problem, we propose a personality-sentiment aligned multi-level fusion framework. We introduce personality traits during the feature extraction phase and propose a novel personality-sentiment alignment method to obtain personalized sentiment embeddings from the textual modality for the first time. In the fusion phase, we introduce a novel multi-level fusion method. This method gradually integrates sentimental information from textual, visual, and audio modalities through multimodal pre-fusion and a multi-level enhanced fusion strategy. Our method has been evaluated through multiple experiments on two commonly used datasets, achieving state-of-the-art results.

artificial intelligence, data mining, natural language, (17 more...)

arXiv.org Artificial Intelligence

2512.01442

Country: Asia > China (0.48)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.75)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.75)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.67)
(2 more...)

DyFuLM: An Advanced Multimodal Framework for Sentiment Analysis

Zhou, Ruohan, Yuan, Jiachen, Yang, Churui, Huang, Wenzheng, Zhang, Guoyan, Wei, Shiyao, Hu, Jiazhen, Xin, Ning, Hasan, Md Maruf

arXiv.org Artificial Intelligence, Dec-2-2025

Understanding sentiment in complex textual expressions remains a fundamental challenge in affective computing. To address this, we propose a Dynamic Fusion Learning Model (DyFuLM), a multimodal framework designed to capture both hierarchical semantic representations and fine-grained emotional nuances. DyFuLM introduces two key modules: a Hierarchical Dynamic Fusion module that adaptively integrates multi-level features, and a Gated Feature Aggregation module that regulates cross-layer information flow to achieve balanced representation learning. Comprehensive experiments on multi-task sentiment datasets demonstrate that DyFuLM achieves 82.64% coarse-grained and 68.48% fine-grained accuracy, yielding the lowest regression errors (MAE = 0.0674, MSE = 0.0082) and the highest coefficient of determination (R^2 = 0.6903). Furthermore, the ablation study validates the effectiveness of each module in DyFuLM. When all modules are removed, the accuracy drops by 0.91% for coarse-grained and 0.68% for fine-grained tasks. Keeping only the gated fusion module causes decreases of 0.75% and 0.55%, while removing the dynamic loss mechanism results in drops of 0.78% and 0.26% for coarse-grained and fine-grained sentiment classification, respectively. These results demonstrate that each module contributes significantly to feature interaction and task balance. Overall, the experimental findings further validate that DyFuLM enhances sentiment representation and overall performance through effective hierarchical feature fusion.
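
The three regression metrics quoted above (MAE, MSE, and R^2) have standard definitions, sketched here in plain Python. The function name and the toy labels and predictions are illustrative assumptions and have no connection to the paper's data.

```python
def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, R^2) for paired lists of targets and predictions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)     # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot              # coefficient of determination
    return mae, mse, r2


# Hypothetical sentiment intensities in [0, 1].
y_true = [0.0, 0.5, 1.0]
y_pred = [0.1, 0.5, 0.8]
mae, mse, r2 = regression_metrics(y_true, y_pred)
```

Lower MAE and MSE indicate smaller errors, while R^2 closer to 1 means the predictions explain more of the variance in the targets. R^2 is undefined when all targets are identical, a case this sketch does not guard against.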

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2512.0141

Country: Asia > China (0.48)

Genre: Research Report > New Finding (0.88)

Industry: Consumer Products & Services (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

MARSAD: A Multi-Functional Tool for Real-Time Social Media Analysis

Biswas, Md. Rafiul, Alam, Firoj, Zaghouani, Wajdi

arXiv.org Artificial Intelligence, Dec-2-2025

MARSAD is a multifunctional natural language processing (NLP) platform designed for real-time social media monitoring and analysis, with a particular focus on the Arabic-speaking world. It enables researchers and non-technical users alike to examine both live and archived social media content, producing detailed visualizations and reports across various dimensions, including sentiment analysis, emotion analysis, propaganda detection, fact-checking, and hate speech detection. The platform also provides secure data-scraping capabilities through API keys for accessing public social media data. MARSAD's backend architecture integrates flexible document storage with structured data management, ensuring efficient processing of large and multimodal datasets. Its user-friendly frontend supports seamless data upload and interaction.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2512.01369

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
