HiFACTMix: A Code-Mixed Benchmark and Graph-Aware Model for Evidence-Based Political Claim Verification in Hinglish

Thakur, Rakesh, Sharma, Sneha, Chopra, Gauri

arXiv.org Artificial Intelligence 

Fact-checking in code-mixed, low-resource languages such as Hinglish remains an underexplored challenge in natural language processing. Existing fact-verification systems largely focus on high-resource, monolingual settings and fail to generalize to real-world political discourse in linguistically diverse regions like India. Given the widespread use of Hinglish by public figures, particularly political figures, and the growing influence of social media on public opinion, there is a critical need for robust, multilingual, and context-aware fact-checking tools. To address this gap, a novel benchmark dataset, HiFACT, is introduced with 1,500 real-world factual claims made by 28 Indian state Chief Ministers in Hinglish, under a highly code-mixed, low-resource setting. Each claim is annotated with textual evidence and veracity labels. To evaluate this benchmark, a novel graph-aware, retrieval-augmented fact-checking model is proposed that combines multilingual contextual encoding, claim-evidence semantic alignment, evidence graph construction, graph neural reasoning, and natural language explanation generation. Experimental results show that HiFACTMix achieves higher accuracy than state-of-the-art multilingual baseline models and provides faithful justifications for its verdicts. This work opens a new direction for multilingual, code-mixed, and politically grounded fact verification research.
