Mishra, Rahul
HalluCounter: Reference-free LLM Hallucination Detection in the Wild!
Urlana, Ashok, Kanumolu, Gopichand, Kumar, Charaka Vinayak, Garlapati, Bala Mallikarjunarao, Mishra, Rahul
Response consistency-based, reference-free hallucination detection (RFHD) methods do not depend on internal model states, such as generation probabilities or gradients, which Grey-box models typically rely on but are inaccessible in closed-source LLMs. However, their inability to capture query-response alignment patterns often results in lower detection accuracy. Additionally, the lack of large-scale benchmark datasets spanning diverse domains remains a challenge, as most existing datasets are limited in size and scope. To this end, we propose HalluCounter, a novel reference-free hallucination detection method that utilizes both response-response and query-response consistency and alignment patterns. This enables the training of a classifier that detects hallucinations and provides a confidence score and an optimal response for user queries. Furthermore, we introduce HalluCounterEval, a benchmark dataset comprising both synthetically generated and human-curated samples across multiple domains. Our method outperforms state-of-the-art approaches by a significant margin, achieving over 90\% average confidence in hallucination detection across datasets.
SciClaimHunt: A Large Dataset for Evidence-based Scientific Claim Verification
Kumar, Sujit, Sharma, Anshul, Khincha, Siddharth Hemant, Shroff, Gargi, Singh, Sanasam Ranbir, Mishra, Rahul
Verifying scientific claims presents a significantly greater challenge than verifying political or news-related claims. Unlike the relatively broad audience for political claims, the users of scientific claim verification systems can vary widely, ranging from researchers testing specific hypotheses to everyday users seeking information on a medication. Additionally, the evidence for scientific claims is often highly complex, involving technical terminology and intricate domain-specific concepts that require specialized models for accurate verification. Despite considerable interest from the research community, there is a noticeable lack of large-scale scientific claim verification datasets to benchmark and train effective models. To bridge this gap, we introduce two large-scale datasets, SciClaimHunt and SciClaimHunt_Num, derived from scientific research papers. We propose several baseline models tailored for scientific claim verification to assess the effectiveness of these datasets. Additionally, we evaluate models trained on SciClaimHunt and SciClaimHunt_Num against existing scientific claim verification datasets to gauge their quality and reliability. Furthermore, we conduct human evaluations of the claims in proposed datasets and perform error analysis to assess the effectiveness of the proposed baseline models. Our findings indicate that SciClaimHunt and SciClaimHunt_Num serve as highly reliable resources for training models in scientific claim verification.
One Arrow, Many Targets: Probing LLMs for Multi-Attribute Controllable Text Summarization
Roy, Tathagato, Mishra, Rahul
Text summarization is a well-established task within the natural language processing (NLP) community. However, the focus on controllable summarization tailored to user requirements is gaining traction only recently. While several efforts explore controllability in text summarization, the investigation of Multi-Attribute Controllable Summarization (MACS) remains limited. This work addresses this gap by examining the MACS task through the lens of large language models (LLMs), using various learning paradigms, particularly low-rank adapters. We experiment with different popular adapter fine-tuning strategies to assess the effectiveness of the resulting models in retaining cues and patterns associated with multiple controllable attributes. Additionally, we propose and evaluate a novel hierarchical adapter fusion technique to integrate learnings from two distinct controllable attributes. Subsquently, we present our findings, discuss the challenges encountered, and suggest potential avenues for advancing the MACS task.
SceneGraMMi: Scene Graph-boosted Hybrid-fusion for Multi-Modal Misinformation Veracity Prediction
Joshi, Swarang, Mavani, Siddharth, Alex, Joel, Negi, Arnav, Mishra, Rahul, Kumaraguru, Ponnurangam
Misinformation undermines individual knowledge and affects broader societal narratives. Despite growing interest in the research community in multi-modal misinformation detection, existing methods exhibit limitations in capturing semantic cues, key regions, and cross-modal similarities within multi-modal datasets. We propose SceneGraMMi, a Scene Graph-boosted Hybrid-fusion approach for Multi-modal Misinformation veracity prediction, which integrates scene graphs across different modalities to improve detection performance. Experimental results across four benchmark datasets show that SceneGraMMi consistently outperforms state-of-the-art methods. In a comprehensive ablation study, we highlight the contribution of each component, while Shapley values are employed to examine the explainability of the model's decision-making process.
KTCR: Improving Implicit Hate Detection with Knowledge Transfer driven Concept Refinement
Garg, Samarth, Kavuri, Vivek Hruday, Shroff, Gargi, Mishra, Rahul
The constant shifts in social and political contexts, driven by emerging social movements and political events, lead to new forms of hate content and previously unrecognized hate patterns that machine learning models may not have captured. Some recent literature proposes the data augmentation-based techniques to enrich existing hate datasets by incorporating samples that reveal new implicit hate patterns. This approach aims to improve the model's performance on out-of-domain implicit hate instances. It is observed, that further addition of more samples for augmentation results in the decrease of the performance of the model. In this work, we propose a Knowledge Transfer-driven Concept Refinement method that distills and refines the concepts related to implicit hate samples through novel prototype alignment and concept losses, alongside data augmentation based on concept activation vectors. Experiments with several publicly available datasets show that incorporating additional implicit samples reflecting new hate patterns through concept refinement enhances the model's performance, surpassing baseline results while maintaining cross-dataset generalization capabilities.\footnote{DISCLAIMER: This paper contains explicit statements that are potentially offensive.}
DiscoGraMS: Enhancing Movie Screen-Play Summarization using Movie Character-Aware Discourse Graph
Chitale, Maitreya Prafulla, Bindal, Uday, Rajkumar, Rajakrishnan, Mishra, Rahul
Summarizing movie screenplays presents a unique set of challenges compared to standard document summarization. Screenplays are not only lengthy, but also feature a complex interplay of characters, dialogues, and scenes, with numerous direct and subtle relationships and contextual nuances that are difficult for machine learning models to accurately capture and comprehend. Recent attempts at screenplay summarization focus on fine-tuning transformer-based pre-trained models, but these models often fall short in capturing long-term dependencies and latent relationships, and frequently encounter the "lost in the middle" issue. To address these challenges, we introduce DiscoGraMS, a novel resource that represents movie scripts as a movie character-aware discourse graph (CaD Graph). This approach is well-suited for various downstream tasks, such as summarization, question-answering, and salience detection. The model aims to preserve all salient information, offering a more comprehensive and faithful representation of the screenplay's content. We further explore a baseline method that combines the CaD Graph with the corresponding movie script through a late fusion of graph and text modalities, and we present very initial promising results.
Utilizing Transfer Learning and pre-trained Models for Effective Forest Fire Detection: A Case Study of Uttarakhand
Gupta, Hari Prabhat, Mishra, Rahul
--Forest fires pose a significant threat to the environment, human life, and property. Early detection and response are crucial to mitigating the impact of these disasters. However, traditional forest fire detection methods are often hindered by our reliability on manual observation and satellite imagery with low spatial resolution. This paper emphasizes the role of transfer learning in enhancing forest fire detection in India, particularly in overcoming data collection challenges and improving model accuracy across various regions. We compare traditional learning methods with transfer learning, focusing on the unique challenges posed by regional differences in terrain, climate, and vegetation. Transfer learning can be categorized into several types based on the similarity between the source and target tasks, as well as the type of knowledge transferred. One key method is utilizing pre-trained models for efficient transfer learning, which significantly reduces the need for extensive labeled data. We outline the transfer learning process, demonstrating how researchers can adapt pre-trained models like MobileNetV2 for specific tasks such as forest fire detection. India is home to a vast and diverse range of forests, covering over 70 million hectares of land [1]. These forests are crucial not only for the country's ecosystem and biodiversity but also provide livelihoods for millions of people, particularly in rural areas. However, India's forests are facing a growing threat from forest fires, which can have devastating consequences for the environment, human life, and property [2]. Forest fires are a major concern in India, particularly during the summer months when temperatures are high and humidity is low. According to the Indian government, forest fires affect over 50, 000 hectares of land annually, causing significant economic losses and damage to the environment [3]. The country's forests are also home to a wide range of wildlife, including many endangered species which are threatened by fires. Figure 1 illustrates some images of the Uttarakhand, India, forest fire. Early detection and response are critical to mitigating the impact of forest fires. Traditional methods of forest fire detection, such as manual observation and satellite imagery with low spatial resolution, are often limited in their ability to detect fires quickly and accurately [4]. Manual observation is time-consuming and labour-intensive and may not be feasible in remote or inaccessible areas [5]. Satellite imagery with low spatial resolution may not be able to detect small fires or fires in areas with dense vegetation. In recent years, advances in deep learning and computer vision have enabled the development of more effective methods for forest fire detection. Convolutional neural networks (CNNs), in particular, have shown great promise in image classification tasks [6]-[10], including fire detection [4].
LimGen: Probing the LLMs for Generating Suggestive Limitations of Research Papers
Faizullah, Abdur Rahman Bin Md, Urlana, Ashok, Mishra, Rahul
Examining limitations is a crucial step in the scholarly research reviewing process, revealing aspects where a study might lack decisiveness or require enhancement. This aids readers in considering broader implications for further research. In this article, we present a novel and challenging task of Suggestive Limitation Generation (SLG) for research papers. We compile a dataset called \textbf{\textit{LimGen}}, encompassing 4068 research papers and their associated limitations from the ACL anthology. We investigate several approaches to harness large language models (LLMs) for producing suggestive limitations, by thoroughly examining the related challenges, practical insights, and potential opportunities. Our LimGen dataset and code can be accessed at \url{https://github.com/arbmf/LimGen}.
Exploring News Summarization and Enrichment in a Highly Resource-Scarce Indian Language: A Case Study of Mizo
Bala, Abhinaba, Urlana, Ashok, Mishra, Rahul, Krishnamurthy, Parameswari
Obtaining sufficient information in one's mother tongue is crucial for satisfying the information needs of the users. While high-resource languages have abundant online resources, the situation is less than ideal for very low-resource languages. Moreover, the insufficient reporting of vital national and international events continues to be a worry, especially in languages with scarce resources, like \textbf{Mizo}. In this paper, we conduct a study to investigate the effectiveness of a simple methodology designed to generate a holistic summary for Mizo news articles, which leverages English-language news to supplement and enhance the information related to the corresponding news events. Furthermore, we make available 500 Mizo news articles and corresponding enriched holistic summaries. Human evaluation confirms that our approach significantly enhances the information coverage of Mizo news articles. The mizo dataset and code can be accessed at \url{https://github.com/barvin04/mizo_enrichment
LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey
Urlana, Ashok, Kumar, Charaka Vinayak, Singh, Ajeet Kumar, Garlapati, Bala Mallikarjunarao, Chalamala, Srinivasa Rao, Mishra, Rahul
Large language models (LLMs) have become the secret ingredient driving numerous industrial applications, showcasing their remarkable versatility across a diverse spectrum of tasks. From natural language processing and sentiment analysis to content generation and personalized recommendations, their unparalleled adaptability has facilitated widespread adoption across industries. This transformative shift driven by LLMs underscores the need to explore the underlying associated challenges and avenues for enhancement in their utilization. In this paper, our objective is to unravel and evaluate the obstacles and opportunities inherent in leveraging LLMs within an industrial context. To this end, we conduct a survey involving a group of industry practitioners, develop four research questions derived from the insights gathered, and examine 68 industry papers to address these questions and derive meaningful conclusions.