Mishra, Pruthwik
No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language Models
Kumar, Charaka Vinayak, Urlana, Ashok, Kanumolu, Gopichand, Garlapati, Bala Mallikarjunarao, Mishra, Pruthwik
Advancements in Large Language Models (LLMs) have improved performance on a range of natural language understanding and generation tasks. Although LLMs have achieved state-of-the-art performance on various tasks, they often reflect different forms of bias present in their training data. In light of this limitation, we provide a unified evaluation of benchmarks using a set of representative LLMs, covering forms of bias ranging from physical characteristics to socio-economic categories. Moreover, we propose five prompting approaches to carry out the bias detection task across different aspects of bias. Further, we formulate three research questions to gain valuable insights into detecting biases in LLMs using different approaches and evaluation metrics across benchmarks. The results indicate that each of the selected LLMs suffers from one form of bias or another, with the LLaMA3.1-8B model being the least biased. Finally, we conclude the paper by identifying key challenges and possible future directions.
Text2TimeSeries: Enhancing Financial Forecasting through Time Series Prediction Updates with Event-Driven Insights from Large Language Models
Kurisinkel, Litton Jose, Mishra, Pruthwik, Zhang, Yue
Time series models, typically trained on numerical data, are designed to forecast future values. These models often rely on weighted averaging techniques over time intervals. However, real-world time series data is seldom isolated and is frequently influenced by non-numeric factors. For instance, stock price fluctuations are impacted by daily random events in the broader world, with each event exerting a unique influence on price signals. Previously, forecasting in financial markets has been approached in two main ways: either as a time-series problem over price sequences or as a sentiment analysis task. Sentiment analysis tasks aim to determine whether news events will have a positive or negative impact on stock prices, often categorizing them into discrete labels. Recognizing the need for a more comprehensive approach to accurately model time series prediction, we propose a collaborative modeling framework that incorporates textual information about relevant events into predictions. Specifically, we leverage the intuition of large language models about future changes to update real-valued time series predictions. We evaluate the effectiveness of our approach on financial market data.
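The collaborative framework described above can be sketched minimally: a numeric baseline forecast is adjusted by an event-driven estimate of relative change. This is an illustrative sketch, not the paper's actual model; `llm_event_impact` stands in for the LLM's judgement and is stubbed here with a fixed lookup.

```python
def moving_average_forecast(prices, window=3):
    """Baseline numeric forecast: mean of the last `window` prices."""
    return sum(prices[-window:]) / window

def llm_event_impact(event_text: str) -> float:
    """Stub for an LLM call that would return a signed relative-change estimate.

    A real system would prompt an LLM with the event description; this toy
    lexicon only illustrates the interface.
    """
    lexicon = {"beats earnings": +0.05, "ceo resigns": -0.04}
    return lexicon.get(event_text.lower(), 0.0)

def event_adjusted_forecast(prices, event_text, window=3):
    """Combine the numeric baseline with the event-driven adjustment."""
    base = moving_average_forecast(prices, window)
    return base * (1.0 + llm_event_impact(event_text))

print(round(event_adjusted_forecast([100, 102, 104], "beats earnings"), 2))  # -> 107.1
```

The key design point is that the LLM contributes a correction on top of a conventional forecast rather than predicting prices directly.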
Fine-tuning Pre-trained Named Entity Recognition Models For Indian Languages
Bahad, Sankalp, Mishra, Pruthwik, Arora, Karunesh, Balabantaray, Rakesh Chandra, Sharma, Dipti Misra, Krishnamurthy, Parameswari
Named Entity Recognition (NER) is a useful component in Natural Language Processing (NLP) applications. It is used in various tasks such as Machine Translation, Summarization, Information Retrieval, and Question-Answering systems. Research on NER has centered on English and some other major languages, whereas limited attention has been given to Indian languages. We analyze the challenges and propose techniques that can be tailored for Multilingual Named Entity Recognition for Indian Languages. We present human-annotated named entity corpora of 40K sentences for 4 Indian languages from two of the major Indian language families. Additionally, we present a multilingual model fine-tuned on our dataset, which achieves an average F1 score of 0.80 on our dataset. We achieve comparable performance on completely unseen benchmark datasets for Indian languages, which affirms the usability of our model.
Towards Large Language Model driven Reference-less Translation Evaluation for English and Indian Languages
Mujadia, Vandan, Mishra, Pruthwik, Ahsan, Arafat, Sharma, Dipti Misra
With the primary focus on evaluating the effectiveness of large language models for automatic reference-less translation assessment, this work presents our experiments on mimicking human direct assessment to evaluate the quality of translations in English and Indian languages. We constructed a translation evaluation task where we performed zero-shot learning, in-context example-driven learning, and fine-tuning of large language models to provide a score out of 100, where 100 represents a perfect translation and 1 represents a poor translation. We compared the performance of our trained systems with existing methods such as COMET, BERT-Scorer, and LABSE.
[Figure 1: Spearman correlation: human translation evaluation vs. different reference-less translation evaluation metrics — Llama-2-7b-Adapt (LoRA), Llama-2-13b-Adapt (LoRA), Mistral-7b-Adapt (LoRA), COMET-QE (https://github.com/Unbabel/COMET).]
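The zero-shot direct-assessment setup described above amounts to prompting an LLM for a 1-100 score and parsing its reply. The sketch below is hypothetical; the paper's exact prompt wording and parsing are not shown here, so the prompt text and helper names are illustrative assumptions.

```python
import re

def build_da_prompt(source: str, translation: str) -> str:
    """Build a reference-less direct-assessment prompt for an LLM judge.

    Hypothetical wording; the paper's actual prompts may differ.
    """
    return (
        "Rate the quality of the following translation on a scale of 1 to 100, "
        "where 100 is a perfect translation and 1 is a poor translation. "
        "Reply with only the number.\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        "Score:"
    )

def parse_score(reply: str) -> int:
    """Extract the first integer from the model reply and clamp it to [1, 100]."""
    match = re.search(r"\d+", reply)
    if not match:
        raise ValueError("no score found in model reply")
    return min(100, max(1, int(match.group())))
```

Clamping the parsed value keeps occasional out-of-range replies (e.g. "120") from corrupting the correlation analysis against human judgements.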
Automatic Data Retrieval for Cross Lingual Summarization
Bhatnagar, Nikhilesh, Urlana, Ashok, Mujadia, Vandan, Mishra, Pruthwik, Sharma, Dipti Misra
Cross-lingual summarization involves summarizing text written in one language into a different one. There is a body of research addressing cross-lingual summarization from English to other European languages. In this work, we aim to perform cross-lingual summarization from English to Hindi. We propose that pairing the coverage of newsworthy events in textual and video formats can be helpful for data acquisition for cross-lingual summarization. We analyze the data and propose methods to match articles to video descriptions that serve as document and summary pairs. We also outline filtering methods over reasonable thresholds to ensure the correctness of the summaries. Further, we make available 28,583 mono- and cross-lingual article-summary pairs at https://github.com/tingc9/Cross-Sum-News-Aligned. We also build and analyze multiple baselines on the collected data and report error analysis.
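The matching-and-filtering idea above can be illustrated as best-match pairing under a similarity threshold. This is a toy sketch, not the paper's pipeline: it uses a simple bag-of-words cosine in place of whatever matching signal the authors use, and the threshold value is an arbitrary placeholder.

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def match_pairs(articles, descriptions, threshold=0.3):
    """Pair each article with its best-matching video description,
    keeping only pairs whose similarity clears the threshold."""
    pairs = []
    for art in articles:
        best = max(descriptions, key=lambda d: cosine_sim(art, d))
        if cosine_sim(art, best) >= threshold:
            pairs.append((art, best))
    return pairs
```

The threshold acts as the correctness filter the abstract mentions: low-similarity pairings are dropped rather than admitted as noisy document-summary pairs.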
Verb Categorisation for Hindi Word Problem Solving
Sharma, Harshita, Mishra, Pruthwik, Sharma, Dipti Misra
Word problem solving is a challenging NLP task that deals with solving mathematical problems described in natural language. Recently, there has been renewed interest in developing word problem solvers for Indian languages. As part of this paper, we have built a Hindi arithmetic word problem solver that makes use of verbs. Additionally, we have created verb categorisation data for Hindi. Verbs are very important for solving word problems with addition/subtraction operations, as they help identify the set of operations required to solve the word problems. We propose a rule-based solver that uses verb categorisation to identify operations in a word problem and generate answers for it. To perform verb categorisation, we explore several approaches and present a comparative study.
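The rule-based idea of mapping verb categories to operations can be shown with a minimal sketch. This is an English toy with a tiny hand-made verb lexicon; the paper's Hindi verb categories and rules are far richer, and the category sets below are illustrative assumptions.

```python
# Toy verb lexicon: each category signals an arithmetic operation.
ADD_VERBS = {"bought", "received", "found"}   # verbs implying an increase
SUB_VERBS = {"lost", "gave", "ate"}           # verbs implying a decrease

def solve(problem: str) -> int:
    """Accumulate quantities, choosing + or - from the verb category
    most recently seen before each number."""
    total, sign = 0, +1
    for token in problem.lower().replace(".", " ").split():
        if token in ADD_VERBS:
            sign = +1
        elif token in SUB_VERBS:
            sign = -1
        elif token.isdigit():
            total += sign * int(token)
    return total

print(solve("Ram had 5 apples. He bought 3 apples. He ate 2 apples."))  # -> 6
```

Each verb flips the running sign, so the solver derives the operation sequence (+, +, -) directly from verb categories rather than from any numeric reasoning.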
Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects -- A Survey
Urlana, Ashok, Mishra, Pruthwik, Roy, Tathagato, Mishra, Rahul
Generic text summarization approaches often fail to address the specific intent and needs of individual users. Recently, scholarly attention has turned to the development of summarization methods that are more closely tailored and controlled to align with specific objectives and user needs. While a growing body of research is devoted to more controllable summarization, there is no comprehensive survey available that thoroughly explores the diverse controllable aspects or attributes employed in this context, delves into the associated challenges, and investigates the existing solutions. In this survey, we formalize the Controllable Text Summarization (CTS) task, categorize controllable aspects according to their shared characteristics and objectives, and present a thorough examination of existing methods and datasets within each category. Moreover, based on our findings, we uncover limitations and research gaps, while also delving into potential solutions and future directions for CTS.
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages
Prakash, Anusha, Kumar, Arun, Seth, Ashish, Mukherjee, Bhagyashree, Gupta, Ishika, Kuriakose, Jom, Fernandes, Jordan, Vikram, K V, M, Mano Ranjith Kumar, Mary, Metilda Sagaya, Wajahat, Mohammad, N, Mohana, Batra, Mudit, K, Navina, George, Nihal John, Ravi, Nithya, Mishra, Pruthwik, Srivastava, Sudhanshu, Lodagala, Vasista Sai, Mujadia, Vandan, Vineeth, Kada Sai Venkata, Sukhadia, Vrunda, Sharma, Dipti, Murthy, Hema, Bhattacharya, Pushpak, Umesh, S, Sangal, Rajeev
Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, and text-to-speech synthesis, followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages belong to different language families, resulting in differences in generated audio duration. This is further compounded by the original speaker's rhythm, especially for extempore speech. This paper describes the challenges in regenerating English lecture videos in Indian languages semi-automatically. A prototype is developed for dubbing lectures into 9 Indian languages. A mean opinion score (MOS) is obtained for two languages, Hindi and Tamil, on two different courses. The output video is compared with the original video in terms of MOS (1-5) and lip synchronisation, with scores of 4.09 and 3.74, respectively. Human effort is also reduced by 75%.