Mishra, Pruthwik
No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language Models
Kumar, Charaka Vinayak, Urlana, Ashok, Kanumolu, Gopichand, Garlapati, Bala Mallikarjunarao, Mishra, Pruthwik
Advancements in Large Language Models (LLMs) have improved performance on a range of natural language understanding and generation tasks. Although LLMs have achieved state-of-the-art performance on various tasks, they often reflect different forms of bias present in their training data. In light of this limitation, we provide a unified evaluation of benchmarks using a set of representative LLMs, covering forms of bias ranging from physical characteristics to socio-economic categories. Moreover, we propose five prompting approaches to carry out the bias detection task across different aspects of bias. Further, we formulate three research questions to gain valuable insights into detecting biases in LLMs using different approaches and evaluation metrics across benchmarks. The results indicate that each of the selected LLMs suffers from one form of bias or another, with the LLaMA3.1-8B model being the least biased. Finally, we conclude the paper by identifying key challenges and possible future directions.
Text2TimeSeries: Enhancing Financial Forecasting through Time Series Prediction Updates with Event-Driven Insights from Large Language Models
Kurisinkel, Litton Jose, Mishra, Pruthwik, Zhang, Yue
Time series models, typically trained on numerical data, are designed to forecast future values. These models often rely on weighted averaging techniques over time intervals. However, real-world time series data is seldom isolated and is frequently influenced by non-numeric factors. For instance, stock price fluctuations are impacted by daily random events in the broader world, with each event exerting a unique influence on price signals. Previously, forecasting in financial markets has been approached in two main ways: either as a time-series problem over price sequences or as a sentiment analysis task. Sentiment analysis tasks aim to determine whether news events will have a positive or negative impact on stock prices, often categorizing them into discrete labels. Recognizing the need for a more comprehensive approach to accurately model time series prediction, we propose a collaborative modeling framework that incorporates textual information about relevant events into predictions. Specifically, we leverage the intuition of large language models about future changes to update real-valued time series predictions. We evaluate the effectiveness of our approach on financial market data.
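The collaborative framework described above can be sketched minimally: a numeric baseline forecast is adjusted by an event-driven estimate of relative change. This is an illustrative sketch, not the paper's actual model; `llm_event_impact` stands in for the LLM's judgement and is stubbed here with a fixed lookup.

```python
def moving_average_forecast(prices, window=3):
    """Baseline numeric forecast: mean of the last `window` prices."""
    return sum(prices[-window:]) / window

def llm_event_impact(event_text: str) -> float:
    """Stub for an LLM call that would return a signed relative-change estimate.

    A real system would prompt an LLM with the event description; this toy
    lexicon only illustrates the interface.
    """
    lexicon = {"beats earnings": +0.05, "ceo resigns": -0.04}
    return lexicon.get(event_text.lower(), 0.0)

def event_adjusted_forecast(prices, event_text, window=3):
    """Combine the numeric baseline with the event-driven adjustment."""
    base = moving_average_forecast(prices, window)
    return base * (1.0 + llm_event_impact(event_text))

print(round(event_adjusted_forecast([100, 102, 104], "beats earnings"), 2))  # -> 107.1
```

The key design point is that the LLM contributes a correction on top of a conventional forecast rather than predicting prices directly.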
Fine-tuning Pre-trained Named Entity Recognition Models For Indian Languages
Bahad, Sankalp, Mishra, Pruthwik, Arora, Karunesh, Balabantaray, Rakesh Chandra, Sharma, Dipti Misra, Krishnamurthy, Parameswari
Named Entity Recognition (NER) is a useful component in Natural Language Processing (NLP) applications. It is used in various tasks such as Machine Translation, Summarization, Information Retrieval, and Question-Answering systems. Research on NER has centered on English and some other major languages, whereas limited attention has been given to Indian languages. We analyze the challenges and propose techniques that can be tailored for Multilingual Named Entity Recognition for Indian Languages. We present human-annotated named entity corpora of 40K sentences for 4 Indian languages from two of the major Indian language families. Additionally, we present a multilingual model fine-tuned on our dataset, which achieves an average F1 score of 0.80 on our dataset. We achieve comparable performance on completely unseen benchmark datasets for Indian languages, which affirms the usability of our model.
Towards Large Language Model driven Reference-less Translation Evaluation for English and Indian Languages
Mujadia, Vandan, Mishra, Pruthwik, Ahsan, Arafat, Sharma, Dipti Misra
With the primary focus on evaluating the effectiveness of large language models for automatic reference-less translation assessment, this work presents our experiments on mimicking human direct assessment to evaluate the quality of translations in English and Indian languages. We constructed a translation evaluation task where we performed zero-shot learning, in-context example-driven learning, and fine-tuning of large language models to provide a score out of 100, where 100 represents a perfect translation and 1 represents a poor translation. We compared the performance of our trained systems with existing methods such as COMET, BERT-Scorer, and LABSE.
[Figure 1: Spearman correlation: human translation evaluation vs. different reference-less translation evaluation metrics — Llama-2-7b-Adapt (LoRA), Llama-2-13b-Adapt (LoRA), Mistral-7b-Adapt (LoRA), COMET-QE (https://github.com/Unbabel/COMET).]
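The zero-shot direct-assessment setup described above amounts to prompting an LLM for a 1-100 score and parsing its reply. The sketch below is hypothetical; the paper's exact prompt wording and parsing are not shown here, so the prompt text and helper names are illustrative assumptions.

```python
import re

def build_da_prompt(source: str, translation: str) -> str:
    """Build a reference-less direct-assessment prompt for an LLM judge.

    Hypothetical wording; the paper's actual prompts may differ.
    """
    return (
        "Rate the quality of the following translation on a scale of 1 to 100, "
        "where 100 is a perfect translation and 1 is a poor translation. "
        "Reply with only the number.\n"
        f"Source: {source}\n"
        f"Translation: {translation}\n"
        "Score:"
    )

def parse_score(reply: str) -> int:
    """Extract the first integer from the model reply and clamp it to [1, 100]."""
    match = re.search(r"\d+", reply)
    if not match:
        raise ValueError("no score found in model reply")
    return min(100, max(1, int(match.group())))
```

Clamping the parsed value keeps occasional out-of-range replies (e.g. "120") from corrupting the correlation analysis against human judgements.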
Automatic Data Retrieval for Cross Lingual Summarization
Bhatnagar, Nikhilesh, Urlana, Ashok, Mujadia, Vandan, Mishra, Pruthwik, Sharma, Dipti Misra
Cross-lingual summarization involves summarizing text written in one language into a different one. There is a body of research addressing cross-lingual summarization from English to other European languages. In this work, we aim to perform cross-lingual summarization from English to Hindi. We propose that pairing the coverage of newsworthy events in textual and video formats can be helpful for data acquisition for cross-lingual summarization. We analyze the data and propose methods to match articles to video descriptions that serve as document and summary pairs. We also outline filtering methods over reasonable thresholds to ensure the correctness of the summaries. Further, we make available 28,583 mono- and cross-lingual article-summary pairs at https://github.com/tingc9/Cross-Sum-News-Aligned. We also build and analyze multiple baselines on the collected data and report error analysis.
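The matching-and-filtering idea above can be illustrated as best-match pairing under a similarity threshold. This is a toy sketch, not the paper's pipeline: it uses a simple bag-of-words cosine in place of whatever matching signal the authors use, and the threshold value is an arbitrary placeholder.

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def match_pairs(articles, descriptions, threshold=0.3):
    """Pair each article with its best-matching video description,
    keeping only pairs whose similarity clears the threshold."""
    pairs = []
    for art in articles:
        best = max(descriptions, key=lambda d: cosine_sim(art, d))
        if cosine_sim(art, best) >= threshold:
            pairs.append((art, best))
    return pairs
```

The threshold acts as the correctness filter the abstract mentions: low-similarity pairings are dropped rather than admitted as noisy document-summary pairs.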
Verb Categorisation for Hindi Word Problem Solving
Sharma, Harshita, Mishra, Pruthwik, Sharma, Dipti Misra
Word problem solving is a challenging NLP task that deals with solving mathematical problems described in natural language. Recently, there has been renewed interest in developing word problem solvers for Indian languages. As part of this paper, we have built a Hindi arithmetic word problem solver that makes use of verbs. Additionally, we have created verb categorisation data for Hindi. Verbs are very important for solving word problems with addition/subtraction operations, as they help identify the set of operations required to solve the word problems. We propose a rule-based solver that uses verb categorisation to identify operations in a word problem and generate answers for it. To perform verb categorisation, we explore several approaches and present a comparative study.
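The rule-based idea of mapping verb categories to operations can be shown with a minimal sketch. This is an English toy with a tiny hand-made verb lexicon; the paper's Hindi verb categories and rules are far richer, and the category sets below are illustrative assumptions.

```python
# Toy verb lexicon: each category signals an arithmetic operation.
ADD_VERBS = {"bought", "received", "found"}   # verbs implying an increase
SUB_VERBS = {"lost", "gave", "ate"}           # verbs implying a decrease

def solve(problem: str) -> int:
    """Accumulate quantities, choosing + or - from the verb category
    most recently seen before each number."""
    total, sign = 0, +1
    for token in problem.lower().replace(".", " ").split():
        if token in ADD_VERBS:
            sign = +1
        elif token in SUB_VERBS:
            sign = -1
        elif token.isdigit():
            total += sign * int(token)
    return total

print(solve("Ram had 5 apples. He bought 3 apples. He ate 2 apples."))  # -> 6
```

Each verb flips the running sign, so the solver derives the operation sequence (+, +, -) directly from verb categories rather than from any numeric reasoning.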
Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects -- A Survey
Urlana, Ashok, Mishra, Pruthwik, Roy, Tathagato, Mishra, Rahul
Generic text summarization approaches often fail to address the specific intent and needs of individual users. Recently, scholarly attention has turned to the development of summarization methods that are more closely tailored and controlled to align with specific objectives and user needs. While a growing body of research is devoted to more controllable summarization, there is no comprehensive survey available that thoroughly explores the diverse controllable aspects or attributes employed in this context, delves into the associated challenges, and investigates the existing solutions. In this survey, we formalize the Controllable Text Summarization (CTS) task, categorize controllable aspects according to their shared characteristics and objectives, and present a thorough examination of existing methods and datasets within each category. Moreover, based on our findings, we uncover limitations and research gaps, while also delving into potential solutions and future directions for CTS.
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages
Prakash, Anusha, Kumar, Arun, Seth, Ashish, Mukherjee, Bhagyashree, Gupta, Ishika, Kuriakose, Jom, Fernandes, Jordan, Vikram, K V, M, Mano Ranjith Kumar, Mary, Metilda Sagaya, Wajahat, Mohammad, N, Mohana, Batra, Mudit, K, Navina, George, Nihal John, Ravi, Nithya, Mishra, Pruthwik, Srivastava, Sudhanshu, Lodagala, Vasista Sai, Mujadia, Vandan, Vineeth, Kada Sai Venkata, Sukhadia, Vrunda, Sharma, Dipti, Murthy, Hema, Bhattacharya, Pushpak, Umesh, S, Sangal, Rajeev
Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, and text-to-speech synthesis, followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages belong to different language families, resulting in differences in generated audio duration. This is further compounded by the original speaker's rhythm, especially for extempore speech. This paper describes the challenges in regenerating English lecture videos in Indian languages semi-automatically. A prototype is developed for dubbing lectures into 9 Indian languages. A mean opinion score (MOS) is obtained for two languages, Hindi and Tamil, on two different courses. The output video is compared with the original video in terms of MOS (1-5) and lip synchronisation, with scores of 4.09 and 3.74, respectively. Human effort is also reduced by 75%.