AITopics | Varma, Vasudeva

Plotting

Varma, Vasudeva

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Curriculum Learning for Cross-Lingual Data-to-Text Generation With Noisy Data

Hari, Kancharla Aditya, Gupta, Manish, Varma, Vasudeva

arXiv.org Artificial IntelligenceDec-17-2024

Curriculum learning has been used to improve the quality of text generation systems by ordering the training samples according to a particular schedule in various tasks. In the context of data-to-text generation (DTG), previous studies used various difficulty criteria to order the training samples for monolingual DTG. These criteria, however, do not generalize to the crosslingual variant of the problem and do not account for noisy data. We explore multiple criteria that can be used for improving the performance of cross-lingual DTG systems with noisy data using two curriculum schedules. Using the alignment score criterion for ordering samples and an annealing schedule to train the model, we show increase in BLEU score by up to 4 points, and improvements in faithfulness and coverage of generations by 5-15% on average across 11 Indian languages and English in 2 separate datasets. We make code and data publicly available

computational linguistic, data quality, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.13484

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Data Science > Data Quality > Data Cleaning (0.81)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.67)

Add feedback

Towards Understanding the Robustness of LLM-based Evaluations under Perturbations

Chaudhary, Manav, Gupta, Harshit, Bhat, Savita, Varma, Vasudeva

arXiv.org Artificial IntelligenceDec-12-2024

Traditional evaluation metrics like BLEU and ROUGE fall short when capturing the nuanced qualities of generated text, particularly when there is no single ground truth. In this paper, we explore the potential of Large Language Models (LLMs), specifically Google Gemini 1, to serve as automatic evaluators for non-standardized metrics in summarization and dialog-based tasks. We conduct experiments across multiple prompting strategies to examine how LLMs fare as quality evaluators when compared with human judgments on the SummEval and USR datasets, asking the model to generate both a score as well as a justification for the score. Furthermore, we explore the robustness of the LLM evaluator by using perturbed inputs. Our findings suggest that while LLMs show promise, their alignment with human evaluators is limited, they are not robust against perturbations and significant improvements are required for their standalone use as reliable evaluators for subjective metrics.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.09269

Country: Asia > Indonesia (0.14)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

iREL at SemEval-2024 Task 9: Improving Conventional Prompting Methods for Brain Teasers

Gupta, Harshit, Chaudhary, Manav, Raha, Tathagata, Subramanian, Shivansh, Varma, Vasudeva

arXiv.org Artificial IntelligenceMay-25-2024

This paper describes our approach for SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense. The BRAINTEASER task comprises multiple-choice Question Answering designed to evaluate the models' lateral thinking capabilities. It consists of Sentence Puzzle and Word Puzzle subtasks that require models to defy default common-sense associations and exhibit unconventional thinking. We propose a unique strategy to improve the performance of pre-trained language models, notably the Gemini 1.0 Pro Model, in both subtasks. We employ static and dynamic few-shot prompting techniques and introduce a model-generated reasoning strategy that utilizes the LLM's reasoning capabilities to improve performance. Our approach demonstrated significant improvements, showing that it performed better than the baseline models by a considerable margin but fell short of performing as well as the human annotators, thus highlighting the efficacy of the proposed strategies.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2405.16129

Country: North America > Mexico (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.87)

Add feedback

BrainStorm @ iREL at SMM4H 2024: Leveraging Translation and Topical Embeddings for Annotation Detection in Tweets

Chaudhary, Manav, Gupta, Harshit, Varma, Vasudeva

arXiv.org Artificial IntelligenceMay-18-2024

The proliferation of LLMs in various NLP tasks has sparked debates regarding their reliability, particularly in annotation tasks where biases and hallucinations may arise. In this shared task, we address the challenge of distinguishing annotations made by LLMs from those made by human domain experts in the context of COVID-19 symptom detection from tweets in Latin American Spanish. This paper presents BrainStorm @ iREL's approach to the SMM4H 2024 Shared Task, leveraging the inherent topical information in tweets, we propose a novel approach to identify and classify annotations, aiming to enhance the trustworthiness of annotated data.

large language model, natural language, tweet, (13 more...)

arXiv.org Artificial Intelligence

2405.11192

Country: Asia > Thailand (0.15)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.99)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

MetaCheckGPT -- A Multi-task Hallucination Detector Using LLM Uncertainty and Meta-models

Mehta, Rahul, Hoblitzell, Andrew, O'Keefe, Jack, Jang, Hyeju, Varma, Vasudeva

arXiv.org Artificial IntelligenceApr-11-2024

Hallucinations in large language models (LLMs) have recently become a significant problem. A recent effort in this direction is a shared task at Semeval 2024 Task 6, SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes. This paper describes our winning solution ranked 1st and 2nd in the 2 sub-tasks of model agnostic and model aware tracks respectively. We propose a meta-regressor framework of LLMs for model evaluation and integration that achieves the highest scores on the leaderboard. We also experiment with various transformer-based models and black box methods like ChatGPT, Vectara, and others. In addition, we perform an error analysis comparing GPT4 against our best model which shows the limitations of the former.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2404.06948

Country:

North America > United States > Indiana (0.14)
North America > Mexico > Mexico City (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Multilingual Bias Detection and Mitigation for Indian Languages

Maity, Ankita, Sharma, Anubhav, Dhar, Rudra, Abhishek, Tushar, Gupta, Manish, Varma, Vasudeva

arXiv.org Artificial IntelligenceDec-23-2023

Lack of diverse perspectives causes neutrality bias in Wikipedia content leading to millions of worldwide readers getting exposed by potentially inaccurate information. Hence, neutrality bias detection and mitigation is a critical problem. Although previous studies have proposed effective solutions for English, no work exists for Indian languages. First, we contribute two large datasets, mWikiBias and mWNC, covering 8 languages, for the bias detection and mitigation tasks respectively. Next, we investigate the effectiveness of popular multilingual Transformer-based models for the two tasks by modeling detection as a binary classification problem and mitigation as a style transfer problem. We make the code and data publicly available.

large language model, machine learning, natural language, (12 more...)

arXiv.org Artificial Intelligence

2312.15181

Country:

North America > United States > Pennsylvania (0.14)
North America > United States > Michigan (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)

Add feedback

Neural models for Factual Inconsistency Classification with Explanations

Raha, Tathagata, Choudhary, Mukund, Menon, Abhinav, Gupta, Harshit, Srivatsa, KV Aditya, Gupta, Manish, Varma, Vasudeva

arXiv.org Artificial IntelligenceJun-15-2023

Factual consistency is one of the most important requirements when editing high quality documents. It is extremely important for automatic text generation systems like summarization, question answering, dialog modeling, and language modeling. Still, automated factual inconsistency detection is rather under-studied. Existing work has focused on (a) finding fake news keeping a knowledge base in context, or (b) detecting broad contradiction (as part of natural language inference literature). However, there has been no work on detecting and explaining types of factual inconsistencies in text, without any knowledge base in context. In this paper, we leverage existing work in linguistics to formally define five types of factual inconsistencies. Based on this categorization, we contribute a novel dataset, FICLE (Factual Inconsistency CLassification with Explanation), with ~8K samples where each sample consists of two sentences (claim and context) annotated with type and span of inconsistency. When the inconsistency relates to an entity type, it is labeled as well at two levels (coarse and fine-grained). Further, we leverage this dataset to train a pipeline of four neural models to predict inconsistency type with explanations, given a (claim, context) sentence pair. Explanations include inconsistent claim fact triple, inconsistent context span, inconsistent claim component, coarse and fine-grained inconsistent entity types. The proposed system first predicts inconsistent spans from claim and context; and then uses them to predict inconsistency types and inconsistent entity types (when inconsistency is due to entities). We experiment with multiple Transformer-based natural language classification as well as generative models, and find that DeBERTa performs the best. Our proposed methods provide a weighted F1 of ~87% for inconsistency type classification across the five classes.

artificial intelligence, machine learning, natural language, (14 more...)

arXiv.org Artificial Intelligence

2306.08872

Country:

Asia > India (0.47)
North America > United States > Louisiana (0.14)

Genre: Research Report (0.82)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

LLM-RM at SemEval-2023 Task 2: Multilingual Complex NER using XLM-RoBERTa

Mehta, Rahul, Varma, Vasudeva

arXiv.org Artificial IntelligenceMay-5-2023

Named Entity Recognition(NER) is a task of recognizing entities at a token level in a sentence. This paper focuses on solving NER tasks in a multilingual setting for complex named entities. Our team, LLM-RM participated in the recently organized SemEval 2023 task, Task 2: MultiCoNER II,Multilingual Complex Named Entity Recognition. We approach the problem by leveraging cross-lingual representation provided by fine-tuning XLM-Roberta base model on datasets of all of the 12 languages provided -- Bangla, Chinese, English, Farsi, French, German, Hindi, Italian, Portuguese, Spanish, Swedish and Ukrainian

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2305.033

Country: Asia > India (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GrapeQA: GRaph Augmentation and Pruning to Enhance Question-Answering

Taunk, Dhaval, Khanna, Lakshya, Kandru, Pavan, Varma, Vasudeva, Sharma, Charu, Tapaswi, Makarand

arXiv.org Artificial IntelligenceApr-18-2023

Commonsense question-answering (QA) methods combine the power of pre-trained Language Models (LM) with the reasoning provided by Knowledge Graphs (KG). A typical approach collects nodes relevant to the QA pair from a KG to form a Working Graph (WG) followed by reasoning using Graph Neural Networks(GNNs). This faces two major challenges: (i) it is difficult to capture all the information from the QA in the WG, and (ii) the WG contains some irrelevant nodes from the KG. To address these, we propose GrapeQA with two simple improvements on the WG: (i) Prominent Entities for Graph Augmentation identifies relevant text chunks from the QA pair and augments the WG with corresponding latent representations from the LM, and (ii) Context-Aware Node Pruning removes nodes that are less relevant to the QA pair. We evaluate our results on OpenBookQA, CommonsenseQA and MedQA-USMLE and see that GrapeQA shows consistent improvements over its LM + KG predecessor (QA-GNN in particular) and large improvements on OpenBookQA.

natural language, node, question answering, (19 more...)

arXiv.org Artificial Intelligence

2303.1232

Country:

North America > United States (0.14)
Asia > India (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Add feedback

XWikiGen: Cross-lingual Summarization for Encyclopedic Text Generation in Low Resource Languages

Taunk, Dhaval, Sagare, Shivprasad, Patil, Anupam, Subramanian, Shivansh, Gupta, Manish, Varma, Vasudeva

arXiv.org Artificial IntelligenceApr-18-2023

Lack of encyclopedic text contributors, especially on Wikipedia, makes automated text generation for low resource (LR) languages a critical problem. Existing work on Wikipedia text generation has focused on English only where English reference articles are summarized to generate English Wikipedia pages. But, for low-resource languages, the scarcity of reference articles makes monolingual summarization ineffective in solving this problem. Hence, in this work, we propose XWikiGen, which is the task of cross-lingual multi-document summarization of text from multiple reference articles, written in various languages, to generate Wikipedia-style text. Accordingly, we contribute a benchmark dataset, XWikiRef, spanning ~69K Wikipedia articles covering five domains and eight languages. We harness this dataset to train a two-stage system where the input is a set of citations and a section title and the output is a section-specific LR summary. The proposed system is based on a novel idea of neural unsupervised extractive summarization to coarsely identify salient information followed by a neural abstractive model to generate the section-specific text. Extensive experiments show that multi-domain training is better than the multi-lingual setup on average.

artificial intelligence, natural language, text processing, (16 more...)

arXiv.org Artificial Intelligence

2303.12308

Country:

North America > United States (0.14)
Europe > Spain (0.14)
Asia > India (0.14)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Tennis (0.93)
Media > Film (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Add feedback