
Collaborating Authors: Rallabandi, SaiKrishna


Jetsons at FinNLP 2024: Towards Understanding the ESG Impact of a News Article using Transformer-based Models

arXiv.org Artificial Intelligence

In this paper, we describe the different approaches explored by the Jetsons team for the Multi-Lingual ESG Impact Duration Inference (ML-ESG-3) shared task. The shared task focuses on predicting the duration and type of the ESG impact of a news article. The shared task dataset consists of 2,059 news titles and articles in English, French, Korean, and Japanese. For the impact duration classification task, we fine-tuned XLM-RoBERTa with a custom fine-tuning strategy and self-training, and fine-tuned DeBERTa-v3 on English translations only. These models ranked first on the leaderboard individually for Korean and Japanese, and as an ensemble for English. For the impact type classification task, our XLM-RoBERTa model fine-tuned with the custom strategy ranked first for English.
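As a rough, hedged illustration of the kind of fine-tuning described above, the sketch below sets up plain XLM-RoBERTa sequence classification with Hugging Face Transformers. The custom fine-tuning strategy, self-training loop, and DeBERTa-v3 translation pipeline from the paper are not reproduced; the column names, label count, and hyperparameters are assumptions.

```python
# Hedged sketch only: baseline XLM-RoBERTa fine-tuning for ESG impact-duration
# classification. Dataset columns ("title", "article", "label"), num_labels=3,
# and all hyperparameters are illustrative assumptions, not the paper's setup.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

def tokenize(batch):
    # Feed the news title and article body as a sentence pair, truncated to 512 tokens.
    return tokenizer(batch["title"], batch["article"], truncation=True, max_length=512)

args = TrainingArguments(output_dir="ml-esg-duration", learning_rate=2e-5,
                         num_train_epochs=3, per_device_train_batch_size=16)

# Assuming `train_ds` / `dev_ds` are datasets.Dataset objects with the columns above:
# train_ds = train_ds.map(tokenize, batched=True)
# dev_ds = dev_ds.map(tokenize, batched=True)
# Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=dev_ds).train()
```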


Towards leveraging LLMs for Conditional QA

arXiv.org Artificial Intelligence

This study delves into the capabilities and limitations of Large Language Models (LLMs) in the challenging domain of conditional question answering. Utilizing the Conditional Question Answering (CQA) dataset and focusing on generative models like T5 and UL2, we assess the performance of LLMs across diverse question types. Our findings reveal that fine-tuned LLMs can surpass state-of-the-art (SOTA) performance in some cases, even without fully encoding all of the input context, with gains of 7-8 points in Exact Match (EM) and F1 scores for Yes/No questions. However, these models struggle with extractive question answering, where they lag behind the SOTA by over 10 points, and with mitigating the risk of injecting false information. A study with oracle retrievers emphasizes the critical role of effective evidence retrieval, underscoring the necessity for advanced solutions in this area. Furthermore, we highlight the significant influence of evaluation metrics on performance assessments and advocate for a more comprehensive evaluation framework. The complexity of the task, the observed performance discrepancies, and the need for effective evidence retrieval underline the ongoing challenges in this field and point to future work on refining training tasks and exploring prompt-based techniques to enhance LLM performance on conditional question answering.
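Because the abstract stresses how strongly the evaluation metrics shape the reported numbers, here is a small, hedged sketch of SQuAD-style Exact Match and token-level F1 scoring. The normalization rules (lowercasing, stripping punctuation and articles) are common conventions assumed for illustration, not the paper's exact evaluation code.

```python
# Hedged sketch: SQuAD-style answer scoring. Normalization choices are assumptions.
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)  # drop English articles
    return " ".join(text.split())

def exact_match(prediction: str, reference: str) -> int:
    return int(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    pred, ref = normalize(prediction).split(), normalize(reference).split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Yes.", "yes"))                        # 1
print(round(token_f1("in the park", "at the park"), 2))  # 0.5
```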


HeySQuAD: A Spoken Question Answering Dataset

arXiv.org Artificial Intelligence

Human-spoken questions are critical to evaluating the performance of spoken question answering (SQA) systems that serve several real-world use cases, including digital assistants. We present HeySQuAD, a new large-scale community-shared SQA dataset that consists of 76k human-spoken questions, 97k machine-generated questions, and corresponding textual answers derived from the SQuAD QA dataset. The goal of HeySQuAD is to measure the ability of machines to understand noisy spoken questions and answer them accurately. To this end, we run extensive benchmarks on the human-spoken and machine-generated questions to quantify the differences in noise from both sources and their subsequent impact on models and answering accuracy. Importantly, for the task of SQA, where we want to answer human-spoken questions, we observe that training on the transcribed human-spoken questions together with the original SQuAD questions leads to significant improvements (12.51%) over training on the original SQuAD textual questions alone.
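For illustration only, the sketch below transcribes a spoken question with an off-the-shelf ASR model and pairs it with the original SQuAD text, the kind of mixed training data described above. Whisper, the file paths, and the variable names are assumptions; the abstract does not name the ASR system used to transcribe HeySQuAD.

```python
# Illustrative assumption: Whisper stands in for whatever ASR system produced
# the transcriptions; the point is only the noisy-transcript-plus-original mix.
import whisper

asr = whisper.load_model("base")

def transcribe_question(audio_path: str) -> str:
    # Returns the (noisy) transcription an SQA model has to cope with.
    return asr.transcribe(audio_path)["text"].strip()

# Assuming `spoken_examples` holds (audio_path, context, answer) triples and
# `original_squad_examples` holds the textual SQuAD questions:
# combined_train = [
#     {"question": transcribe_question(path), "context": ctx, "answer": ans}
#     for path, ctx, ans in spoken_examples
# ] + original_squad_examples
```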


Understanding BLOOM: An empirical study on diverse NLP tasks

arXiv.org Artificial Intelligence

We view the landscape of large language models (LLMs) through the lens of the recently released BLOOM model to understand how BLOOM and other decoder-only LLMs perform compared to BERT-style encoder-only models. We achieve this by evaluating the smaller BLOOM model variants (350m/560m and 1b3/1b7) on several NLP benchmark datasets and popular leaderboards. We make the following observations: (1) BLOOM performance does not scale with parameter size, unlike other LLMs such as GPT and BERT; fine-tuning experiments show that the 560m variant performs similarly to or better than the 1b7 variant. (2) Zero-shot cross-lingual and multi-lingual fine-tuning experiments show that BLOOM is on par with, or worse than, monolingual GPT-2 models. (3) Toxicity analysis of prompt-based text generation using the RealToxicityPrompts dataset shows that text generated by BLOOM is at least 17% less toxic than that of GPT-2 and GPT-3 models.
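A minimal, hedged sketch of the toxicity-analysis setup in observation (3): prompt-based generation with the bloom-560m variant, scored here with the open-source Detoxify classifier. The prompt, sampling settings, and choice of Detoxify are illustrative assumptions; the abstract does not specify the scoring pipeline.

```python
# Hedged sketch: generate a continuation with bloom-560m and score its toxicity.
# The prompt below is made up; RealToxicityPrompts supplies the real prefixes.
from transformers import pipeline
from detoxify import Detoxify

generator = pipeline("text-generation", model="bigscience/bloom-560m")
scorer = Detoxify("original")

prompt = "The crowd at the rally started to"  # placeholder prefix, not from the dataset
output = generator(prompt, max_new_tokens=30, do_sample=True)[0]["generated_text"]

print(output)
print("toxicity:", scorer.predict(output)["toxicity"])
```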


On Controlled DeEntanglement for Natural Language Processing

arXiv.org Artificial Intelligence

The latest addition to the toolbox of the human species is Artificial Intelligence (AI). Thus far, AI has made significant progress in low-stakes, low-risk scenarios such as playing Go, and we are currently in a transition toward medium-stakes scenarios such as Visual Dialog. In my thesis, I argue that we need to incorporate controlled de-entanglement as a first-class object to succeed in this transition. I present a mathematical analysis grounded in information theory to show that employing stochasticity leads to controlled de-entanglement of the relevant factors of variation at various levels. Based on this, I highlight results from initial experiments that demonstrate the efficacy of the proposed framework. I conclude this write-up with a roadmap of experiments that show the applicability of this framework to scalability, flexibility, and interpretability.
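As a loose illustration only (not the thesis' actual framework or experiments), the sketch below shows one standard way of "employing stochasticity" in a learned representation: a Gaussian latent with the reparameterization trick in PyTorch.

```python
# Generic stochastic-latent sketch; the module name and dimensions are arbitrary.
import torch
import torch.nn as nn

class StochasticEncoder(nn.Module):
    def __init__(self, in_dim: int, latent_dim: int):
        super().__init__()
        self.mu = nn.Linear(in_dim, latent_dim)
        self.log_var = nn.Linear(in_dim, latent_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mu, log_var = self.mu(x), self.log_var(x)
        eps = torch.randn_like(mu)                  # stochasticity enters here
        return mu + eps * torch.exp(0.5 * log_var)  # reparameterized sample

z = StochasticEncoder(16, 4)(torch.randn(2, 16))
print(z.shape)  # torch.Size([2, 4])
```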