
Collaborating Authors

 Ramesh, Krithika


Evaluating Differentially Private Synthetic Data Generation in High-Stakes Domains

arXiv.org Artificial Intelligence

The difficulty of anonymizing text data hinders the development and deployment of NLP in high-stakes domains that involve private data, such as healthcare and social services. Poorly anonymized sensitive data cannot be easily shared with annotators or external researchers, nor can it be used to train public models. In this work, we explore the feasibility of using synthetic data generated from differentially private language models in place of real data to facilitate the development of NLP in these domains without compromising privacy. In contrast to prior work, we generate synthetic data for real high-stakes domains, and we propose and conduct use-inspired evaluations to assess data quality. Our results show that prior simplistic evaluations have failed to highlight utility, privacy, and fairness issues in the synthetic data. Overall, our work underscores the need for further improvements to synthetic data generation for it to be a viable way to enable privacy-preserving data sharing.
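For readers unfamiliar with how such models are trained, the sketch below illustrates the DP-SGD mechanism (per-example gradient clipping plus Gaussian noise) that typically underlies differentially private fine-tuning. It uses a toy classifier and the Opacus library purely as an assumed setup; it is not the paper's actual language models, data, or hyperparameters.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy stand-ins: a tiny classifier and random data. The paper fine-tunes full
# language models on real high-stakes text, which this sketch does not reproduce.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 2))
data = TensorDataset(torch.randn(512, 128), torch.randint(0, 2, (512,)))
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
loader = DataLoader(data, batch_size=32)

engine = PrivacyEngine()
model, optimizer, loader = engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,  # scale of Gaussian noise relative to the clipping bound
    max_grad_norm=1.0,     # per-example gradient clipping bound
)

for epoch in range(3):
    for x, y in loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()  # Opacus clips per-example gradients and adds noise
        optimizer.step()

print("privacy budget epsilon at delta=1e-5:", engine.get_epsilon(delta=1e-5))
```

After DP training, synthetic records are sampled from the privately trained model; the post-processing property of differential privacy means the generated data inherits the same privacy guarantee.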


MEGA: Multilingual Evaluation of Generative AI

arXiv.org Artificial Intelligence

Generative AI models have shown impressive performance on many Natural Language Processing tasks such as language understanding, reasoning, and language generation. An important question being asked by the AI community today concerns the capabilities and limits of these models, and it is clear that evaluating generative AI is very challenging. Most studies of generative LLMs have been restricted to English, and it is unclear how capable these models are at understanding and generating text in other languages. We present MEGA, the first comprehensive benchmarking of generative LLMs, which evaluates models on standard NLP benchmarks covering 16 NLP datasets across 70 typologically diverse languages. We compare the performance of generative LLMs, including ChatGPT and GPT-4, to state-of-the-art (SOTA) non-autoregressive models on these tasks to determine how well generative models perform relative to the previous generation of LLMs. We present a thorough analysis of model performance across languages and tasks and discuss challenges in improving the performance of generative LLMs on low-resource languages. We create a framework for evaluating generative LLMs in the multilingual setting and provide directions for future progress in the field.
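The sketch below shows the general shape of such a multilingual evaluation loop: prompt a generative model on labeled examples and aggregate accuracy per language. The query_model stub, prompt format, and example data are assumed placeholders, not MEGA's actual framework, datasets, or prompting strategies.

```python
# Minimal per-language scoring of a generative model on a classification task.
from collections import defaultdict

def query_model(prompt: str) -> str:
    # Placeholder: substitute a real completion/chat API or local model call here.
    return "positive"

examples = [
    {"lang": "hi", "text": "यह फ़िल्म शानदार थी।", "label": "positive"},
    {"lang": "sw", "text": "Filamu hii ilikuwa mbaya sana.", "label": "negative"},
]

correct, total = defaultdict(int), defaultdict(int)
for ex in examples:
    prompt = (
        "Classify the sentiment as positive or negative.\n"
        f"Text: {ex['text']}\nAnswer:"
    )
    pred = query_model(prompt).strip().lower()
    total[ex["lang"]] += 1
    correct[ex["lang"]] += int(pred == ex["label"])

for lang in total:
    print(lang, f"accuracy = {correct[lang] / total[lang]:.2f}")
```

Aggregating scores by language in this way is what makes gaps between high- and low-resource languages visible in a single benchmark run.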


Fairness in Language Models Beyond English: Gaps and Challenges

arXiv.org Artificial Intelligence

With language models becoming increasingly ubiquitous, it has become essential to address their inequitable treatment of diverse demographic groups and factors. Most research on evaluating and mitigating fairness harms has concentrated on English, while multilingual models and non-English languages have received comparatively little attention. This paper presents a survey of fairness in multilingual and non-English contexts, highlighting the shortcomings of current research and the difficulties faced by methods designed for English. We contend that the multitude of diverse cultures and languages across the world makes it infeasible to achieve comprehensive coverage through constructed fairness datasets. Thus, the measurement and mitigation of biases must evolve beyond current dataset-driven practices, which are narrowly focused on specific dimensions and types of biases and are therefore impossible to scale across languages and cultures.


Curb Your Carbon Emissions: Benchmarking Carbon Emissions in Machine Translation

arXiv.org Artificial Intelligence

Although our computational techniques and hardware resources have advanced greatly over the past few decades, the rise of large language models with applications across multiple sectors means that training and developing NLP models, particularly at scale, can carry a substantial environmental cost. This is because the associated energy usage (whether carbon neutral or not) [1, 2] may contribute directly or indirectly to the effects of climate change. With experiments on the total time expected for models such as the Transformer, BERT, and GPT-2 to train and the subsequent cost of training, Strubell et al. [2] provide substantial evidence that researchers need to increasingly prioritize computationally efficient hardware and algorithms. Research also suggests that large language models can be outperformed by less computationally intensive counterparts on multiple tasks with the help of fine-tuning [3] and techniques such as random search for hyperparameter optimization [1, 4-6] or pruning [7, 8]. Additionally, as performance across tasks varies with the languages used, data availability, and model architecture, among other factors, training models to a given performance level is likely less carbon-intensive for some languages than for others. This speculation is substantiated by the correlation found between the morphological ambiguity of European languages and the performance of language models on them [9]. The primary objective of our work is to measure the differences in carbon emissions released across multiple language pairs and to assess how the various components of the two architectures we use contribute to those emissions.
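As a rough illustration of how such emissions are commonly estimated, the sketch below converts GPU power draw and training time into kilograms of CO2e using datacenter overhead (PUE) and grid carbon intensity, in the style of Strubell et al. [2]. All constants and runs are illustrative assumptions, not measurements from this paper.

```python
# Back-of-the-envelope CO2e estimate: energy (kWh) scaled by datacenter
# overhead (PUE) and grid carbon intensity. Default coefficients are assumed
# values, not the paper's measured figures.
def co2e_kg(gpu_power_watts: float, num_gpus: int, hours: float,
            pue: float = 1.58, kg_co2_per_kwh: float = 0.43) -> float:
    energy_kwh = gpu_power_watts * num_gpus * hours / 1000.0
    return energy_kwh * pue * kg_co2_per_kwh

# Hypothetical comparison of two MT training runs for different language pairs.
print("en-hi run:", round(co2e_kg(250, 1, hours=12), 2), "kg CO2e")
print("en-de run:", round(co2e_kg(250, 1, hours=8), 2), "kg CO2e")
```

In practice, measured GPU utilization and the local grid's carbon intensity replace these constants, which is what allows per-language-pair comparisons.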


Evaluating Gender Bias in Hindi-English Machine Translation

arXiv.org Artificial Intelligence

With language models increasingly deployed in the real world, it is essential to address the fairness of their outputs. The word embedding representations of these language models often implicitly draw unwanted associations that form a social bias within the model. The nature of gendered languages like Hindi poses an additional problem for the quantification and mitigation of bias, as the forms of words in a sentence change based on the gender of the subject. Additionally, there is little existing work on measuring and debiasing systems for Indic languages. In our work, we attempt to evaluate and quantify the gender bias within a Hindi-English machine translation system. We implement a modified version of the existing TGBI metric based on the grammatical considerations for Hindi. We also compare and contrast the resulting bias measurements across multiple metrics for pre-trained embeddings and the ones learned by our machine translation model.
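One common formulation of TGBI (following Cho et al., 2019) scores each set of gender-neutral source sentences by how their translations distribute over female, male, and neutral forms. The sketch below illustrates that base computation with made-up labels; it does not include the paper's Hindi-specific grammatical modifications.

```python
# TGBI-style score: for a set of gender-neutral source sentences, p_f, p_m, p_n
# are the fractions of translations rendered as female, male, or gender-neutral.
# Predicted labels below are illustrative only.
from math import sqrt

def set_score(pred_genders: list[str]) -> float:
    n = len(pred_genders)
    p_f = pred_genders.count("female") / n
    p_m = pred_genders.count("male") / n
    p_n = pred_genders.count("neutral") / n
    return sqrt(p_f * p_m) + p_n  # 1.0 = balanced or neutral, lower = more skewed

sentence_sets = {
    "occupations": ["male", "male", "female", "neutral"],
    "emotions":    ["female", "female", "male", "male"],
}
scores = {name: set_score(preds) for name, preds in sentence_sets.items()}
print(scores, "TGBI =", sum(scores.values()) / len(scores))
```

The overall TGBI is the mean of the per-set scores, so a system that skews heavily toward one gender in any sentence set lowers the aggregate value.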