Bavalatti, Trupti
A Systematic Review of Open Datasets Used in Text-to-Image (T2I) Gen AI Model Safety
Rouf, Rakeen, Bavalatti, Trupti, Ahmed, Osama, Potdar, Dhaval, Jawed, Faraz
This work is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). For the definitive version, see DOI 10.1109/ACCESS.2025.3539933.
Disclaimer: This research involves topics that may include disturbing results. Any explicit content has been redacted, and potentially disturbing results are presented in a neutral and anonymized manner to minimize emotional distress to readers.
Abstract: Novel research aimed at text-to-image (T2I) generative AI safety often relies on publicly available datasets for training and evaluation, making the quality and composition of these datasets crucial. This paper presents a comprehensive review of the key datasets used in T2I safety research, detailing their collection methods, compositions, the semantic and syntactic diversity of their prompts, and the quality, coverage, and distribution of harm types they contain. By highlighting the strengths and limitations of the datasets, this study enables researchers to find the most ...
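The abstract above mentions measuring the semantic and syntactic diversity of prompts across datasets. As a rough illustration of the syntactic side only (not taken from the paper), the sketch below computes a distinct-n ratio over a placeholder prompt set; the metric choice, function name, and sample prompts are assumptions for illustration.

```python
# Minimal sketch (not from the paper): one way to quantify the syntactic
# diversity of a prompt set, via the distinct-n ratio
# (unique n-grams / total n-grams). Prompts below are neutral placeholders.

from typing import List


def distinct_n(prompts: List[str], n: int = 2) -> float:
    """Return the ratio of unique to total n-grams across all prompts."""
    all_ngrams = []
    for prompt in prompts:
        tokens = prompt.lower().split()
        all_ngrams.extend(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    if not all_ngrams:
        return 0.0
    return len(set(all_ngrams)) / len(all_ngrams)


if __name__ == "__main__":
    # Placeholder prompts standing in for entries from a T2I safety dataset.
    sample_prompts = [
        "a photo of a city street at night",
        "a photo of a city street at dawn",
        "an oil painting of a quiet harbor",
    ]
    print(f"distinct-2 ratio: {distinct_n(sample_prompts, n=2):.3f}")
```

A higher ratio indicates less repetitive prompt phrasing; a fuller analysis would pair such a syntactic measure with an embedding-based semantic one.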
Class-RAG: Real-Time Content Moderation with Retrieval Augmented Generation
Chen, Jianfa, Shen, Emily, Bavalatti, Trupti, Lin, Xiaowen, Wang, Yongkai, Hu, Shuming, Subramanyam, Harihar, Vepuri, Ksheeraj Sai, Jiang, Ming, Qi, Ji, Chen, Li, Jiang, Nan, Jain, Ankit
Recent advances in Generative AI technology have enabled new generations of product applications, such as text generation OpenAI (2023); Anthropic (2023); Dubey (2024), text-to-image generation Ramesh et al. (2021); Dai et al. (2023); Rombach et al. (2022), and text-to-video generation Meta (2024). Consequently, the pace of model development must be matched by the development of safety systems that are properly equipped to mitigate novel harms, ensuring the system's overall integrity and preventing Generative AI products from being exploited by bad actors to disseminate misinformation, glorify violence, and proliferate sexual content Foundation (2023). To achieve this goal, traditional model fine-tuning approaches are often employed, with classifiers that learn patterns from labeled content moderation text data used as guardrails OpenAI (2023).

However, there are many challenges associated with automating content moderation through fine-tuning. First, content moderation is a highly subjective task, meaning that inter-annotator agreement in labeled data is low due to differing interpretations of policy guidelines, especially on borderline cases Markov et al. (2023). Second, it is impossible to enforce a universal taxonomy of harm, not only because of the subjectivity of the task, but also because systems scale to new locales, new audiences, and new use cases, each with different guidelines and different gradients of harm defined on those guidelines Shen et al. (2024). Third, the fine-tuning development cycle, which encompasses data collection, annotation, and model experimentation, is not ideally suited to the content moderation domain, where mitigations must land as quickly as possible once vulnerabilities are identified.

To address these challenges of subjectivity and of inflexibility at scale, we propose a Classification approach to content moderation that employs Retrieval-Augmented Generation (Class-RAG) to add context that elicits reasoning for content classification. While RAG Lewis et al. (2020) is often used for knowledge-intensive tasks where factual citation is key, we find that a RAG-based solution offers a distinct value proposition for the classification task of content moderation, not only because it enhances accuracy through few-shot learning, but because it enables real-time knowledge updates, which is critical in our domain for ...
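The passage above argues that retrieval-augmented classification lets moderation behavior be updated in real time by editing a retrieval index rather than retraining a classifier. The sketch below is a minimal illustration of that general pattern, not the Class-RAG implementation; the embedder, prompt format, labels, and example texts are all placeholders assumed for the sake of the demo.

```python
# Minimal sketch (not the authors' system) of retrieval-augmented
# classification for content moderation: retrieve the nearest labeled
# examples from an index and include them as few-shot context for an LLM.
# Updating the index changes behavior without fine-tuning.

from dataclasses import dataclass
from typing import Callable, List

import numpy as np


@dataclass
class LabeledExample:
    text: str
    label: str  # e.g. "safe" or "unsafe" under the current policy


def toy_embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in embedder (bag of hashed tokens); a real system would use a
    learned text encoder."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec


class RetrievalIndex:
    """In-memory nearest-neighbour index over policy-labeled examples."""

    def __init__(self, embed: Callable[[str], np.ndarray]):
        self.embed = embed
        self.examples: List[LabeledExample] = []
        self.vectors: List[np.ndarray] = []

    def add(self, example: LabeledExample) -> None:
        # Adding or removing examples here is the "real-time update" step.
        self.examples.append(example)
        self.vectors.append(self.embed(example.text))

    def retrieve(self, query: str, k: int = 4) -> List[LabeledExample]:
        q = self.embed(query)
        sims = [
            float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v) + 1e-9))
            for v in self.vectors
        ]
        top = np.argsort(sims)[::-1][:k]
        return [self.examples[i] for i in top]


def build_classification_prompt(query: str, neighbours: List[LabeledExample]) -> str:
    """Assemble a few-shot prompt asking an LLM to label the query using the
    retrieved, policy-labeled examples as context."""
    shots = "\n".join(f"Text: {ex.text}\nLabel: {ex.label}" for ex in neighbours)
    return f"{shots}\nText: {query}\nLabel:"


if __name__ == "__main__":
    index = RetrievalIndex(toy_embed)
    # Placeholder examples standing in for a curated, policy-labeled library.
    index.add(LabeledExample("a friendly greeting message", "safe"))
    index.add(LabeledExample("instructions for a prohibited activity", "unsafe"))
    query = "another friendly greeting"
    prompt = build_classification_prompt(query, index.retrieve(query, k=2))
    print(prompt)  # this prompt would be sent to an LLM for the final label
```

The design point being illustrated is the one stated in the excerpt: because the labeled library sits outside the model, policy changes take effect as soon as the index is edited.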
Introducing v0.5 of the AI Safety Benchmark from MLCommons
Vidgen, Bertie, Agrawal, Adarsh, Ahmed, Ahmed M., Akinwande, Victor, Al-Nuaimi, Namir, Alfaraj, Najla, Alhajjar, Elie, Aroyo, Lora, Bavalatti, Trupti, Bartolo, Max, Blili-Hamelin, Borhane, Bollacker, Kurt, Bomassani, Rishi, Boston, Marisa Ferrara, Campos, Siméon, Chakra, Kal, Chen, Canyu, Coleman, Cody, Coudert, Zacharie Delpierre, Derczynski, Leon, Dutta, Debojyoti, Eisenberg, Ian, Ezick, James, Frase, Heather, Fuller, Brian, Gandikota, Ram, Gangavarapu, Agasthya, Gangavarapu, Ananya, Gealy, James, Ghosh, Rajat, Goel, James, Gohar, Usman, Goswami, Sujata, Hale, Scott A., Hutiri, Wiebke, Imperial, Joseph Marvin, Jandial, Surgan, Judd, Nick, Juefei-Xu, Felix, Khomh, Foutse, Kailkhura, Bhavya, Kirk, Hannah Rose, Klyman, Kevin, Knotz, Chris, Kuchnik, Michael, Kumar, Shachi H., Kumar, Srijan, Lengerich, Chris, Li, Bo, Liao, Zeyi, Long, Eileen Peters, Lu, Victor, Luger, Sarah, Mai, Yifan, Mammen, Priyanka Mary, Manyeki, Kelvin, McGregor, Sean, Mehta, Virendra, Mohammed, Shafee, Moss, Emanuel, Nachman, Lama, Naganna, Dinesh Jinenhally, Nikanjam, Amin, Nushi, Besmira, Oala, Luis, Orr, Iftach, Parrish, Alicia, Patlak, Cigdem, Pietri, William, Poursabzi-Sangdeh, Forough, Presani, Eleonora, Puletti, Fabrizio, Röttger, Paul, Sahay, Saurav, Santos, Tim, Scherrer, Nino, Sebag, Alice Schoenauer, Schramowski, Patrick, Shahbazi, Abolfazl, Sharma, Vin, Shen, Xudong, Sistla, Vamsi, Tang, Leonard, Testuggine, Davide, Thangarasa, Vithursan, Watkins, Elizabeth Anne, Weiss, Rebecca, Welty, Chris, Wilbers, Tyler, Williams, Adina, Wu, Carole-Jean, Yadav, Poonam, Yang, Xianjun, Zeng, Yi, Zhang, Wenhui, Zhdanov, Fedor, Zhu, Jiacheng, Liang, Percy, Mattson, Peter, Vanschoren, Joaquin
This paper introduces v0.5 of the AI Safety Benchmark, which has been created by the MLCommons AI Safety Working Group. The AI Safety Benchmark has been designed to assess the safety risks of AI systems that use chat-tuned language models. We introduce a principled approach to specifying and constructing the benchmark, which for v0.5 covers only a single use case (an adult chatting to a general-purpose assistant in English) and a limited set of personas (i.e., typical users, malicious users, and vulnerable users). We created a new taxonomy of 13 hazard categories, of which seven have tests in the v0.5 benchmark. We plan to release version 1.0 of the AI Safety Benchmark by the end of 2024. The v1.0 benchmark will provide meaningful insights into the safety of AI systems. However, the v0.5 benchmark should not be used to assess the safety of AI systems. We have sought to fully document the limitations, flaws, and challenges of v0.5. This release of v0.5 of the AI Safety Benchmark includes (1) a principled approach to specifying and constructing the benchmark, which comprises use cases, types of systems under test (SUTs), language and context, personas, tests, and test items; (2) a taxonomy of 13 hazard categories with definitions and subcategories; (3) tests for seven of the hazard categories, each comprising a unique set of test items, i.e., prompts (there are 43,090 test items in total, which we created with templates); (4) a grading system for AI systems against the benchmark; (5) an openly available platform and downloadable tool, called ModelBench, that can be used to evaluate the safety of AI systems on the benchmark; (6) an example evaluation report which benchmarks the performance of over a dozen openly available chat-tuned language models; and (7) a test specification for the benchmark.
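The abstract notes that the 43,090 v0.5 test items were created with templates. As a hedged illustration of what template-based test item generation can look like in general, the sketch below crosses sentence templates with slot fragments; the function name, slot names, templates, and fragments are neutral placeholders assumed here, not the benchmark's actual prompts or code.

```python
# Minimal sketch (not MLCommons code) of template-based test item generation:
# prompts are produced by filling persona- and hazard-specific fragments into
# sentence templates. All strings below are neutral placeholders.

from itertools import product
from typing import Dict, List


def generate_test_items(
    templates: List[str],
    fragments: Dict[str, List[str]],
) -> List[Dict[str, str]]:
    """Cross every template with every combination of fragment values."""
    keys = sorted(fragments)
    items = []
    for template in templates:
        for values in product(*(fragments[k] for k in keys)):
            slots = dict(zip(keys, values))
            items.append({"prompt": template.format(**slots), **slots})
    return items


if __name__ == "__main__":
    # Placeholder slots; a real benchmark would use carefully reviewed,
    # hazard-specific phrasing and per-item metadata.
    templates = ["As a {persona}, tell me about {topic}."]
    fragments = {
        "persona": ["typical user", "vulnerable user"],
        "topic": ["<HAZARD-SPECIFIC TOPIC>"],
    }
    for item in generate_test_items(templates, fragments):
        print(item["prompt"])
```

Keeping the slot values alongside each generated prompt, as above, also makes it straightforward to grade systems per hazard category and per persona afterwards.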