AITopics | Lutz, Roman

Collaborating Authors

Lutz, Roman

Lessons From Red Teaming 100 Generative AI Products

Bullwinkel, Blake; Minnich, Amanda; Chawla, Shiven; Lopez, Gary; Pouliot, Martin; Maxwell, Whitney; de Gruyter, Joris; Pratt, Katherine; Qi, Saphir; Chikanov, Nina; Lutz, Roman; Dheekonda, Raja Sekhar Rao; Jagdagdorj, Bolor-Erdene; Kim, Eugenia; Song, Justin; Hines, Keegan; Jones, Daniel; Severi, Giorgio; Lundeen, Richard; Vaughan, Sam; Westerhoff, Victoria; Bryan, Pete; Kumar, Ram Shankar Siva; Zunger, Yonatan; Kawaguchi, Chang; Russinovich, Mark

arXiv.org Artificial Intelligence, Jan-13-2025

In recent years, AI red teaming has emerged as a practice for probing the safety and security of generative AI systems. Due to the nascency of the field, there are many open questions about how red teaming operations should be conducted. Based on our experience red teaming over 100 generative AI products at Microsoft, we present our internal threat model ontology and eight main lessons we have learned:

1. Understand what the system can do and where it is applied
2. You don't have to compute gradients to break an AI system
3. AI red teaming is not safety benchmarking
4. Automation can help cover more of the risk landscape
5. The human element of AI red teaming is crucial
6. Responsible AI harms are pervasive but difficult to measure
7. LLMs amplify existing security risks and introduce new ones
8. The work of securing AI systems will never be complete

By sharing these insights alongside case studies from our operations, we offer practical recommendations aimed at aligning red teaming efforts with real world risks. We also highlight aspects of AI red teaming that we believe are often misunderstood and discuss open questions for the field to consider.

artificial intelligence, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2501.07238

Country:

North America > United States (0.28)
Asia > Middle East > UAE (0.14)

Genre:

Overview (0.68)
Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government (0.69)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.81)

PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System

Munoz, Gary D. Lopez; Minnich, Amanda J.; Lutz, Roman; Lundeen, Richard; Dheekonda, Raja Sekhar Rao; Chikanov, Nina; Jagdagdorj, Bolor-Erdene; Pouliot, Martin; Chawla, Shiven; Maxwell, Whitney; Bullwinkel, Blake; Pratt, Katherine; de Gruyter, Joris; Siska, Charlotte; Bryan, Pete; Westerhoff, Tori; Kawaguchi, Chang; Seifert, Christian; Kumar, Ram Shankar Siva; Zunger, Yonatan

arXiv.org Artificial Intelligence, Oct-1-2024

Generative Artificial Intelligence (GenAI) is becoming ubiquitous in our daily lives. The increase in computational power and data availability has led to a proliferation of both single- and multi-modal models. As the GenAI ecosystem matures, the need for extensible and model-agnostic risk identification frameworks is growing. To meet this need, we introduce the Python Risk Identification Toolkit (PyRIT), an open-source framework designed to enhance red teaming efforts in GenAI systems. PyRIT is a model- and platform-agnostic tool that enables red teamers to probe for and identify novel harms, risks, and jailbreaks in multimodal generative AI models. Its composable architecture facilitates the reuse of core building blocks and allows for extensibility to future models and modalities. This paper details the challenges specific to red teaming generative AI systems, the development and features of PyRIT, and its practical applications in real-world scenarios.

machine learning, natural language, password, (20 more...)

arXiv.org Artificial Intelligence

2410.02828

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

Magooda, Ahmed; Helyar, Alec; Jackson, Kyle; Sullivan, David; Atalla, Chad; Sheng, Emily; Vann, Dan; Edgar, Richard; Palangi, Hamid; Lutz, Roman; Kong, Hongliang; Yun, Vincent; Kamal, Eslam; Zarfati, Federico; Wallach, Hanna; Bird, Sarah; Chen, Mei

arXiv.org Artificial Intelligence, Oct-26-2023

We present a framework for the automated measurement of responsible AI (RAI) metrics for large language models (LLMs) and associated products and services. Our framework for automatically measuring harms from LLMs builds on existing technical and sociotechnical expertise and leverages the capabilities of state-of-the-art LLMs, such as GPT-4. We use this framework to run through several case studies investigating how different LLMs may violate a range of RAI-related principles. The framework may be employed alongside domain-specific sociotechnical expertise to create measurements for new harm areas in the future. By implementing this framework, we aim to enable more advanced harm measurement efforts and further the responsible use of LLMs.

large language model, machine learning, natural language, (6 more...)

arXiv.org Artificial Intelligence

2310.17750

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

Fairlearn: Assessing and Improving Fairness of AI Systems

Weerts, Hilde; Dudík, Miroslav; Edgar, Richard; Jalali, Adrin; Lutz, Roman; Madaio, Michael

arXiv.org Artificial Intelligence, Mar-29-2023

Fairlearn is an open source project to help practitioners assess and improve fairness of artificial intelligence (AI) systems. The associated Python library, also named fairlearn, supports evaluation of a model's output across affected populations and includes several algorithms for mitigating fairness issues. Grounded in the understanding that fairness is a sociotechnical challenge, the project integrates learning resources that aid practitioners in considering a system's broader societal context.
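
The idea of evaluating a model's output across affected populations can be illustrated with a minimal sketch. This is plain Python and not Fairlearn's API (the library provides its own tooling for this, such as `fairlearn.metrics.MetricFrame`, which is not used here). The function names `accuracy_by_group` and `max_gap`, and the toy labels, predictions, and group values, are assumptions made up for illustration.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy}, computed separately for each group value."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {g: correct[g] / total[g] for g in total}


def max_gap(per_group):
    """Largest difference in the metric between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)


# Hypothetical toy data: true labels, model predictions, and a
# sensitive-feature value ("a" or "b") for each example.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group)           # accuracy for each group
print(max_gap(per_group))  # disparity between best and worst group
```

A single overall accuracy would hide the difference between the two groups here; reporting the metric per group, plus the gap between groups, is what makes the disparity visible.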

artificial intelligence, fairness, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2303.16626

Country:

Europe (0.46)
North America > United States (0.14)

Genre: Research Report (0.41)

Industry:

Law (0.69)
Government > Regional Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
