AITopics | Miehling, Erik

Collaborating Authors

Miehling, Erik

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Agentic AI Needs a Systems Theory

Miehling, Erik, Ramamurthy, Karthikeyan Natesan, Varshney, Kush R., Riemer, Matthew, Bouneffouf, Djallel, Richards, John T., Dhurandhar, Amit, Daly, Elizabeth M., Hind, Michael, Sattigeri, Prasanna, Wei, Dennis, Rawat, Ambrish, Gajcin, Jasmina, Geyer, Werner

arXiv.org Artificial IntelligenceFeb-28-2025

The endowment of AI with reasoning capabilities and some degree of agency is widely viewed as a path toward more capable and generalizable systems. Our position is that the current development of agentic AI requires a more holistic, systems-theoretic perspective in order to fully understand their capabilities and mitigate any emergent risks. The primary motivation for our position is that AI development is currently overly focused on individual model capabilities, often ignoring broader emergent behavior, leading to a significant underestimation in the true capabilities and associated risks of agentic AI. We describe some fundamental mechanisms by which advanced capabilities can emerge from (comparably simpler) agents simply due to their interaction with the environment and other agents. Informed by an extensive amount of existing literature from various fields, we outline mechanisms for enhanced agent cognition, emergent causal reasoning ability, and metacognitive awareness. We conclude by presenting some key open challenges and guidance for the development of agentic AI. We emphasize that a systems-level perspective is essential for better understanding, and purposefully shaping, agentic AI systems.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.00237

Country:

North America > United States > California (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

Add feedback

Granite Guardian

Padhi, Inkit, Nagireddy, Manish, Cornacchia, Giandomenico, Chaudhury, Subhajit, Pedapati, Tejaswini, Dognin, Pierre, Murugesan, Keerthiram, Miehling, Erik, Cooper, Martín Santillán, Fraser, Kieran, Zizzo, Giulio, Hameed, Muhammad Zaid, Purcell, Mark, Desmond, Michael, Pan, Qian, Ashktorab, Zahra, Vejsbjerg, Inge, Daly, Elizabeth M., Hind, Michael, Geyer, Werner, Rawat, Ambrish, Varshney, Kush R., Sattigeri, Prasanna

arXiv.org Artificial IntelligenceDec-16-2024

We introduce the Granite Guardian models, a suite of safeguards designed to provide risk detection for prompts and responses, enabling safe and responsible use in combination with any large language model (LLM). These models offer comprehensive coverage across multiple risk dimensions, including social bias, profanity, violence, sexual content, unethical behavior, jailbreaking, and hallucination-related risks such as context relevance, groundedness, and answer relevance for retrieval-augmented generation (RAG). Trained on a unique dataset combining human annotations from diverse sources and synthetic data, Granite Guardian models address risks typically overlooked by traditional risk detection models, such as jailbreaks and RAG-specific issues. With AUC scores of 0.871 and 0.854 on harmful content and RAG-hallucination-related benchmarks respectively, Granite Guardian is the most generalizable and competitive model available in the space. Released as open-source, Granite Guardian aims to promote responsible AI development across the community.

granite guardian, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2412.07724

Country:

Europe (0.93)
North America > United States > Minnesota (0.28)

Genre: Research Report (0.82)

Industry:

Media (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evaluating the Prompt Steerability of Large Language Models

Miehling, Erik, Desmond, Michael, Ramamurthy, Karthikeyan Natesan, Daly, Elizabeth M., Dognin, Pierre, Rios, Jesus, Bouneffouf, Djallel, Liu, Miao

arXiv.org Artificial IntelligenceNov-19-2024

Building pluralistic AI requires designing models that are able to be shaped to represent a wide range of value systems and cultures. Achieving this requires first being able to evaluate the degree to which a given model is capable of reflecting various personas. To this end, we propose a benchmark for evaluating the steerability of model personas as a function of prompting. Our design is based on a formal definition of prompt steerability, which analyzes the degree to which a model's joint behavioral distribution can be shifted from its baseline behavior. By defining steerability indices and inspecting how these indices change as a function of steering effort, we can estimate the steerability of a model across various persona dimensions and directions. Our benchmark reveals that the steerability of many current models is limited -- due to both a skew in their baseline behavior and an asymmetry in their steerability across many persona dimensions. We release an implementation of our benchmark at https://github.com/IBM/prompt-steering.

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2411.12405

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Attack Atlas: A Practitioner's Perspective on Challenges and Pitfalls in Red Teaming GenAI

Rawat, Ambrish, Schoepf, Stefan, Zizzo, Giulio, Cornacchia, Giandomenico, Hameed, Muhammad Zaid, Fraser, Kieran, Miehling, Erik, Buesser, Beat, Daly, Elizabeth M., Purcell, Mark, Sattigeri, Prasanna, Chen, Pin-Yu, Varshney, Kush R.

arXiv.org Artificial IntelligenceSep-23-2024

As generative AI, particularly large language models (LLMs), become increasingly integrated into production applications, new attack surfaces and vulnerabilities emerge and put a focus on adversarial threats in natural language and multi-modal systems. Red-teaming has gained importance in proactively identifying weaknesses in these systems, while blue-teaming works to protect against such adversarial attacks. Despite growing academic interest in adversarial risks for generative AI, there is limited guidance tailored for practitioners to assess and mitigate these challenges in real-world environments. To address this, our contributions include: (1) a practical examination of red- and blue-teaming strategies for securing generative AI, (2) identification of key challenges and open questions in defense development and evaluation, and (3) the Attack Atlas, an intuitive framework that brings a practical approach to analyzing single-turn input attacks, placing it at the forefront for practitioners. This work aims to bridge the gap between academic insights and practical security measures for the protection of generative AI systems.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2409.15398

Country: North America > Mexico > Mexico City (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.86)

Add feedback

Language Models in Dialogue: Conversational Maxims for Human-AI Interactions

Miehling, Erik, Nagireddy, Manish, Sattigeri, Prasanna, Daly, Elizabeth M., Piorkowski, David, Richards, John T.

arXiv.org Artificial IntelligenceJun-22-2024

Modern language models, while sophisticated, exhibit some inherent shortcomings, particularly in conversational settings. We claim that many of the observed shortcomings can be attributed to violation of one or more conversational principles. By drawing upon extensive research from both the social science and AI communities, we propose a set of maxims -- quantity, quality, relevance, manner, benevolence, and transparency -- for describing effective human-AI conversation. We first justify the applicability of the first four maxims (from Grice) in the context of human-AI interactions. We then argue that two new maxims, benevolence (concerning the generation of, and engagement with, harmful content) and transparency (concerning recognition of one's knowledge boundaries, operational constraints, and intents), are necessary for addressing behavior unique to modern human-AI interactions. We evaluate the degree to which various language models are able to understand these maxims and find that models possess an internal prioritization of principles that can significantly impact their ability to interpret the maxims accurately.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2403.15115

Country: Europe > Germany (0.14)

Genre:

Research Report (1.00)
Personal > Interview (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CELL your Model: Contrastive Explanation Methods for Large Language Models

Luss, Ronny, Miehling, Erik, Dhurandhar, Amit

arXiv.org Artificial IntelligenceJun-17-2024

The advent of black-box deep neural network classification models has sparked the need to explain their decisions. However, in the case of generative AI such as large language models (LLMs), there is no class prediction to explain. Rather, one can ask why an LLM output a particular response to a given prompt. In this paper, we answer this question by proposing, to the best of our knowledge, the first contrastive explanation methods requiring simply black-box/query access. Our explanations suggest that an LLM outputs a reply to a given prompt because if the prompt was slightly modified, the LLM would have given a different response that is either less preferable or contradicts the original response. The key insight is that contrastive explanations simply require a distance function that has meaning to the user and not necessarily a real valued representation of a specific response (viz. class label). We offer two algorithms for finding contrastive explanations: i) A myopic algorithm, which although effective in creating contrasts, requires many model calls and ii) A budgeted algorithm, our main algorithmic contribution, which intelligently creates contrasts adhering to a query budget, necessary for longer contexts. We show the efficacy of these methods on diverse natural language tasks such as open-text generation, automated red teaming, and explaining conversational degradation.

explanation, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2406.11785

Country: North America > United States (0.28)

Genre: Research Report (0.69)

Industry:

Law (1.00)
Health & Medicine (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Add feedback

Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

Achintalwar, Swapnaja, Garcia, Adriana Alvarado, Anaby-Tavor, Ateret, Baldini, Ioana, Berger, Sara E., Bhattacharjee, Bishwaranjan, Bouneffouf, Djallel, Chaudhury, Subhajit, Chen, Pin-Yu, Chiazor, Lamogha, Daly, Elizabeth M., DB, Kirushikesh, de Paula, Rogério Abreu, Dognin, Pierre, Farchi, Eitan, Ghosh, Soumya, Hind, Michael, Horesh, Raya, Kour, George, Lee, Ja Young, Madaan, Nishtha, Mehta, Sameep, Miehling, Erik, Murugesan, Keerthiram, Nagireddy, Manish, Padhi, Inkit, Piorkowski, David, Rawat, Ambrish, Raz, Orna, Sattigeri, Prasanna, Strobelt, Hendrik, Swaminathan, Sarathkrishna, Tillmann, Christoph, Trivedi, Aashka, Varshney, Kush R., Wei, Dennis, Witherspooon, Shalisha, Zalmanovici, Marcel

arXiv.org Artificial IntelligenceJun-13-2024

Large language models (LLMs) are susceptible to a variety of risks, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model. Therefore, an efficient and reliable alternative is required. To this end, we present our ongoing efforts to create and deploy a library of detectors: compact and easy-to-build classification models that provide labels for various harms. In addition to the detectors themselves, we discuss a wide range of uses for these detector models - from acting as guardrails to enabling effective AI governance. We also deep dive into inherent challenges in their development and discuss future work aimed at making the detectors more reliable and broadening their scope.

detector, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2403.06009

Country:

Asia (1.00)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(2 more...)

Genre:

Overview (0.46)
Research Report (0.40)

Industry:

Information Technology (0.69)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Online Planning for Decentralized Stochastic Control with Partial History Sharing

Zhang, Kaiqing, Miehling, Erik, Başar, Tamer

arXiv.org Artificial IntelligenceAug-6-2019

Computational challenges are further compounded if agents do not possess complete model knowledge. In this paper, we take advantage of the fact that in many problems agents share some common information, or history, termed partial history sharing . Under this information structure the policy search space is greatly reduced. We propose a provably convergent, online tree-search based algorithm that does not require a closed-form model or explicit communication among agents. Interestingly, our algorithm can be viewed as a generalization of several existing heuristic solvers for decentralized partially observable Markov decision processes. T o demonstrate the applicability of the model, we propose a novel collaborative intrusion response model, where multiple agents (defenders) possessing asymmetric information aim to collaboratively defend a computer network. Numerical results demonstrate the performance of our algorithm.

artificial intelligence, information, machine learning, (15 more...)

arXiv.org Artificial Intelligence

1908.02357

Country: North America > United States (0.14)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback