Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
Dalrymple, David "davidad", Skalse, Joar, Bengio, Yoshua, Russell, Stuart, Tegmark, Max, Seshia, Sanjit, Omohundro, Steve, Szegedy, Christian, Goldhaber, Ben, Ammann, Nora, Abate, Alessandro, Halpern, Joe, Barrett, Clark, Zhao, Ding, Zhi-Xuan, Tan, Wing, Jeannette, Tenenbaum, Joshua
arXiv.org Artificial Intelligence
We introduce and define a family of approaches to AI safety, collectively referred to as guaranteed safe (GS) AI. These approaches aim to provide high-assurance quantitative guarantees about the safety of an AI system's behaviour through the use of three core components: a formal safety specification, a world model, and a verifier. We will argue that this strategy is both promising and underexplored, and contrast it with other ongoing efforts in AI safety. We will also outline several ongoing avenues of research within the broader GS research agenda, identify some of their core difficulties, and discuss approaches for overcoming these difficulties. Central examples of agendas which fall under the GS AI family include Szegedy (2020); Wing (2021); Seshia et al. (2022); Russell (2022); Tegmark & Omohundro (2023); 'davidad' Dalrymple (2024); Bengio (2024).

Ensuring that AI systems reliably and robustly avoid harmful or dangerous behaviours is a crucial challenge, especially for AI systems with a high degree of autonomy and general intelligence, or systems used in safety-critical contexts. In this position paper, we will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI. The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees. This is achieved by the interplay of three core components: a world model (which provides a mathematical ...
Jul-8-2024
- Country:
  - North America
    - United States
      - Massachusetts (0.04)
      - District of Columbia > Washington (0.04)
      - Wisconsin > Dane County
        - Madison (0.04)
      - Texas > Travis County
        - Austin (0.04)
      - New York > New York County
        - New York City (0.04)
      - California
        - Santa Clara County > Stanford (0.04)
        - Los Angeles County > Pasadena (0.04)
    - Canada > Quebec
      - Montreal (0.04)
  - Europe
    - Monaco (0.04)
    - Italy (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
      - Oxfordshire > Oxford (0.04)
  - Asia > Middle East
    - Israel > Haifa District > Haifa (0.04)
- Genre:
  - Overview (1.00)
  - Research Report (0.82)
- Industry:
  - Transportation (1.00)
  - Information Technology > Security & Privacy (1.00)
  - Health & Medicine (1.00)
  - Government (1.00)
  - Education (0.92)
  - Energy > Power Industry (0.67)
- Technology:
  - Information Technology > Artificial Intelligence
    - Natural Language > Large Language Model (0.93)
    - Robots > Autonomous Vehicles (0.93)
    - Issues > Social & Ethical Issues (0.92)
    - Cognitive Science > Problem Solving (0.72)
    - Representation & Reasoning
      - Uncertainty (1.00)
      - Logic & Formal Reasoning (1.00)
    - Machine Learning
      - Neural Networks > Deep Learning (1.00)
      - Learning Graphical Models (0.67)