Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems
Dalrymple, David "davidad", Skalse, Joar, Bengio, Yoshua, Russell, Stuart, Tegmark, Max, Seshia, Sanjit, Omohundro, Steve, Szegedy, Christian, Goldhaber, Ben, Ammann, Nora, Abate, Alessandro, Halpern, Joe, Barrett, Clark, Zhao, Ding, Zhi-Xuan, Tan, Wing, Jeannette, Tenenbaum, Joshua
We introduce and define a family of approaches to AI safety, collectively referred to as guaranteed safe (GS) AI. These approaches aim to provide high-assurance quantitative guarantees about the safety of an AI system's behaviour through the use of three core components: a formal safety specification, a world model, and a verifier. We will argue that this strategy is both promising and underexplored, and contrast it with other ongoing efforts in AI safety. We will also outline several ongoing avenues of research within the broader GS research agenda, identify some of their core difficulties, and discuss approaches for overcoming these difficulties. Central examples of agendas which fall under the GS AI family include Szegedy (2020); Wing (2021); Seshia et al. (2022); Russell (2022); Tegmark & Omohundro (2023); 'davidad' Dalrymple (2024); Bengio (2024).
Ensuring that AI systems reliably and robustly avoid harmful or dangerous behaviours is a crucial challenge, especially for AI systems with a high degree of autonomy and general intelligence, or systems used in safety-critical contexts. In this position paper, we will introduce and define a family of approaches to AI safety, which we will refer to as guaranteed safe (GS) AI. The core feature of these approaches is that they aim to produce AI systems which are equipped with high-assurance quantitative safety guarantees. This is achieved by the interplay of three core components: a world model (which provides a mathematical
Toward Verified Artificial Intelligence
Techniques for automatically generating abstractions of systems have been the linchpins of formal methods, playing crucial roles in extending the reach of formal methods to large hardware and software systems. To address the challenges of very high-dimensional hybrid-state spaces and input spaces for ML-based systems, we need to develop effective techniques to abstract ML models into simpler models that are more amenable to formal analysis. Some promising directions include using abstract interpretation to analyze DNNs (for example, Gehr et al. [12]), developing abstractions for falsifying cyber-physical systems with ML components [5], and devising novel representations for verification (for instance, star sets and other examples [36]).
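To make the idea of abstracting a DNN concrete, the following is a minimal sketch of abstract interpretation over the simple interval (box) domain. It is not the method of Gehr et al., which the snippet above only cites; the function names, the tiny two-layer network, and the input box are all hypothetical and chosen for illustration. The sketch propagates lower and upper bounds through affine layers and ReLU activations, yielding a sound (possibly loose) over-approximation of the network's outputs.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]          # (lower bound, upper bound)
Layer = Tuple[Sequence[Sequence[float]], Sequence[float]]  # (weights, bias)


def affine_bounds(weights, bias, box: Sequence[Interval]) -> List[Interval]:
    """Sound interval bounds for y = W x + b when each x_i lies in box[i]."""
    out = []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, (l, u) in zip(row, box):
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                # A negative weight flips which endpoint gives the extreme.
                lo += w * u
                hi += w * l
        out.append((lo, hi))
    return out


def relu_bounds(box: Sequence[Interval]) -> List[Interval]:
    """ReLU is monotone, so it can be applied to each endpoint."""
    return [(max(0.0, l), max(0.0, u)) for l, u in box]


def network_bounds(layers: Sequence[Layer], box: Sequence[Interval]) -> List[Interval]:
    """Propagate an input box through affine layers with ReLU in between."""
    for i, (weights, bias) in enumerate(layers):
        box = affine_bounds(weights, bias, box)
        if i < len(layers) - 1:          # no activation after the last layer
            box = relu_bounds(box)
    return box


# Hypothetical network: 2 inputs -> 2 hidden ReLU units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.5]),
    ([[1.0, 1.0]], [0.0]),
]
input_box = [(0.0, 1.0), (0.0, 1.0)]

output_box = network_bounds(layers, input_box)
print(output_box)                        # [(0.0, 1.5)]

# Example property: the output never exceeds 2 on the whole input box.
property_holds = output_box[0][1] <= 2.0
```

Because the bounds over-approximate the reachable outputs, a property that holds on the output box holds for every concrete input in the input box; if the check fails, the result is inconclusive and a more precise abstraction would be needed.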
Explorations in Cyber-Physical Systems Education
The field of CPS draws from several areas in computer science, electrical engineering, and other engineering disciplines, including computer architecture, embedded systems, programming languages, software engineering, real-time systems, operating systems and networking, formal methods, algorithms, computation theory, control theory, signal processing, robotics, sensors and actuators, and computer security. Similarly, over the past 14 years, we have had students from computer science, electrical and computer engineering, mechanical engineering, civil engineering, and even bioengineering. Integrating this bewildering diversity of subject areas into a coherent whole for students with such a wide breadth of backgrounds has been a challenge we had to overcome. One approach would have been to not attempt such an integration. Instead, we could have opted for a collection of courses that together cover all the key areas in CPS.
Towards Verified Artificial Intelligence
Seshia, Sanjit A., Sadigh, Dorsa, Sastry, S. Shankar
Artificial intelligence (AI) is a term used for computational systems that attempt to mimic aspects of human intelligence (e.g., see [17]). Russell and Norvig [56] describe AI as the study of general principles of rational agents and components for constructing these agents. More broadly, the field of AI involves building intelligent entities that mimic 'cognitive' functions we intuitively associate with human minds, such as 'learning' and 'problem solving.' We interpret the term AI broadly to include closely-related areas such as machine learning [43]. Systems that heavily use AI, henceforth referred to as AI-based systems, have had a significant impact in society in domains that include healthcare, transportation, social networking, e-commerce, education, etc.