AITopics | Barrett, Clark

Plotting

Barrett, Clark

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Proof-Driven Clause Learning in Neural Network Verification

Isac, Omri, Refaeli, Idan, Wu, Haoze, Barrett, Clark, Katz, Guy

arXiv.org Artificial IntelligenceMar-15-2025

The widespread adoption of deep neural networks (DNNs) requires efficient techniques for safety verification. Existing methods struggle to scale to real-world DNNs, and tremendous efforts are being put into improving their scalability. In this work, we propose an approach for improving the scalability of DNN verifiers using Conflict-Driven Clause Learning (CDCL) -- an approach that has proven highly successful in SAT and SMT solving. We present a novel algorithm for deriving conflict clauses using UNSAT proofs, and propose several optimizations for expediting it. Our approach allows a modular integration of SAT solvers and DNN verifiers, and we implement it on top of an interface designed for this purpose. The evaluation of our implementation over several benchmarks suggests a 2X--3X improvement over a similar approach, with specific cases outperforming the state of the art.

artificial intelligence, machine learning, query, (17 more...)

arXiv.org Artificial Intelligence

2503.12083

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Transportation (0.68)
Information Technology > Security & Privacy (0.46)
Government > Regional Government (0.46)
Information Technology > Robotics & Automation (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Pantograph: A Machine-to-Machine Interaction Interface for Advanced Theorem Proving, High Level Reasoning, and Data Extraction in Lean 4

Aniva, Leni, Sun, Chuyue, Miranda, Brando, Barrett, Clark, Koyejo, Sanmi

arXiv.org Artificial IntelligenceOct-21-2024

Machine-assisted theorem proving refers to the process of conducting structured reasoning to automatically generate proofs for mathematical theorems. Recently, there has been a surge of interest in using machine learning models in conjunction with proof assistants to perform this task. In this paper, we introduce Pantograph, a tool that provides a versatile interface to the Lean 4 proof assistant and enables efficient proof search via powerful search algorithms such as Monte Carlo Tree Search. In addition, Pantograph enables high-level reasoning by enabling a more robust handling of Lean 4's inference steps. We provide an overview of Pantograph's architecture and features. We also report on an illustrative use case: using machine learning models and proof sketches to prove Lean 4 theorems. Pantograph's innovative features pave the way for more advanced machine learning models to perform complex proof searches and high-level reasoning, equipping future researchers to design more versatile and powerful theorem provers.

artificial intelligence, logic & formal reasoning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2410.16429

Country: North America > United States (0.28)

Genre:

Overview (0.68)
Research Report (0.50)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.89)

Add feedback

Safe and Reliable Training of Learning-Based Aerospace Controllers

Mandal, Udayan, Amir, Guy, Wu, Haoze, Daukantas, Ieva, Newell, Fletcher Lee, Ravaioli, Umberto, Meng, Baoluo, Durling, Michael, Hobbs, Kerianne, Ganai, Milan, Shim, Tobey, Katz, Guy, Barrett, Clark

arXiv.org Artificial IntelligenceJul-9-2024

In recent years, deep reinforcement learning (DRL) approaches have generated highly successful controllers for a myriad of complex domains. However, the opaque nature of these models limits their applicability in aerospace systems and safety-critical domains, in which a single mistake can have dire consequences. In this paper, we present novel advancements in both the training and verification of DRL controllers, which can help ensure their safe behavior. We showcase a design-for-verification approach utilizing k-induction and demonstrate its use in verifying liveness properties. In addition, we also give a brief overview of neural Lyapunov Barrier certificates and summarize their capabilities on a case study. Finally, we describe several other novel reachability-based approaches which, despite failing to provide guarantees of interest, could be effective for verification of other DRL systems, and could be of further interest to the community.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2407.07088

Country:

North America > United States (0.47)
Asia > Middle East > Israel (0.28)

Genre: Research Report (0.82)

Industry:

Aerospace & Defense (1.00)
Transportation > Air (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)

Add feedback

Markovian Agents for Informative Language Modeling

Viteri, Scott, Lamparth, Max, Chatain, Peter, Barrett, Clark

arXiv.org Artificial IntelligenceMay-22-2024

Chain-of-Thought (CoT) reasoning could in principle enable a deeper understanding of a language model's (LM) internal reasoning. However, prior work suggests that LMs can answer questions similarly despite changes in their CoT, suggesting that those models are not truly using the CoT. We propose an reinforcement learning technique to produce CoTs that are sufficient alone for predicting future text, independent of other context. This methodology ensures that if the LM can predict future tokens, then it must have used the CoT to understand its context. We formalize the informativeness of a sender to a receiver LM as the degree to which the sender helps the receiver predict their future observations, and we define a "Markovian" LM as one which predicts future text given only a CoT as context. We derive a "Markovian training" procedure by applying our definition of informativeness to a Markovian LM and optimizing via policy gradient and Proximal Policy Optimization (PPO). We demonstrate our training algorithm's effectiveness on fifteen-term arithmetic problems, show the model utilizes the CoT, and externally validate that the generated CoT is meaningful and usable by another model.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2404.18988

Country:

North America > United States (0.28)
Europe > Middle East > Malta (0.14)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)

Add feedback

Formally Verifying Deep Reinforcement Learning Controllers with Lyapunov Barrier Certificates

Mandal, Udayan, Amir, Guy, Wu, Haoze, Daukantas, Ieva, Newell, Fletcher Lee, Ravaioli, Umberto J., Meng, Baoluo, Durling, Michael, Ganai, Milan, Shim, Tobey, Katz, Guy, Barrett, Clark

arXiv.org Artificial IntelligenceMay-22-2024

Deep reinforcement learning (DRL) is a powerful machine learning paradigm for generating agents that control autonomous systems. However, the "black box" nature of DRL agents limits their deployment in real-world safety-critical applications. A promising approach for providing strong guarantees on an agent's behavior is to use Neural Lyapunov Barrier (NLB) certificates, which are learned functions over the system whose properties indirectly imply that an agent behaves as desired. However, NLB-based certificates are typically difficult to learn and even more difficult to verify, especially for complex systems. In this work, we present a novel method for training and verifying NLB-based certificates for discrete-time systems. Specifically, we introduce a technique for certificate composition, which simplifies the verification of highly-complex systems by strategically designing a sequence of certificates. When jointly verified with neural network verification engines, these certificates provide a formal guarantee that a DRL agent both achieves its goals and avoids unsafe behavior. Furthermore, we introduce a technique for certificate filtering, which significantly simplifies the process of producing formally verified certificates. We demonstrate the merits of our approach with a case study on providing safety and liveness guarantees for a DRL-controlled spacecraft.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2405.14058

Country:

North America > United States (0.28)
Europe (0.28)
Asia > Middle East > Israel (0.14)

Genre: Research Report > Promising Solution (0.68)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Clover: Closed-Loop Verifiable Code Generation

Sun, Chuyue, Sheng, Ying, Padon, Oded, Barrett, Clark

arXiv.org Artificial IntelligenceJan-30-2024

The use of large language models for code generation is a rapidly growing trend in software development. However, without effective methods for ensuring the correctness of generated code, this trend could lead to any number of undesirable outcomes. In this paper, we lay out a vision for addressing this challenge: the Clover paradigm, short for Closed-Loop Verifiable Code Generation, which reduces correctness checking to the more accessible problem of consistency checking. At the core of Clover lies a checker that performs consistency checks among code, docstrings, and formal annotations. The checker is implemented using a novel integration of formal verification tools and large language models. We provide a theoretical analysis to support our thesis that Clover should be effective at consistency checking. We also empirically investigate its feasibility on a hand-designed dataset (CloverBench) featuring annotated Dafny programs at a textbook level of difficulty. Experimental results show that for this dataset, (i) LLMs are reasonably successful at automatically generating formal specifications; and (ii) our consistency checker achieves a promising acceptance rate (up to 87%) for correct instances while maintaining zero tolerance for incorrect ones (no false positives).

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2310.17807

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Advanced Geothermal System (AGS) (0.61)
Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.66)

Add feedback

Marabou 2.0: A Versatile Formal Analyzer of Neural Networks

Wu, Haoze, Isac, Omri, Zeljić, Aleksandar, Tagomori, Teruhiro, Daggitt, Matthew, Kokke, Wen, Refaeli, Idan, Amir, Guy, Julian, Kyle, Bassan, Shahaf, Huang, Pei, Lahav, Ori, Wu, Min, Zhang, Min, Komendantskaya, Ekaterina, Katz, Guy, Barrett, Clark

arXiv.org Artificial IntelligenceJan-25-2024

This paper serves as a comprehensive system description of version 2.0 of the Marabou framework for formal analysis of neural networks. We discuss the tool's architectural design and highlight the major features and components introduced since its initial release.

artificial intelligence, logic & formal reasoning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2401.14461

Country:

North America > United States (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)

Add feedback

Identifying and Mitigating the Security Risks of Generative AI

Barrett, Clark, Boyd, Brad, Burzstein, Elie, Carlini, Nicholas, Chen, Brad, Choi, Jihye, Chowdhury, Amrita Roy, Christodorescu, Mihai, Datta, Anupam, Feizi, Soheil, Fisher, Kathleen, Hashimoto, Tatsunori, Hendrycks, Dan, Jha, Somesh, Kang, Daniel, Kerschbaum, Florian, Mitchell, Eric, Mitchell, John, Ramzan, Zulfikar, Shams, Khawaja, Song, Dawn, Taly, Ankur, Yang, Diyi

arXiv.org Artificial IntelligenceDec-28-2023

Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive, but is rather an attempt to synthesize some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides both a launching point for a discussion on this important topic as well as interesting problems that the research community can work to address.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1561/3300000041

2308.1484

Country:

Europe (1.00)
North America > United States > Wisconsin > Dane County > Madison (0.24)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Add feedback

Towards Efficient Verification of Quantized Neural Networks

Huang, Pei, Wu, Haoze, Yang, Yuting, Daukantas, Ieva, Wu, Min, Zhang, Yedi, Barrett, Clark

arXiv.org Artificial IntelligenceDec-27-2023

Quantization replaces floating point arithmetic with integer arithmetic in deep neural network models, providing more efficient on-device inference with less power and memory. In this work, we propose a framework for formally verifying properties of quantized neural networks. Our baseline technique is based on integer linear programming which guarantees both soundness and completeness. We then show how efficiency can be improved by utilizing gradient-based heuristic search methods and also bound-propagation techniques. We evaluate our approach on perception networks quantized with PyTorch. Our results show that we can verify quantized networks with better scalability and efficiency than the previous state of the art.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2312.12679

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.46)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

Zhang, Zhenyu, Sheng, Ying, Zhou, Tianyi, Chen, Tianlong, Zheng, Lianmin, Cai, Ruisi, Song, Zhao, Tian, Yuandong, Ré, Christopher, Barrett, Clark, Wang, Zhangyang, Chen, Beidi

arXiv.org Artificial IntelligenceDec-18-2023

Large Language Models (LLMs), despite their recent impressive accomplishments, are notably cost-prohibitive to deploy, particularly for applications involving long-content generation, such as dialogue systems and story writing. Often, a large amount of transient state information, referred to as the KV cache, is stored in GPU memory in addition to model parameters, scaling linearly with the sequence length and batch size. In this paper, we introduce a novel approach for implementing the KV cache which significantly reduces its memory footprint. Our approach is based on the noteworthy observation that a small portion of tokens contributes most of the value when computing attention scores. We call these tokens Heavy Hitters (H$_2$). Through a comprehensive investigation, we find that (i) the emergence of H$_2$ is natural and strongly correlates with the frequent co-occurrence of tokens in the text, and (ii) removing them results in significant performance degradation. Based on these insights, we propose Heavy Hitter Oracle (H$_2$O), a KV cache eviction policy that dynamically retains a balance of recent and H$_2$ tokens. We formulate the KV cache eviction as a dynamic submodular problem and prove (under mild assumptions) a theoretical guarantee for our novel eviction algorithm which could help guide future work. We validate the accuracy of our algorithm with OPT, LLaMA, and GPT-NeoX across a wide range of tasks. Our implementation of H$_2$O with 20% heavy hitters improves the throughput over three leading inference systems DeepSpeed Zero-Inference, Hugging Face Accelerate, and FlexGen by up to 29$\times$, 29$\times$, and 3$\times$ on OPT-6.7B and OPT-30B. With the same batch size, H2O can reduce the latency by up to 1.9$\times$. The code is available at https://github.com/FMInference/H2O.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2306.14048

Country:

North America > United States > California (0.46)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback