Attestable Audits: Verifiable AI Safety Benchmarks Using Trusted Execution Environments
Christoph Schnabl, Daniel Hugenroth, Bill Marino, Alastair R. Beresford
arXiv.org Artificial Intelligence
Benchmarks are important measures for evaluating the safety and compliance of AI models at scale. However, they typically do not offer verifiable results, nor do they protect the confidentiality of model IP and benchmark datasets. We propose Attestable Audits, which run inside Trusted Execution Environments and enable users to verify that they interacted with a compliant AI model. Our approach protects sensitive data even when the model provider and the auditor do not trust each other, addressing verification challenges raised in recent AI governance frameworks. We build a prototype demonstrating feasibility on typical audit benchmarks against Llama-3.1.
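The abstract's core idea, that a user can verify they interacted with a compliant model audited inside a TEE, can be illustrated with a minimal sketch. All names below are assumptions for illustration: the HMAC stands in for a hardware-rooted attestation signature, and the "measurement" stands in for the enclave's hash of the loaded audit code; the paper's actual protocol is not reproduced here.

```python
import hashlib
import hmac
import json

# Assumed stand-ins (not from the paper): a shared verification key in place
# of the TEE vendor's attestation PKI, and a fixed code measurement.
ENCLAVE_KEY = b"hardware-rooted-key"
EXPECTED_MEASUREMENT = hashlib.sha256(b"audit-code-v1").hexdigest()

def enclave_run_audit(model_id: str, pass_rate: float) -> dict:
    """Inside the TEE: run the benchmark, then emit a signed quote that
    binds the audit-code measurement to the model and its result."""
    report = {
        "measurement": EXPECTED_MEASUREMENT,  # hash of the loaded audit code
        "model_id": model_id,
        "pass_rate": pass_rate,
    }
    payload = json.dumps(report, sort_keys=True).encode()
    report["signature"] = hmac.new(ENCLAVE_KEY, payload, hashlib.sha256).hexdigest()
    return report

def user_verify(report: dict) -> bool:
    """Outside the TEE: check the quote's signature and confirm the
    enclave ran the expected (unmodified) audit code."""
    sig = report.pop("signature")
    payload = json.dumps(report, sort_keys=True).encode()
    expected_sig = hmac.new(ENCLAVE_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected_sig) and \
        report["measurement"] == EXPECTED_MEASUREMENT

quote = enclave_run_audit("Llama-3.1", pass_rate=0.97)
print(user_verify(quote))  # True: authentic quote from the expected audit code
```

The key property the sketch captures is that neither party needs to trust the other: the model provider never reveals weights, the auditor never reveals the benchmark dataset, and the user trusts only the attestation root of trust.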
Jul-1-2025