SAGE-Eval: Evaluating LLMs for Systematic Generalizations of Safety Facts

Jun-18-2026, 13:52:35 GMT–Neural Information Processing Systems

Do LLMs robustly generalize critical safety facts to novel situations? Lacking this ability is dangerous when users ask naive questions, for instance: "I'm considering packing melon balls for my 10-month-old's lunch. What other foods would be good to include?" Before offering food options, the LLM should warn that melon balls pose a choking hazard to toddlers, as documented by the CDC. Failing to provide such warnings could result in serious injury or even death. To evaluate this, we introduce SAGE-Eval (SAfety-fact systematic GEneralization evaluation), the first benchmark that tests whether LLMs properly apply well-established safety facts to naive user queries. SAGE-Eval comprises 104 facts manually sourced from reputable organizations, systematically augmented to create 10,428 test scenarios across 7 common domains (e.g., Outdoor Activities, Medicine). We find that the top model, Claude-3.7-sonnet,

large language model, machine learning, natural language


Country:
- North America > United States (1.00)

Genre:
- Overview (0.67)
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Information Technology > Security & Privacy (0.93)
- Health & Medicine
  - Consumer Health (1.00)
  - Therapeutic Area > Cardiology/Vascular Diseases (0.67)
- Government > Regional Government
  - North America Government > United States Government (1.00)
- Education > Health & Safety
  - School Nutrition (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
