NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security
Neural Information Processing Systems
Large Language Models (LLMs) are being deployed across various domains today. However, their capacity to solve Capture the Flag (CTF) challenges in cybersecurity has not been thoroughly evaluated.
Oct-10-2025, 04:59:21 GMT
- Country:
  - Europe > United Kingdom (0.04)
  - North America
    - United States > Hawaii (0.04)
    - Canada > British Columbia
      - Vancouver (0.04)
  - Asia > Middle East
    - Jordan (0.04)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.04)
- Genre:
  - Research Report > Experimental Study (0.93)
- Industry:
  - Information Technology > Security & Privacy (1.00)
  - Education (1.00)
  - Government
- Technology: