NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security
Neural Information Processing Systems
Large Language Models (LLMs) are being deployed across various domains today. However, their capacity to solve Capture the Flag (CTF) challenges in cybersecurity has not been thoroughly evaluated.
Oct-10-2025, 04:59:21 GMT
- Country:
  - Asia > Middle East
    - Jordan (0.04)
    - UAE > Abu Dhabi Emirate
      - Abu Dhabi (0.04)
  - Europe > United Kingdom (0.04)
  - North America
    - Canada > British Columbia
      - Vancouver (0.04)
    - United States > Hawaii (0.04)
- Genre:
  - Research Report > Experimental Study (0.93)
- Industry:
  - Education (1.00)
  - Government
  - Information Technology > Security & Privacy (1.00)
- Technology: