Benchmarking Practices in LLM-driven Offensive Security: Testbeds, Metrics, and Experiment Design
