Benchmarking the Robustness of Agentic Systems to Adversarially-Induced Harms