ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities