GlitchBench: Can large multimodal models detect video game glitches?