AI Judges in Design: Statistical Perspectives on Achieving Human Expert Equivalence With Vision-Language Models