Stochasticity in Agentic Evaluations: Quantifying Inconsistency with Intraclass Correlation
