Vintage Code, Modern Judges: Meta-Validation in Low Data Regimes