Aligning Offline Metrics and Human Judgments of Value for Code Generation Models
