Memorize or Generalize? Evaluating LLM Code Generation with Evolved Questions
