IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation

Mar-21-2026, 20:46:39 GMT–Neural Information Processing Systems

As Large Language Models (LLMs) become capable of handling increasingly complex tasks, evaluation sets must keep pace to remain sufficiently discriminative. Item Discrimination (ID) theory, widely used in educational assessment, measures how well individual test items differentiate between high and low performers. Inspired by this theory, we propose an ID-induced prompt synthesis framework for evaluating LLMs, so that the evaluation set is continually updated and refined according to model abilities.
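
To make the idea of item discrimination concrete, here is a minimal sketch of the classical upper-lower discrimination index from educational testing, applied to prompts and models. It is an illustration under stated assumptions, not the paper's own metric or framework: the function name `discrimination_index`, the score matrix layout, and the `frac` parameter are all hypothetical, and the abstract does not say which discrimination formula the authors use.

```python
from typing import List


def discrimination_index(scores: List[List[float]], item: int, frac: float = 0.27) -> float:
    """Classical upper-lower discrimination index for one item (prompt).

    Assumption: scores[m][i] is the score in [0, 1] of model m on prompt i.
    Models are ranked by total score; the index is the mean item score of
    the top group minus the mean item score of the bottom group.
    """
    n_models = len(scores)
    k = max(1, round(n_models * frac))  # size of the upper and lower groups
    ranked = sorted(range(n_models), key=lambda m: sum(scores[m]), reverse=True)
    upper, lower = ranked[:k], ranked[-k:]
    mean_upper = sum(scores[m][item] for m in upper) / k
    mean_lower = sum(scores[m][item] for m in lower) / k
    return mean_upper - mean_lower


# Hypothetical toy data: 4 models (rows) scored on 3 prompts (columns).
toy_scores = [
    [1.0, 1.0, 1.0],  # strongest model
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],  # weakest models
]

# frac=0.5 splits the 4 models into a top pair and a bottom pair.
indices = [discrimination_index(toy_scores, i, frac=0.5) for i in range(3)]
```

In this toy data the first prompt is solved by every model, so its index is 0 and it tells us nothing about relative ability, while the second prompt separates the top pair from the bottom pair perfectly. A prompt set that tracks model ability would favor prompts with a high index.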

artificial intelligence, large language model, natural language

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)