QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?

Neural Information Processing Systems 

Large language models (LLMs) have shown impressive performance on reasoning benchmarks in domains such as math and logic. While most prior work assumes well-defined tasks, real-world queries are often underspecified and can be solved only by acquiring the missing information.