Large Language Models Struggle to Describe the Haystack without Human Help: Human-in-the-loop Evaluation of LLMs
Li, Zongxia, Calvo-Bartolomé, Lorena, Hoyle, Alexander, Xu, Paiheng, Dima, Alden, Fung, Juan Francisco, Boyd-Graber, Jordan
arXiv.org Artificial Intelligence
A common use of NLP is to facilitate the understanding of large document collections, with a shift from traditional topic models to Large Language Models (LLMs). Yet the effectiveness of LLMs for understanding large corpora in real-world applications remains under-explored. This study measures the knowledge users acquire with unsupervised and supervised LLM-based exploratory approaches or with traditional topic models on two datasets. While LLM-based methods generate more human-readable topics and show higher average win probabilities than traditional models for data exploration, they produce overly generic topics for domain-specific datasets, which makes it hard for users to learn much about the documents. Adding human supervision to the LLM generation process improves data exploration by mitigating hallucination and over-genericity but requires greater human effort. In contrast, traditional models like Latent Dirichlet Allocation (LDA) remain effective for exploration but are less user-friendly. We show that LLMs struggle to describe the haystack of large corpora without human help, particularly for domain-specific data, and face scaling and hallucination limitations due to context length constraints. Dataset available at https://huggingface.co/datasets/zli12321/Bills.
Feb-20-2025
- Country:
  - Africa > Mali (0.04)
  - Asia > Middle East
    - Jordan (0.05)
  - Europe
    - Germany > Berlin (0.04)
    - Middle East > Malta
      - Eastern Region > Northern Harbour District > St. Julian's (0.04)
    - Monaco (0.04)
    - Spain > Galicia
      - Madrid (0.04)
    - United Kingdom (0.14)
  - North America
    - Puerto Rico (0.04)
    - United States
      - California
        - San Francisco County > San Francisco (0.14)
        - Ventura County > Thousand Oaks (0.04)
      - District of Columbia > Washington (0.04)
      - Florida > Miami-Dade County
        - Miami (0.04)
      - Maryland > Prince George's County
        - College Park (0.04)
  - South America > Chile
- Genre:
  - Research Report
    - Experimental Study (1.00)
    - New Finding (1.00)
- Industry:
  - Education (1.00)
  - Government > Regional Government
  - Health & Medicine (0.93)
  - Law (0.93)
  - Water & Waste Management > Water Management
    - Water Supplies & Services (0.67)
- Technology: