STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases

May-27-2025, 19:59:00 GMT–Neural Information Processing Systems

Answering complex real-world queries, such as complex product search, often requires accurate retrieval from semi-structured knowledge bases that involve a blend of unstructured (e.g., textual descriptions of products) and structured (e.g., entity relations of products) information. However, many previous works have studied textual and relational retrieval as separate tasks. To address this gap, we develop STaRK, a large-scale Semi-structured retrieval benchmark on Textual and Relational Knowledge Bases. Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine. We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties, together with their ground-truth answers (items).
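
To make the idea of a semi-structured query concrete, the following is a minimal toy sketch, not the benchmark's actual data format or API. The `Item` class, the `retrieve` function, the product names, and the keyword-overlap scoring are all hypothetical, chosen only to illustrate a query that must satisfy relational constraints (brand, category) and textual properties (description keywords) at the same time.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A toy knowledge-base entity (hypothetical, for illustration only)."""
    name: str
    description: str                               # unstructured part
    relations: dict = field(default_factory=dict)  # structured part: relation -> set of targets


def retrieve(items, required_relations, keywords):
    """Keep items that satisfy every relational constraint, ranked by keyword overlap."""
    results = []
    for item in items:
        # Structured filter: every (relation, target) pair must hold for the item.
        if not all(target in item.relations.get(rel, set())
                   for rel, target in required_relations):
            continue
        # Unstructured score: number of query keywords found in the description.
        words = set(item.description.lower().split())
        score = sum(1 for kw in keywords if kw.lower() in words)
        if score > 0:
            results.append((score, item.name))
    # Highest score first; ties broken alphabetically so the order is deterministic.
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in results]


ITEMS = [
    Item("TrailLite Tent", "lightweight waterproof tent for two people",
         {"brand": {"Acme"}, "category": {"camping"}}),
    Item("StormGuard Tent", "heavy waterproof tent for family camping",
         {"brand": {"Borealis"}, "category": {"camping"}}),
    Item("Acme Rain Jacket", "lightweight waterproof jacket",
         {"brand": {"Acme"}, "category": {"apparel"}}),
]

# "A lightweight, waterproof camping product made by Acme"
answers = retrieve(ITEMS,
                   [("brand", "Acme"), ("category", "camping")],
                   ["lightweight", "waterproof"])
print(answers)
```

In this toy setting, a purely textual matcher would also return the rain jacket, and a purely relational one could not tell the two tents apart by weight; only the combination narrows the result to the intended item. A real retriever would likely replace the keyword overlap with a learned text-similarity score.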

benchmarking llm retrieval, query, textual and relational knowledge base

Technology:
- Information Technology
  - Knowledge Management > Knowledge Engineering (0.89)
  - Artificial Intelligence
    - Representation & Reasoning > Expert Systems (0.89)
    - Natural Language > Large Language Model (0.63)