HyST: LLM-Powered Hybrid Retrieval over Semi-Structured Tabular Data
Jiyoon Myung, Jihyeon Park, Joohyung Han
arXiv.org Artificial Intelligence
User queries in real-world recommendation systems often combine structured constraints (e.g., category, attributes) with unstructured preferences (e.g., product descriptions or reviews). We introduce HyST (Hybrid retrieval over Semi-structured Tabular data), a hybrid retrieval framework that combines LLM-powered structured filtering with semantic embedding search to support complex information needs over semi-structured tabular data. HyST uses large language models (LLMs) to extract attribute-level constraints from natural language and applies them as metadata filters, while the remaining unstructured query components are handled by embedding-based retrieval. Experiments on a semi-structured benchmark show that HyST consistently outperforms traditional baselines. The results highlight the importance of structured filtering for retrieval precision and position HyST as a scalable, accurate solution for real-world user queries.
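The two-stage flow described in the abstract (extract structured constraints, filter on them, then rank the survivors by semantic similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: `extract_constraints` is a hypothetical rule-based stand-in for the LLM extraction step, the bag-of-words `embed` function stands in for a real embedding model, and the table rows are invented for illustration.

```python
import math
from collections import Counter

# Toy semi-structured table: structured columns plus a free-text field.
# All rows are hypothetical example data.
ROWS = [
    {"id": 1, "category": "laptop", "brand": "acme",
     "description": "lightweight laptop with long battery life for travel"},
    {"id": 2, "category": "laptop", "brand": "zenith",
     "description": "heavy gaming laptop with a loud fan"},
    {"id": 3, "category": "phone", "brand": "acme",
     "description": "lightweight phone with long battery life"},
]

STRUCTURED_COLUMNS = ("category", "brand")


def extract_constraints(query, schema):
    """Hypothetical stand-in for the LLM step.

    Matches query tokens against known attribute values and returns
    (filters, remaining_text). A real system would prompt an LLM here.
    """
    filters, remaining = {}, []
    for token in query.lower().split():
        for column, values in schema.items():
            if token in values:
                filters[column] = token
                break
        else:
            # Token matched no attribute value: keep it as free text.
            remaining.append(token)
    return filters, " ".join(remaining)


def embed(text):
    """Bag-of-words vector; an assumption standing in for a learned embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def hybrid_search(query, rows, top_k=2):
    """Filter rows on extracted constraints, then rank by text similarity."""
    schema = {col: {row[col] for row in rows} for col in STRUCTURED_COLUMNS}
    filters, text = extract_constraints(query, schema)
    # Stage 1: structured metadata filter.
    candidates = [r for r in rows
                  if all(r[col] == val for col, val in filters.items())]
    # Stage 2: semantic ranking of the surviving rows.
    q_vec = embed(text)
    ranked = sorted(candidates,
                    key=lambda r: cosine(q_vec, embed(r["description"])),
                    reverse=True)
    return filters, [r["id"] for r in ranked[:top_k]]


if __name__ == "__main__":
    print(hybrid_search("acme laptop with long battery life", ROWS))
```

Filtering before ranking means the similarity search only scores rows that already satisfy the hard constraints, which is the mechanism the abstract credits for improved precision.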
Aug-26-2025
- Country:
  - Asia > South Korea
  - Europe
    - Czechia > Prague (0.05)
    - Switzerland (0.04)
  - North America > United States
    - Massachusetts > Suffolk County
      - Boston (0.04)
    - New York > New York County
      - New York City (0.04)
    - Washington > King County
      - Seattle (0.04)
- Genre:
  - Research Report > New Finding (0.46)
- Technology: