Multi-Facet Blending for Faceted Query-by-Example Retrieval
Do, Heejin; Ryu, Sangwon; Kim, Jonghwi; Lee, Gary Geunbae
arXiv.org Artificial Intelligence
With the growing demand to fit fine-grained user intents, faceted query-by-example (QBE), which retrieves similar documents conditioned on specific facets, has gained recent attention. However, because facet-level relevance datasets are lacking, prior approaches mainly depend on document-level comparisons using basic indicators such as citations; this limits their use to citation-based domains and fails to capture the intricacies of facet constraints. In this paper, we propose a multi-facet blending (FaBle) augmentation method, which exploits modularity by decomposing and recomposing documents to explicitly synthesize facet-specific training sets. We automatically decompose documents into facet units and generate (ir)relevant pairs by leveraging LLMs' intrinsic distinguishing capabilities; dynamically recomposing the units then yields document pairs informed by facet-wise relevance. Our modularization eliminates the need for pre-defined facet knowledge or labels. Further, to demonstrate FaBle's efficacy in a new domain beyond citation-based scientific paper retrieval, we release a benchmark dataset for educational exam item QBE. FaBle augmentation on 1K documents remarkably assists training in obtaining facet-conditional embeddings.
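The decompose-and-recompose idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the facet names, the `blend` and `pairs_for_facet` helpers, and the hard-coded text units are all hypothetical stand-ins, and the exhaustive enumeration of relevant/irrelevant combinations is an assumption about how "dynamic recomposition" might work. In the described method, an LLM would produce the facet units and their (ir)relevant counterparts.

```python
from itertools import product


def blend(variants):
    """Recompose facet units into candidate documents with facet-wise labels.

    `variants` maps each facet name to {"rel": text, "irr": text}: one unit
    that is relevant to the source document's unit and one that is not.
    Returns a list of (document_text, {facet: bool}) tuples, one per
    relevant/irrelevant combination (an assumed recomposition scheme).
    """
    facets = list(variants)
    blended = []
    for choice in product(("rel", "irr"), repeat=len(facets)):
        text = " ".join(variants[f][c] for f, c in zip(facets, choice))
        labels = {f: c == "rel" for f, c in zip(facets, choice)}
        blended.append((text, labels))
    return blended


def pairs_for_facet(query_doc, blended, facet):
    """Split blended candidates into positive and negative pairs for one facet."""
    pos = [(query_doc, text) for text, labels in blended if labels[facet]]
    neg = [(query_doc, text) for text, labels in blended if not labels[facet]]
    return pos, neg


# Hypothetical facet units; placeholders for LLM-generated text.
variants = {
    "background": {"rel": "[B+]", "irr": "[B-]"},
    "method": {"rel": "[M+]", "irr": "[M-]"},
    "result": {"rel": "[R+]", "irr": "[R-]"},
}

blended = blend(variants)
pos, neg = pairs_for_facet("query document", blended, "method")
```

Under these assumptions, each blended candidate carries a separate relevance label per facet, so the same candidate can serve as a positive for one facet and a negative for another when training facet-conditional embeddings.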
Dec-2-2024
- Country:
  - Asia
    - China > Hong Kong (0.04)
    - Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
    - South Korea (0.04)
  - Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
  - North America
    - Canada > Ontario > Toronto (0.04)
    - United States
      - Florida (0.04)
      - New York > New York County > New York City (0.04)
      - Washington > King County > Seattle (0.04)
- Genre:
  - Research Report (1.00)
- Industry:
  - Education (0.67)
  - Health & Medicine > Therapeutic Area (0.46)
  - Law (0.46)
- Technology: