SetBERT: Enhancing Retrieval Performance for Boolean Logic and Set Operation Queries
Quan Mai, Susan Gauch, Douglas Adams
arXiv.org Artificial Intelligence
We introduce SetBERT, a fine-tuned BERT-based model designed to enhance query embeddings for set operations and Boolean logic queries, such as Intersection (AND), Difference (NOT), and Union (OR). SetBERT significantly improves retrieval performance for logic-structured queries, an area where both traditional and neural retrieval methods typically underperform. We propose an innovative use of inversed-contrastive loss, which focuses on identifying the negative sentence, and fine-tune BERT on a dataset generated by prompting GPT. Furthermore, we demonstrate that, in contrast to its effect on other BERT-based models, fine-tuning with triplet loss actually degrades performance on this specific task. Our experiments reveal that SetBERT-base not only significantly outperforms BERT-base (up to a 63% improvement in Recall) but also achieves performance comparable to the much larger BERT-large model, despite being only one-third the size.
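As a rough illustration of the three query types named in the abstract, the sketch below evaluates Intersection (AND), Difference (NOT), and Union (OR) exactly over a tiny tagged corpus. The corpus, the tags, and the function names are hypothetical and are not taken from the paper; the sketch only shows the result sets a retriever would ideally return for such queries, not how SetBERT computes its embeddings.

```python
# Hypothetical toy corpus: document id -> set of tags (illustrative only,
# not data from the paper).
DOCS = {
    "d1": {"comedy", "french"},
    "d2": {"comedy", "silent"},
    "d3": {"drama", "french"},
    "d4": {"drama"},
}


def matching(tag):
    """Return the ids of all documents that carry the given tag."""
    return {doc_id for doc_id, tags in DOCS.items() if tag in tags}


def intersection(a, b):
    """Documents relevant to 'a AND b'."""
    return matching(a) & matching(b)


def difference(a, b):
    """Documents relevant to 'a NOT b'."""
    return matching(a) - matching(b)


def union(a, b):
    """Documents relevant to 'a OR b'."""
    return matching(a) | matching(b)
```

For example, under these assumptions `difference("comedy", "french")` keeps only the comedy documents that are not also tagged French, which is the kind of exclusion the abstract says retrieval methods tend to handle poorly.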
Jun-26-2024
- Country:
- Asia
- Europe (0.04)
- North America
- Mexico (0.04)
- United States > Arkansas
- Washington County > Fayetteville (0.14)
- South America > Brazil (0.04)
- Genre:
- Research Report > New Finding (0.93)
- Industry:
- Leisure & Entertainment (0.46)
- Media > Film (0.68)
- Technology: