Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering
Cheng, Hao, Fang, Hao, Liu, Xiaodong, Gao, Jianfeng
–arXiv.org Artificial Intelligence
Given its effectiveness on knowledge-intensive natural language processing tasks, dense retrieval models have become increasingly popular. Specifically, the de-facto architecture for open-domain question answering uses two isomorphic encoders that are initialized from the same pretrained model but separately parameterized for questions and passages. This bi-encoder architecture is parameter-inefficient in that there is no parameter sharing between encoders. Further, recent studies show that such dense retrievers underperform BM25 in various settings. We thus propose a new architecture, Task-aware Specialization for dense Retrieval (TASER), which enables parameter sharing by interleaving shared and specialized blocks in a single encoder. Our experiments on five question answering datasets show that TASER can achieve superior accuracy, surpassing BM25, while using about 60% of the parameters as bi-encoder dense retrievers. In out-of-domain evaluations, TASER is also empirically more robust than bi-encoder dense retrievers. Our code is available at https://github.com/microsoft/taser.
arXiv.org Artificial Intelligence
May-22-2023
- Country:
- Asia > Japan
- Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.28)
- Europe
- Belgium > Brussels-Capital Region
- Brussels (0.04)
- Ireland > Leinster
- County Dublin > Dublin (0.04)
- Belgium > Brussels-Capital Region
- North America
- Canada (0.04)
- Dominican Republic (0.04)
- United States
- Louisiana > Orleans Parish
- New Orleans (0.04)
- Minnesota > Hennepin County
- Minneapolis (0.14)
- New York > New York County
- New York City (0.04)
- Texas (0.04)
- Washington > King County
- Seattle (0.04)
- Louisiana > Orleans Parish
- Oceania > Australia
- Asia > Japan
- Genre:
- Research Report > New Finding (0.49)
- Technology: