Beyond Ranked Lists: The SARAL Framework for Cross-Lingual Document Set Retrieval
Agarwal, Shantanu, Barry, Joel, Boschee, Elizabeth, Miller, Scott
arXiv.org Artificial Intelligence
Machine Translation for English Retrieval of Information in Any Language (MATERIAL) is an IARPA initiative aimed at advancing the state of cross-lingual information retrieval (CLIR). This report provides a detailed description of the Information Sciences Institute's (ISI's) Summarization and domain-Adaptive Retrieval Across Languages (SARAL) effort for MATERIAL. Specifically, we outline our team's novel approach to CLIR, with an emphasis on developing an approach suited to retrieving a query-relevant document set rather than just a ranked document list. In MATERIAL's Phase-3 evaluations, SARAL exceeded the performance of other teams in five out of six evaluation conditions spanning three different languages (Farsi, Kazakh, and Georgian).
Nov-6-2025
- Country:
  - Africa > East Africa (0.04)
  - Europe > France
    - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
  - North America > United States
    - California (0.14)
    - Ohio > Franklin County
      - Columbus (0.04)
- Genre:
  - Research Report (0.84)