Appendix
Neural Information Processing Systems
We limit the target languages for this augmentation process to Arabic, Finnish, Japanese, Korean, Russian, Spanish, Swedish, Hebrew, Thai, Danish, French, Italian, Dutch, Polish, and Portuguese. Interestingly, just adding this language code effectively changes the outputs, as shown in Table 7. We further subsample 50% of the synthetically generated questions. During inference, we first retrieve the top 15 passages using mDPR, and then feed the questions and concatenated passages into the mGEN model, with language tags. The gray dots concentrated in the lower right part of the first figure represent encoded Thai embeddings.
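The subsampling and inference-time input construction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `<lang>` tag format, the whitespace-joined passage layout, and the fixed random seed are all hypothetical, and the passages are assumed to arrive already ranked by the retriever.

```python
import random

TOP_K = 15  # number of retrieved passages fed to the generator, per the text


def subsample(items, fraction=0.5, seed=0):
    """Keep a random `fraction` of the items (hypothetical helper, seed is an assumption)."""
    rng = random.Random(seed)
    keep = int(len(items) * fraction)
    return rng.sample(items, keep)


def build_generator_input(question, lang_code, ranked_passages, top_k=TOP_K):
    """Concatenate a language tag, the question, and the top-k passages.

    The "<lang_code>" tag format and the space-joined layout are assumptions
    made for illustration; the source only says that language tags are added.
    """
    context = " ".join(ranked_passages[:top_k])
    return f"<{lang_code}> {question} {context}"


if __name__ == "__main__":
    # Toy stand-in for retriever output: 20 passages, best first.
    passages = [f"passage-{i}" for i in range(20)]
    print(build_generator_input("Who wrote Kokoro?", "ja", passages))

    synthetic_questions = [f"q{i}" for i in range(10)]
    print(len(subsample(synthetic_questions)))
```

Only the first 15 ranked passages reach the generator input, and the language tag is the only signal that tells the generator which output language is expected.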
Feb-8-2026, 08:06:05 GMT