Cross-lingual Retrieval for Iterative Self-Supervised Training (supplementary materials)
Neural Information Processing Systems

1 Experiment details
Because of the file size limit, we will release the source code and pretrained checkpoints after the anonymity period. To make a fair comparison, we followed the same preprocessing steps as described in [13]. In each iteration, we mine all 90 language pairs in parallel, using 8 GPUs for each pair; each pair takes about 15-30 hours to finish. We lightly tune the margin score threshold using validation BLEU (using threshold scores between 1.04 and 1.07). For all experiments, we use a Transformer with 12 encoder layers and 12 decoder layers, with a model dimension of 1024 and 16 attention heads (680M parameters). We trained for a maximum of 20,000 steps using label-smoothed cross-entropy loss with 0.2 label smoothing, 0.3
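The margin score thresholding above can be sketched as follows. This is a minimal numpy illustration of the ratio-style margin criterion commonly used for bitext mining (score = cosine similarity divided by the average similarity to each side's k nearest neighbours); the function name, k value, and toy inputs are illustrative assumptions, not the paper's released implementation.

```python
import numpy as np

def margin_scores(src_emb, tgt_emb, k=4):
    """Ratio-variant margin scores for bitext mining:
    cos(x, y) divided by the mean cosine similarity of the
    k nearest neighbours on each side. Illustrative sketch."""
    # L2-normalise so that dot products are cosine similarities
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T  # full cosine-similarity matrix
    # average similarity to the k nearest neighbours on each side
    knn_src = np.sort(sim, axis=1)[:, -k:].mean(axis=1)  # per source sentence
    knn_tgt = np.sort(sim, axis=0)[-k:, :].mean(axis=0)  # per target sentence
    denom = (knn_src[:, None] + knn_tgt[None, :]) / 2.0
    return sim / denom

# Keep candidate pairs whose margin score clears the tuned threshold.
rng = np.random.default_rng(0)
scores = margin_scores(rng.normal(size=(5, 8)), rng.normal(size=(6, 8)))
pairs = np.argwhere(scores > 1.04)  # mined (source, target) index pairs
```

In practice the k-nearest-neighbour search is done with an approximate index (e.g. FAISS) rather than the dense similarity matrix shown here.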
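For reference, one common formulation of the label-smoothed cross-entropy loss mentioned above, sketched in numpy: the target distribution puts (1 - eps) on the gold token and spreads eps uniformly over the vocabulary. The function name and the exact treatment of the smoothing mass are illustrative assumptions (implementations differ in details such as padding handling and whether the gold class is excluded from the uniform term).

```python
import numpy as np

def label_smoothed_ce(logits, target, eps=0.2):
    """Label-smoothed cross-entropy (illustrative sketch):
    (1 - eps) * NLL of the gold token + eps * uniform NLL
    over all classes. logits: (batch, vocab); target: (batch,)."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    lprobs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    n = logits.shape[0]
    nll = -lprobs[np.arange(n), target]   # gold-token negative log-likelihood
    uniform = -lprobs.mean(axis=-1)       # uniform smoothing component
    return ((1.0 - eps) * nll + eps * uniform).mean()
```

With eps = 0 this reduces to plain cross-entropy; with eps = 0.2 (the value used above) the model is penalised for putting all probability mass on the gold token.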