Appendices for Baleen

A Data Details

Neural Information Processing Systems 

Table 6: Sizes of the splits of the datasets used in this work.

It contains approximately 5M passages (1.5 GiB uncompressed).

We implement Baleen using Python 3.7 and PyTorch 1.6 and rely extensively on the HuggingFace [...]. We train and test with the automatic mixed precision built into PyTorch.

To train the single-hop retriever used to initiate the supervision procedure of Section 3.2, we follow the training strategy of Khattab et al. [...] ColBERT model to create training triples, and then we train our retriever (in this case, FLIPR for the first hop) with these triples.
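As a rough illustration of what creating training triples from a retriever's output can look like, the sketch below pairs each gold passage with highly ranked non-gold passages for the same query. It is not the authors' code. The function name `build_triples`, the `max_negatives` parameter, the toy passage IDs, and the choice of top-ranked non-gold passages as negatives are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# (query, positive passage id, negative passage id)
Triple = Tuple[str, str, str]


def build_triples(
    rankings: Dict[str, List[str]],
    gold: Dict[str, Set[str]],
    max_negatives: int = 2,
) -> List[Triple]:
    """Pair each gold passage with top-ranked non-gold passages.

    Hypothetical helper: `rankings` maps a query to passage ids ordered
    by an existing retriever, and `gold` maps a query to its annotated
    relevant passage ids.
    """
    triples: List[Triple] = []
    for query, ranked_ids in rankings.items():
        positives = gold.get(query, set())
        # Assumption: highly ranked passages that are not gold act as negatives.
        negatives = [pid for pid in ranked_ids if pid not in positives]
        negatives = negatives[:max_negatives]
        # Sort positives so the output order is deterministic.
        for pos in sorted(positives):
            for neg in negatives:
                triples.append((query, pos, neg))
    return triples


# Toy data: one query, four retrieved passages, one gold passage.
rankings = {"who wrote hamlet": ["p7", "p2", "p9", "p4"]}
gold = {"who wrote hamlet": {"p2"}}
triples = build_triples(rankings, gold)
```

Each resulting triple can then be fed to a retriever's pairwise training objective. A query with no gold passages contributes no triples under this scheme.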
