The supplementary material includes a detailed description of the implementation details for the experiments.

Neural Information Processing Systems 

We use BLIP-2 models built on the FLAN-T5 language model family, with the same padding side as the FLAN-T5 models. The batch size is 8 for all datasets and models, and the Q-Former is kept in full precision. To produce decompositions, we use beam-search multinomial sampling with 5 beams and a top-p of 0.95.
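The decoding settings above can be collected in one place, as in the minimal sketch below. The text does not name a library, so the class name `DecodingSettings`, the method `generate_kwargs`, and the mapping onto the `do_sample`, `num_beams`, and `top_p` arguments of Hugging Face Transformers' `generate()` are assumptions made for illustration. Only the values 5, 0.95, and 8 come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecodingSettings:
    """Hypothetical container for the values stated in the text."""

    num_beams: int = 5     # "5 beams"
    top_p: float = 0.95    # "top-p of 0.95"
    batch_size: int = 8    # "batch size of 8 for all datasets and models"

    def generate_kwargs(self) -> dict:
        # Assumption: in Hugging Face Transformers, do_sample=True combined
        # with num_beams > 1 selects beam-search multinomial sampling.
        return {
            "do_sample": True,
            "num_beams": self.num_beams,
            "top_p": self.top_p,
        }


settings = DecodingSettings()
kwargs = settings.generate_kwargs()
```

Under the same assumption, `kwargs` would be unpacked into a call such as `model.generate(**inputs, **kwargs)`, where `model` and `inputs` stand in for a loaded BLIP-2 model and its preprocessed batch; neither is defined in this sketch.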
