multimodal and unimodal query
Export Reviews, Discussions, Author Feedback and Meta-Reviews
The most significant improvement was made for unimodal query when we increase the iteration number from 0 to 1 (margin of improvement: 0.034), but we still observe further gain by increasing it to 5 and 10 (margin of improvement: 0.048 and 0.05). Based on your suggestion on multi-prediction training that approximates joint likelihood, we evaluated the performance of the multimodal deep network trained jointly on $x$ and $y$ like in the original MP-DBM (i.e., randomly select subsets of variables from both data modalities $x$ and $y$ and predict them given the rest). In our preliminary results, the original MP-DBM style training jointly on $x$ and $y$ gave worse results than our proposed training scheme (i.e., predicting $x$ given $y$ and vice versa) for both multimodal and unimodal queries. We will include complete results in the revision. R38: Fine-tuning brings a significant improvement: before MDRNN fine-tuning, we obtained 0.632 and 0.521 test set mAPs for multimodal and unimodal queries, respectively, and these numbers go up to 0.686 and 0.607 mAPs after MDRNN fine-tuning.