Reviews: Learning to Specialize with Knowledge Distillation for Visual Question Answering
–Neural Information Processing Systems
For example, one model might be specialized for 'what color is the umbrella?' and another for 'how many people are wearing glasses?', while at test time the question may be 'what color are the glasses?'. Specifically, the authors train independently ensembled base VQA models on the entire dataset; then, during MCL training, a subset of the models is trained using oracle assignments (as in usual MCL) while the rest are trained to imitate the base models' activations. Strengths: The paper is very nicely written. It starts with a clear description of the problem and the observations made by the authors, then presents the proposed solution, positioning it appropriately with respect to prior work, and finishes with experiments. Given the small dataset, MCL and CMCL perform worse than independent ensembling, while MCL-KD performs better.
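The training scheme summarized above can be illustrated with a small per-example loss sketch. This is not the authors' implementation: it assumes cross-entropy as the task loss, picks the `k` models with the lowest task loss as the oracle-assigned ones, and stands in for "imitating the base models' activations" with a KL divergence to a single base model's output distribution. The function name `mcl_kd_loss` and the parameters `k` and `beta` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def mcl_kd_loss(model_logits, base_logits, label, k=1, beta=1.0):
    """Per-example loss sketch for an MCL-style ensemble with distillation.

    model_logits: array of shape (M, C), one row of answer logits per model.
    base_logits:  array of shape (C,), logits of a base model trained
                  on the entire dataset (assumed to be a single model here).
    label:        index of the ground-truth answer.
    k:            number of models given the oracle assignment (assumption).
    beta:         weight on the distillation term (assumption).
    Returns (total_loss, assigned_mask).
    """
    eps = 1e-12  # avoids log(0)
    probs = np.clip(softmax(model_logits), eps, 1.0)      # (M, C)
    base_probs = np.clip(softmax(base_logits), eps, 1.0)  # (C,)

    # Task loss (cross-entropy) of every model on this example.
    task_losses = -np.log(probs[:, label])                # (M,)

    # Oracle assignment: the k models with the lowest task loss specialize.
    chosen = np.argsort(task_losses)[:k]
    assigned = np.zeros(task_losses.shape[0], dtype=bool)
    assigned[chosen] = True

    # Remaining models imitate the base model: KL(base || model) per model.
    kd_losses = np.sum(base_probs * (np.log(base_probs) - np.log(probs)), axis=1)

    per_model = np.where(assigned, task_losses, beta * kd_losses)
    return per_model.sum(), assigned
```

In this sketch, a model that is not assigned to an example is pulled toward the base model's prediction rather than toward an uninformative target, which is one way to read the review's description of why non-specialized models remain useful at test time.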
Oct-7-2024, 05:02:37 GMT