Compressing Visual-linguistic Model via Knowledge Distillation