Can We Use Probing to Better Understand Fine-tuning and Knowledge Distillation of the BERT NLU?