Distilling Large Vision-Language Model with Out-of-Distribution Generalizability