Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models