Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
