CrochetBench: Can Vision-Language Models Move from Describing to Doing in Crochet Domain?