Visually grounded few-shot word acquisition with fewer shots
