Visually Grounded Speech Models have a Mutual Exclusivity Bias
