Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? -- A computational investigation
