AI For Matching Images With Spoken Word Gets A Boost From MIT
Children learn to speak and to recognize objects, people, and places long before they learn to read or write. They learn by hearing, seeing, and interacting, without explicit instruction. So why shouldn't artificial intelligence systems work the same way? That's the key insight driving a research project underway at MIT that takes a novel approach to speech and image recognition: teaching a computer to associate specific elements of images with corresponding sound files, so that it can identify imagery (say, a lighthouse in a photographic landscape) when someone in an audio clip says the word "lighthouse." Though the MIT project, led by PhD student David Harwath and senior research scientist Jim Glass, is in the very early stages of what could be a years-long process of research and development, its implications are substantial. Along with automatically surfacing images based on corresponding audio clips and vice versa, the research opens a path to language-to-language translation without the laborious step of training AI systems on the correlation between two languages' words.
Feb-12-2017, 17:15:11 GMT