YouTube's automatic captioning system can now describe sound effects


YouTube has long had an automatic captioning system that, thanks to Google's machine learning advances in recent years, has become quite good at transcribing spoken words in a video. As the company announced today, its technology can now take this a step further by also captioning some ambient sounds, such as [LAUGHTER], [APPLAUSE] and [MUSIC]. For now, automatic sound effects captioning is restricted to exactly these three sounds. The reason, Google says, is that these are also the sounds that most video producers manually caption right now. "While the sound space is obviously far richer and provides even more contextually relevant information than these three classes, the semantic information conveyed by these sound effects in the caption track is relatively unambiguous, as opposed to sounds like [RING] which raises the question of 'what was it that rang – a bell, an alarm, a phone?'"
