While computer scientists have yet to build a working "universal translator" such as the one first described in the 1945 science-fiction novella "First Contact" and later employed by the crew of the Starship Enterprise on "Star Trek," the hurdles to creating one are being cleared. That is because the practical need for instant or simultaneous speech-to-speech translation is growing in a number of applications. Take, for example, the hypergrowth of social networking and Skype chats, which demand bidirectional, reliable, immediate translation. Similarly, when natural disasters strike, aid workers often struggle to communicate with victims who speak other languages, and the problem can become overwhelming.
If you've ever struggled with a foreign-language dictionary abroad, wishing that you could simply speak into a machine and have your conversation translated for you, NTT Docomo may be ready to make your wish come true. The company, which is Japan's number one cell phone carrier, is about to begin offering a new real-time speech-to-speech translation service that you can use both in person and over the phone during a call, according to Geek.com. According to Japanese news services, the solution is the first automated conversation translation service in the world that is available on a standard cell phone. The new product combines several cutting-edge technologies: advanced speech recognition, machine translation and text-to-speech conversion of the translated results, says Geek.com. The services that power the solution will be cloud-based, says NTT Docomo.
Simultaneous speech-to-speech translation is widely useful but extremely challenging, since it needs to generate target-language speech concurrently with the source-language speech, with a delay of only a few seconds. In addition, it needs to continuously translate a stream of sentences, but all recent solutions focus merely on the single-sentence scenario. As a result, current approaches accumulate latency progressively when the speaker talks faster, and introduce unnatural pauses when the speaker talks slower. To overcome these issues, we propose Self-Adaptive Translation (SAT), which flexibly adjusts the length of translations to accommodate different source speech rates. At similar levels of translation quality (as measured by BLEU), our method generates more fluent target speech (as measured by the naturalness metric MOS) with substantially lower latency than the baseline, in both Zh <-> En directions.
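The latency build-up and the pauses described above can be illustrated with a toy timeline. The sketch below is not the SAT method or any published system; it is a minimal simulation under assumed rules. The function `simulate`, the `Timing` record, the fixed `lag` and all durations are hypothetical. It assumes the translation of a sentence may begin `lag` seconds after the speaker starts that sentence, and never before the previous translated sentence has finished playing.

```python
from dataclasses import dataclass


@dataclass
class Timing:
    latency: float  # seconds from the end of a source sentence to the end of its translation
    pause: float    # silence in the target audio before this translated sentence starts


def simulate(source_durations, target_durations, lag=1.0):
    """Toy timeline for a stream of sentences (all rules here are assumptions)."""
    timings = []
    source_start = 0.0    # when the speaker begins the current sentence
    target_free_at = 0.0  # when the previous translated sentence finishes playing
    for src, tgt in zip(source_durations, target_durations):
        earliest = source_start + lag
        start = max(earliest, target_free_at)
        # The initial start-up delay is not counted as a pause.
        pause = start - target_free_at if timings else 0.0
        end = start + tgt
        source_end = source_start + src
        timings.append(Timing(latency=end - source_end, pause=pause))
        target_free_at = end
        source_start = source_end
    return timings


# Fast speaker: each 2 s source sentence needs 3 s of target speech.
fast = simulate([2.0, 2.0, 2.0], [3.0, 3.0, 3.0])
# Slow speaker: each 4 s source sentence needs only 3 s of target speech.
slow = simulate([4.0, 4.0, 4.0], [3.0, 3.0, 3.0])
# Fast speaker again, with translations hypothetically shortened to 2 s each.
adapted = simulate([2.0, 2.0, 2.0], [2.0, 2.0, 2.0])

print([t.latency for t in fast])     # latency grows sentence by sentence
print([t.pause for t in slow])       # a gap opens before each later sentence
print([t.latency for t in adapted])  # latency stays constant
```

In this toy model, the fast speaker's latency grows by one second per sentence, while the slow speaker leaves a one-second gap before each later sentence. Shortening the translations to match the source duration keeps latency flat, which is the intuition behind adjusting translation length to the speech rate.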
India is a melting pot of multiple cultures, religions, diaspora and languages. Although 22 languages are recognised officially, more than 100 languages and dialects are spoken across the country. In the past decade, India has witnessed stupendous digital growth: in 2019, the number of smartphone users in rural areas surpassed that in urban India. There is a burgeoning market for digital products, going well beyond the borders of urban pockets. However, less than 1% of content on the Internet is in Indian languages.