"Automatic speech recognition (ASR) is one of the fastest growing and commercially most promising applications of natural language technology. Speech is the most natural communicative medium for humans in many situations, including applications such as giving dictation; querying database or information-retrieval systems; or generally giving commands to a computer or other device, especially in environments where keyboard input is awkward or impossible (for example, because one's hands are required for other tasks)."
– from Linguistic Knowledge and Empirical Methods in Speech Recognition. By Andreas Stolcke. (1997). AI Magazine 18 (4): 25-32.
Amazon Web Services on Tuesday announced new capabilities for three of its AI services -- the text-to-speech service Amazon Polly, the real-time translation service Amazon Translate, and the multi-language transcription service Amazon Transcribe. The expanded capabilities follow a series of similar announcements made recently, all in advance of the annual AWS re:Invent conference. Last year's re:Invent conference was used to roll out a slew of new services, with many bringing customers new machine learning capabilities -- including Amazon Translate and Amazon Transcribe. AI and machine learning are quickly moving from a competitive advantage in the cloud to table stakes, so it makes sense for AWS to improve its existing services ahead of this year's conference. Specifically, Amazon is announcing support for 14 new languages, distinct accents and voices across Polly, Translate and Transcribe.
You can certainly use Amazon's Alexa to turn on some devices, but that support is frequently limited. What if a device is in a low-power state and won't respond to your hue and cry? Amazon now has a solution. It recently added a "wake-on-LAN" control method that can turn on sleeping connected gadgets in the home that otherwise won't respond to voice control. Device makers just need to craft Alexa skills that use the new control to have it turn on TVs and other hardware on the local network.
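For readers unfamiliar with wake-on-LAN itself, the following is a minimal sketch of the generic "magic packet" that the mechanism relies on: six 0xFF bytes followed by the target's MAC address repeated 16 times, usually sent as a UDP broadcast. It does not show Amazon's skill interface. The function names, the default broadcast address and the choice of port 9 are assumptions made for illustration.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    if len(mac_bytes) != 6:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    # Magic packet layout: 6 bytes of 0xFF, then the MAC repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str,
                      broadcast: str = "255.255.255.255",  # assumed default
                      port: int = 9) -> None:              # 9 is a common choice
    """Broadcast the magic packet over UDP on the local network."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be enabled explicitly on the socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_magic_packet("00:11:22:33:44:55")` with a placeholder MAC would broadcast the packet. Whether the device actually wakes depends on its network adapter and firmware having wake-on-LAN enabled.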
Bottos will soon offer great opportunities to support the development of Artificial Intelligence, with the most important step being the data and model marketplace. Thanks to the underlying blockchain infrastructure and other tools like smart contracts, users will be able to monetize their efforts to produce, clean, and ultimately sell their data safely and conveniently. Bottos will be a great companion to everyone involved in the development of AI models and programs. Karen is a computer scientist who cares greatly for her grandmother, who is increasingly fragile and in need of assistance. While driving, Karen comes up with an interesting idea about an image and speech recognition system that, with the right development, may help seniors live longer in their own homes, autonomously, without moving to a retirement home, while limiting the use of costly nursing services.
Transcription services can save you time and effort and, if you're anything like me, allow you to avoid having to listen to your own voice -- a concept many of us find cringeworthy. While there are many manual services out there in which you outsource transcription tasks and have someone else manually type out conversations or interviews, in recent times, automatic transcription services have begun to appear online. Automatic services often promise results in a fraction of the time that uploading, sending, and waiting for manual transcriptions require, but are all created equal? In order to find out, ZDNet tested a total of six auto-transcription offers online. For each test, the same audio file was used: a 15-minute recording of an interview I had undertaken with a researcher concerning cybersecurity, ransomware, and botnets.
Smartphone maker Huawei is planning on taking its popular voice assistant outside of China and competing with Amazon, Google and Apple internationally, according to a report from CNBC. The Chinese technology firm is apparently working on a version of its voice assistant Xiaoyi that will work outside of China, though it hasn't revealed what languages the AI will speak, nor when it will be available for other markets. Prior to developing Xiaoyi, Huawei was reliant on third-party voice assistants including Google Assistant and Amazon Alexa. The company's first smart speaker, the AI Cube, relied on Alexa. Huawei has been building on its own voice assistant in recent months.
Despite showing state-of-the-art performance, deep learning for speech recognition remains challenging to deploy in on-device edge scenarios such as mobile and other consumer devices. Recently, there have been greater efforts in the design of small, low-footprint deep neural networks (DNNs) that are more appropriate for edge devices, with much of the focus on design principles for hand-crafting efficient network architectures. In this study, we explore a human-machine collaborative design strategy for building low-footprint DNN architectures for speech recognition through a marriage of human-driven principled network design prototyping and machine-driven design exploration. The efficacy of this design strategy is demonstrated through the design of a family of highly-efficient DNNs (nicknamed EdgeSpeechNets) for limited-vocabulary speech recognition. Experimental results using the Google Speech Commands dataset for limited-vocabulary speech recognition showed that EdgeSpeechNets have higher accuracies than state-of-the-art DNNs (with the best EdgeSpeechNet achieving ~97% accuracy), while achieving significantly smaller network sizes (as much as 7.8x smaller) and lower computational cost (as much as 36x fewer multiply-add operations, 10x lower prediction latency, and 16x smaller memory footprint on a Motorola Moto E phone), making them very well-suited for on-device edge voice interface applications.
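The efficiency figures in this abstract (network size and multiply-add operations) are typically obtained by summing per-layer counts. The sketch below shows how such counts are commonly computed for a single 2D convolutional layer and how a reduction factor follows from them. The helper names are hypothetical, the layer shapes in the usage note are illustrative, and none of it is taken from the EdgeSpeechNet architectures.

```python
def conv2d_params(kernel_h: int, kernel_w: int, in_ch: int, out_ch: int,
                  bias: bool = True) -> int:
    """Number of trainable parameters in one 2D convolutional layer."""
    weights = kernel_h * kernel_w * in_ch * out_ch
    return weights + (out_ch if bias else 0)


def conv2d_multiply_adds(kernel_h: int, kernel_w: int, in_ch: int, out_ch: int,
                         out_h: int, out_w: int) -> int:
    """Multiply-add operations for one forward pass through the layer.

    Each output element needs kernel_h * kernel_w * in_ch multiply-adds,
    and there are out_ch * out_h * out_w output elements.
    """
    return kernel_h * kernel_w * in_ch * out_ch * out_h * out_w


def reduction_factor(baseline: int, compact: int) -> float:
    """How many times smaller (or cheaper) the compact design is."""
    return baseline / compact
```

For example, `reduction_factor(conv2d_multiply_adds(3, 3, 32, 32, 10, 10), conv2d_multiply_adds(3, 3, 16, 16, 10, 10))` compares two hypothetical layers that differ only in channel width. Halving both channel counts cuts the multiply-adds by a factor of four.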
Voice recognition software giant Nuance on Monday announced that it's selling its document imaging business to Kofax for $400 million. Kofax is a supplier of intelligent automation software and plans to use the complementary product line from Nuance to expand the functionality of its portfolio. Specifically, Kofax CEO Reynolds Bish said the purchase will add key technologies such as cloud compatibility, scan-to-archive, scan-to-workflow, print management and document security to Kofax's core product. "In addition we will now be able to combine the best capture and print management capabilities available in the market into one product portfolio," he said. For Nuance, shedding the document imaging business -- its smallest business segment in terms of revenue -- should help the company focus resources on its more successful product lines.
Samsung apparently has enough confidence in its half-baked voice assistant, Bixby, that it plans to open it up to developers, according to the WSJ. At its San Francisco developer conference next week, it plans to roll out new features for the assistant and open it up completely to developers, much as Amazon and Google have done with Alexa and Google Assistant. It will reportedly show developers how they can create Alexa-like skills for ordering food or hailing rides, called "capsules." The main problem with Bixby is that just 6 percent of Americans use it, compared to 24 and 20 percent who use Alexa and Google Assistant, respectively. This is despite the fact that Bixby is built into numerous smartphones, TVs and even appliances that Samsung sells.
Facebook's Portal looked like a slick alternative to the Amazon Echo speaker when it launched earlier this month, but problems abounded behind the scenes. Facebook had already delayed the video-calling device due to privacy concerns around the Cambridge Analytica scandal. And when it finally did launch, there was a glaring omission: no voice assistant from Facebook. Instead it came with Alexa, meaning anyone who bought the 15.6-inch version for $350 got an awkward gateway to Amazon, whose competing Echo Show cost at least $100 less. It also meant Facebook was blocked from collecting any speech data to train its voice technology further.
As our British readers struggle with daylight savings (where "struggle" means an extra hour in bed), China's first private satellite launch did not go as planned, the original Wii remote prototype is going to auction and you can now control your Roku device through Google's voice assistant. Hey Google, save me the trouble of finding the remote. Roku's Google Assistant control is here. If you're using a TV or player running at least Roku OS 8.1, you can link the Google Home app to your Roku account and control core functions using only voice and an "on Roku" suffix. You can launch channels, search for shows and control playback on most devices, while TV owners can turn on the set, adjust volume or switch inputs.