Consumers now experience AI mostly through image recognition that helps categorize digital photographs and speech recognition that helps power digital voice assistants such as Apple Inc's Siri or Amazon.com But at a press briefing in San Francisco two days before Ng's Landing.ai In many factories, workers look over parts coming off an assembly line for defects. Ng showed a video in which a worker instead put a circuit board beneath a digital camera connected to a computer, and the computer identified a defect in the part. Ng said that while typical computer vision systems might require thousands of sample images to become "trained," Landing.ai's
We present the recent advances along with an error analysis of the IBM speaker recognition system for conversational speech. Some of the key advancements that contribute to our system include: a nearest-neighbor discriminant analysis (NDA) approach (as opposed to LDA) for intersession variability compensation in the i-vector space, the application of speaker- and channel-adapted features derived from an automatic speech recognition (ASR) system for speaker recognition, and the use of a DNN acoustic model with a very large number of output units (~10k senones) to compute the frame-level soft alignments required in the i-vector estimation process. We evaluate these techniques on the NIST 2010 SRE extended core conditions (C1-C9), as well as the 10sec-10sec condition. To our knowledge, the results achieved by our system represent the best performance published to date on these conditions. For example, on the extended tel-tel condition (C5) the system achieves an equal error rate (EER) of 0.59%. To better understand the remaining errors on C5, we examine the recordings associated with the low-scoring target trials and identify various issues with the problematic recordings and trials. Interestingly, correcting the pathological recordings improves the scores not only for the target trials but also for the nontarget trials.
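For readers unfamiliar with the EER figure quoted in the abstract above, the following is a minimal sketch of how an equal error rate can be computed from trial scores. It is not the IBM system's evaluation code or the official NIST scoring tool. The function name `equal_error_rate`, the convention that a higher score means "more likely the same speaker," and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) for two lists of trial scores.

    Assumption: higher scores mean the system believes the trial
    is a target (same-speaker) trial. This is an illustrative
    O(n^2) sweep, not an optimized or official scorer.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    # Candidate thresholds: every distinct score that was observed.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))

    best = None  # (gap between error rates, eer estimate, threshold)
    for t in thresholds:
        # Miss rate: target trials rejected because they score below t.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: nontarget trials accepted at or above t.
        fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - fa)
        if best is None or gap < best[0]:
            # Where the two rates do not cross exactly, report their mean.
            best = (gap, (miss + fa) / 2, t)

    return best[1], best[2]


# Toy scores, invented purely for this example.
target = [2.0, 3.0, 4.0, 5.0]
nontarget = [-1.0, 0.0, 1.0, 2.5]
eer, threshold = equal_error_rate(target, nontarget)
print(f"EER = {eer:.2%} at threshold {threshold}")
# prints "EER = 25.00% at threshold 2.5"
```

The EER is the operating point at which the miss rate and the false-alarm rate are equal, so a single number summarizes both error types. With finite trial lists the two rates rarely coincide exactly, which is why this sketch takes the threshold where they are closest and averages them.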
Amazon's 2016 has been record-breaking on many fronts. The company recorded its sixth consecutive quarterly profit (previously, it mostly hemorrhaged cash). Meanwhile, this year marked Amazon's growing strength in hardware with its hit home automation hub, the Amazon Echo, and its companion voice assistant, Alexa. The company has also become a force in entertainment, debuting a line of hit original shows through its Amazon Video Prime service. It's hard to imagine how Amazon could top 2016, but here are some likely moves by the Seattle-based Goliath in 2017: To save money over the past year, Amazon has been seeking to take over more shipping duties from the likes of UPS and FedEx by leasing trucks, planes, and ships.
To learn more about conversational AI, check out Yishay Carmiel's session "Applications of neural-based models for conversational speech" at the Artificial Intelligence Conference in San Francisco, Sept. 17-20, 2017. The dream of speech recognition is a system that truly understands humans speaking in different environments, with a variety of accents and languages. For decades, people tackled this problem with no success. Pinpointing effective strategies for creating such a system seemed impossible. In the past few years, however, breakthroughs in AI and deep learning have changed everything in the quest for speech recognition.
Last year, Apple faced several controversies over its speech recognition technology. For quality control of its voice assistant Siri, Apple had contractors regularly listen to confidential voice recordings under the "Siri Grading Program". The company later apologised and published a statement announcing changes to the program. This year, the tech giant has put a number of researchers to work on speech recognition technology to upgrade its voice assistant. Recently, researchers at Apple developed an AI model that can perform automatic speech transcription and speaker recognition simultaneously.