"Automatic speech recognition (ASR) is one of the fastest growing and commercially most promising applications of natural language technology. Speech is the most natural communicative medium for humans in many situations, including applications such as giving dictation; querying database or information-retrieval systems; or generally giving commands to a computer or other device, especially in environments where keyboard input is awkward or impossible (for example, because one's hands are required for other tasks)."
– from Andreas Stolcke, "Linguistic Knowledge and Empirical Methods in Speech Recognition," AI Magazine 18 (4): 25–32, 1997.
LivePerson, Inc., a global leader in Conversational AI, announced two major strategic acquisitions: VoiceBase, a leader in real-time speech recognition and conversational analytics; and Tenfold, the world's most advanced customer engagement platform for integrating communication systems with leading CRM and support services. Through these acquisitions, three powerful technologies combine to create a unified, AI-enabled system for customer experience: VoiceBase's superior speech recognition and analytics capabilities, Tenfold's advanced voice, messaging, and CRM integrations, and LivePerson's industry-leading Conversational AI and asynchronous messaging. Brands can now enable natural, conversational consumer experiences that carry context and continuity across all channels, powered through a single automated voice and messaging desktop experience. Acquiring VoiceBase and Tenfold accelerates LivePerson's vision to help brands gain complete ownership and visibility over engagements in the channels customers care about, inclusive of voice and messaging. These companies bring voice intelligence and AI technologies to support LivePerson's upcoming voice capabilities within its world-class conversational AI messaging platform. "Brands want to accelerate their use of Voice and Conversational AI with deep connective tissue into their systems," said Rob LoCascio, founder and CEO of LivePerson.
Google has activated a safety feature that lets minors under 18 request that images of themselves be removed from search results, The Verge has reported. Google first announced the option back in August as part of a slate of new safety measures for kids, but it's now rolling out widely to users. Google said it will remove any images of minors "with the exception of cases of compelling public interest or newsworthiness." The requests can be made by minors, their parents, guardians or other legal representatives. To do so, you'll need to supply the URLs you want removed, the name and age of the minor and the name of the person acting on their behalf.
Recently I started playing with the scripts of YouTube videos, which are uploaded by creators along with the videos or transcribed automatically by the website through speech recognition systems. Currently, I'm looking for ways to display the content of a video graphically, so that I can quickly explore its contents without having to watch it all. In the longer term, my goal is to set up a full "video script explorer" that you can use online to quickly see what the different sections of a video talk about. Stay tuned, because this promises to be a fun project, and maybe a useful one too! For the moment I have made some interesting progress that I will share here. It's all manual steps, so for now there is no code.
Intelligence agency GCHQ has signed a deal with Amazon Web Services (AWS) to host classified material and boost the use of artificial intelligence for espionage purposes. Although the procurement of cloud infrastructure from AWS was signed off by GCHQ, it will also be used by sister spy services MI5 and MI6, and the Ministry of Defence during joint operations, according to the Financial Times. The deal had not been made public and was signed earlier this year, according to the report. It is worth £500m to £1bn over the next decade, FT sources said. In a February opinion piece for the Financial Times, GCHQ director Jeremy Fleming said that the agencies "expect AI to be at the heart of this transformation and we want to be transparent about its use."
Although great progress has been made in automatic speech recognition (ASR), significant performance degradation still exists in very noisy environments. Over the past few years, Chinese startup AISpeech has been developing very deep convolutional neural networks (VDCNN), a new architecture the company recently began applying to ASR use cases. Unlike traditional deep CNN models for computer vision, VDCNN features novel filter designs, pooling operations, input feature map selection, and padding strategies, all of which lead to more accurate and robust ASR performance. Moreover, VDCNN is further extended with adaptation, which can significantly alleviate the mismatch between training and testing conditions. Factor-aware training and cluster-adaptive training are explored to fully exploit the environmental variety and quickly adapt model parameters.
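To make the filter, pooling, and padding operations mentioned above concrete, here is a minimal NumPy sketch of the two basic building blocks such CNN acoustic models stack many times over. This is purely illustrative (the function names are invented, and it is not AISpeech's VDCNN implementation): a "same"-padded 2-D convolution over a feature map, followed by 2×2 max pooling.

```python
import numpy as np

def conv2d_same(x, k):
    """2-D convolution of feature map x (H, W) with filter k (kh, kw),
    using zero ("same") padding so the output keeps x's shape.
    Illustrative sketch only -- not AISpeech's VDCNN code."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))          # zero-pad the borders
    out = np.empty(x.shape, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def max_pool2(x):
    """2x2 max pooling with stride 2 (odd trailing rows/cols truncated),
    halving the time-frequency resolution of the feature map."""
    H, W = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:H, :W]
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))
```

A very deep model chains dozens of such small-filter convolutions before each pooling step; the padding choice is what keeps the feature-map size stable through that chain.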
The Pixel 6 is the most intriguing phone Google has made in years. Not only is it a return to premium design with eye-catching colors and up to a 120Hz screen, it's also powered by the company's first mobile processor -- Tensor. With it, Google is promising serious improvements in AI performance and photography, including better voice recognition and Assistant features. Google also finally upgraded the Pixel's camera hardware instead of just relying on its processing smarts. That's not to say Google has overlooked software this year.
Far from the stuff of fantasy, artificial intelligence (AI) has become an integral part of our lives. Even the most tech-averse among us use AI, perhaps unknowingly, when we type a query into Google or plug a destination into GPS. Those who embrace technology, on the other hand, actively look for ways AI can improve their work and personal lives. Though it seems AI is a new phenomenon, the technology has been around since 1956. While AI's popularity has waxed and waned, it gained legitimacy in the 1990s and 2000s when a chess computer program beat the grand chess master Garry Kasparov and speech recognition software was installed on Windows.
Artificial intelligence (AI) is intelligence exhibited by machines. In computer science AI research is defined as the study of intelligent agents: any device that perceives its environment and takes actions that maximize its chance of success at some goal. Colloquially, the term artificial intelligence is applied when a machine mimics cognitive functions that humans associate with other human minds, such as learning and problem solving. AI can be categorized into subfields that focus on specific problems or tasks, such as machine learning, perception, speech recognition, planning or robotic operating systems. General-purpose artificial intelligences are still hypothetical.
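The intelligent-agent definition above can be sketched as a toy program: a hypothetical ThermostatAgent (name and scenario invented for illustration) that perceives its environment as a temperature reading and chooses the action most likely to achieve its goal.

```python
class ThermostatAgent:
    """Toy illustration of an 'intelligent agent': it perceives its
    environment (a temperature) and takes the action that best serves
    its goal (holding the target temperature). Purely hypothetical."""

    def __init__(self, target):
        self.target = target  # the goal the agent tries to maximize success at

    def act(self, temperature):
        # Perceive the environment, then pick the goal-directed action.
        if temperature < self.target - 1:
            return "heat"
        if temperature > self.target + 1:
            return "cool"
        return "off"
```

Even this trivial agent fits the definition: it maps a percept to an action chosen with respect to a goal, though it is far from the general-purpose intelligence the passage notes remains hypothetical.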
Google has augmented and widened access to some of YouTube's audio AI-based features. The update extends auto-captioning to any YouTube channel and automatic caption translation to mobile devices, and it lays out plans for even broader use of the platform's speech recognition and translation technology. The most notable immediate change is that YouTube has ended the 1,000-subscriber minimum for enabling live auto captions. The limit may have existed to encourage the promotion of YouTube channels, or out of concern for limited computing resources, but either way it no longer applies. Auto captions will also be available in more languages soon, upping the accessibility of non-English content on YouTube.
The growth of Industry 4.0 has greatly emphasized the development of Artificial Intelligence (AI)-related infrastructure at different levels of the value chain and business cycle, where AI has increasingly contributed to the betterment of human life (Cao et al., 2021; Coombs, 2020; Grover et al., 2020; Sipior, 2020). Minsky (1968, p. v) defines AI as "the science of making machines do things that would require intelligence if done by men". The emergence of voice assistants has enabled organisations to develop systems and processes where human interaction with AI has become the norm (Bawack et al., 2021; Hu et al., 2021). The first voice assistant, named "Shoebox", was introduced by IBM at the Seattle World's Fair in 1962. Apple has been working on voice assistants since 1990 and piloted the technology with Macintosh PlainTalk in 1993.