Natural Language Processing (NLP), the ability of a software program to understand human language as it is spoken, has seen major breakthroughs thanks to Artificial Intelligence (AI) and improved access to fast processors and cloud computing. With the spread of personal assistants, better smartphone functionality, and the use of Big Data to automate ever more routine human jobs, NLP adoption is projected to gain steam in the coming years. SoundHound creates voice-enabled AI and conversational intelligence systems. It offers a Speech-to-Meaning engine as well as Deep Meaning Understanding technology, which can be integrated into other services and devices, and it also builds music recognition apps and voice search assistants.
Listen to this episode from Tcast on Spotify. Voice recognition software is getting better and better. Once upon a time it was extremely clunky and unreliable, and even the best systems required you to spend far too much time training them while speaking extremely slowly and enunciating like…well…like a computer. At last, though, the systems have improved to the point where talk-to-text features can accurately convey meaning without you having to clarify every other word. In fact, I know a trucker who does most of his communication using talk to text on his phone. It makes a few mistakes here and there, but its accuracy is still pretty impressive considering he's speaking normally while in a large moving vehicle.

Then there are the voice assistants on our phones. Whether you talk to Siri, Alexa, or Cortana (all four of you, you know who you are), that voice recognition starts out needing a little training, but nothing like it used to. And the more you use it to look up local restaurants, find a factoid to settle an argument, or book a hotel room, the more accurate it gets. Now these assistants are even in many homes, listening constantly for you to need their assistance with something – everything from dimming the lights to spinning up your favorite playlist on Spotify.

The improvements in this software hold a lot of potential. It has been used for years in business to accommodate employees who may not be able to speak clearly or who have lost the use of their arms. It is also a much more efficient way to record information than the increasingly dated keyboard. Typing is inherently inefficient, creating the possibility of misspellings that need to be corrected lest they convey an unintended meaning, and it requires a keyboard, which adds space, weight, and money to your computer. As voice recognition software improves, the keyboard can be replaced with a simple microphone – probably the one on your phone.
Imagine being able to compose messages for business, a book, or notes on a law case and have them all transcribed reliably, without having to take the time to proofread them. The time savings would be impressive. Or take a more mundane situation: you're sitting at home with a craving for pizza, but you can't quite remember the name of the place you ordered from last month. You throw the question out into the air, and your device reminds you of the name and the price, then asks if you'd like it to order a pizza for you. If you think about it, Alexa and other smart devices are only a step or two away from that level of functionality.

Another use would be in hospitals. Embedded microphones could record conversations with your doctor, highlighting the key points and capturing all of the important information. This would save time and increase efficiency in a number of ways. No longer would nurses and admins have to spend hours on data entry, with all the potential transcription errors that entails. Incidentally, that would also save you from having to answer the same questions three times every time you go in for a checkup. It also means no one – or at least very few people – has to come in contact with the Petri dishes known as keyboards in an environment that should be kept as sterile as possible. Lectures and presentations could be recorded and transcribed instantly, making information readily available in real time.

The possibilities are enormous. Yet there are potential problems, namely: who owns all that data being generated and recorded? Is it the place where the recording happens? The place where it is stored? Some other party? At TARTLE, we believe all the data you generate is yours. So if it's your information and your data being recorded, then you deserve to be the primary beneficiary of sharing it – or of deciding whether you want to share that data at all.
These are questions that will be addressed sooner or later in the legislative realm, which is why we are encouraging people to sign up at tartle.co and join the TARTLE movement. Together we can help steer that eventual legislation in a direction that benefits not just a few, but every person who works to generate that data in the first place. What's your data worth? www.tartle.co
Current computational-emotion research has focused on applying acoustic properties to analyze how emotions are perceived mathematically, or on using them in natural language processing machine learning models. With most recent interest centered on analyzing emotions in the spoken voice, little experimentation has been done on how emotions are recognized in the singing voice -- in both noiseless and noisy data (i.e., data that is inaccurate, difficult to interpret, corrupted or distorted -- including actual noise sounds, in this case -- or that has a low ratio of usable to unusable information). Not only does this ignore the challenges of training machine learning models on more subjective data and testing them on much noisier data; there is also a clear disconnect between progress in convolutional neural networks and the goal of emotionally cognizant artificial intelligence. By training a new model on this type of information with a rich comprehension of psycho-acoustic properties, not only can models be trained to recognize information within extremely noisy data, but advances can be made toward more complex biofeedback applications -- including a model that could recognize emotions from any human signal (language, breath, voice, body, posture) and be used in any performance medium (music, speech, acting) or in psychological assistance for patients with disorders such as BPD, alexithymia, and autism, among others. This paper seeks to reflect and expand upon the findings of related research and to present a stepping-stone toward this end goal.
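To make "noisy data" concrete: in audio work, noise level is usually quantified as a signal-to-noise ratio (SNR) in decibels, and noisy test sets are often built by mixing noise into clean recordings at a target SNR. The sketch below is a minimal illustration of that mixing step, assuming NumPy; the 440 Hz tone and white-noise stand-in are invented for illustration and are not from the research described above.

```python
import numpy as np

def add_noise_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal + noise has the requested SNR in dB."""
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  solve for the noise power
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_noise_power / noise_power)
    return signal + scaled_noise

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16000, endpoint=False)   # 1 second at 16 kHz
clean = np.sin(2 * np.pi * 440 * t)            # a clean 440 Hz tone
noise = rng.standard_normal(t.shape)           # white-noise stand-in
noisy = add_noise_at_snr(clean, noise, snr_db=5.0)
```

Lower `snr_db` values (0 dB, -5 dB) bury the tone progressively deeper in noise, which is how a model's robustness to "extremely noisy data" is typically stress-tested.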
My friend Robert Burton, a neurologist and author, wanted to share a song with me last year, and sent me a link to an NPR Tiny Desk Concert. "It's wonderful to see truly new and inspiring music," he wrote. I clicked open the link to a band who appeared to have journeyed from their mountain village in Russia to busk for tourists in the city square. Three women wore long white wedding dresses, thick strands of bead necklaces, and Cossack hats that towered from their heads like minarets of black wool. They played, respectively, a cello, djembe drum, and floor tom drum. They were joined by an accordion player who could pass for a bearded hipster from Brooklyn. The accordionist was the first to sing. A bray of syllables erupted from him like an exorcism. A steady drumbeat followed and then the women commanded the singing. Their vocals ranged from yodels to yips, whoops to whispers. At first turbulence reigned, as if the women were singing different songs at each other. But soon their voices blended into a melody that curled like a river.
Arriving a symbolic and symmetric 27 years after Kurt Cobain died at the age of 27, a "new" Nirvana song has been released. What makes "Drowned In The Sun" very different from "You Know You're Right" – the last track Nirvana recorded, in 1994, but which was not released until 2002 – is that Cobain did not write it and no members of Nirvana played on it. The track was created using artificial intelligence (AI) software that analyzed a number of Nirvana tracks in order to mimic the band's writing, recording and lyrical styles, drawing on vocals by Eric Hogan, lead singer of Nevermind, a Nirvana tribute act. Such digital necromancy comes with a whole host of moral, ethical and musical concerns, but in this case it is part of the Lost Tapes Of The 27 Club project, which raises awareness of mental health issues in music. The 27 Club refers to that mythologized grouping of musicians who all died at the age of 27.
Yesterday (5th) marked the twenty-seventh anniversary of Nirvana frontman/guitarist Kurt Cobain's death. He was twenty-seven years old. A project called Lost Tapes of the 27 Club has released a "new" Nirvana song titled "Drowned In The Sun." One could assume the song was released after old recordings were discovered. However, the track was written by artificial intelligence and was released to raise awareness of mental health issues in the music industry.
Fans of Nirvana may do a double-take when they hear 'Drowned in the Sun,' a new song created by artificial intelligence that simulates the songwriting of late grunge legend Kurt Cobain. Engineers fed Nirvana's back catalog to Google's AI program, Magenta, which analyzed it for recurring components and then developed an entirely new track. The voice on 'Drowned in the Sun' is 100 percent human, though--provided by Eric Hogan, lead singer of the Atlanta Nirvana cover band Nevermind. The song is just one release from The Lost Tapes of the 27 Club, a project developed by the nonprofit Over the Bridge, which spotlights mental health issues in the music industry. Other AI-generated 'lost' tracks have taken their cue from Jim Morrison, Jimi Hendrix and Amy Winehouse, who, like Cobain, died at age 27.
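The "analyze a catalog for recurring components, then generate something new" idea can be illustrated on a tiny scale without Magenta's actual models (whose internals the article doesn't detail). The toy sketch below learns note-to-note transition statistics from an invented "back catalog" and samples a fresh melody from them -- a Markov-chain stand-in, not the real project's pipeline, and all note names here are made up for illustration.

```python
import random
from collections import defaultdict

# Invented stand-in for a back catalog: a short sequence of note names.
corpus = ["E", "G", "E", "G", "A", "G", "E", "D",
          "E", "G", "A", "C", "A", "G", "E"]

# "Analysis" step: count which note tends to follow which.
transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

def generate_melody(start, length, seed=0):
    """Sample a new sequence by walking the learned transitions."""
    rng = random.Random(seed)
    seq = [start]
    for _ in range(length - 1):
        choices = transitions.get(seq[-1]) or [start]  # restart on a dead end
        seq.append(rng.choice(choices))
    return seq

melody = generate_melody("E", 12)
```

Every adjacent pair in the output was observed somewhere in the corpus, so the result sounds "in style" while being a sequence that never appeared verbatim -- the same intuition, at toy scale, behind generating a stylistically plausible "new" track from a band's catalog.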
So, you recently bought or were gifted an Echo, Echo Dot, or another Echo device, and it's sitting in your kitchen, silently awaiting your next order. Before you can ask your Alexa-powered Echo to play your favorite Spotify playlist or to turn on your living room lights, you'll need to tweak a few key settings. Get the scoop on how to train Alexa to recognize your voice, keep her from letting just anyone buy stuff on Amazon, tell her where you live and work, and more. As soon as your new Echo is up and running, Alexa can start answering your questions and doing your bidding. That said, it's a good idea to help Alexa get accustomed to your voice as soon as possible.
If it ain't broke, don't fix it. That seems to be the idea behind the second-gen Google Nest Hub, which--at first glance, anyway--could easily be mistaken for the original. The new Nest Hub keeps the original's slim, fabric-covered base and "floating" seven-inch display, along with onboard Google Assistant, an attractive, easy-to-use interface, plenty of video, music, and other entertainment options, and some of the best home-automation functionality you'll find in a smart display. Sounds like a yawner, right? Well, that's literally true when it comes to the Nest Hub deux's major new feature: Sleep Sensing, an opt-in feature that lets the display monitor your sleep without the need for a wristband. Powered by a tiny built-in radar, the Nest Hub can actually sense your breathing as you slumber, and it'll give you detailed reports on your sleep history and quality.
Art has long been considered the exclusive domain of human creativity, but it turns out machines can do a lot more in the creative realm than we humans imagined. In October 2018, Christie's sold its first AI-generated painting for $432,500. Titled Edmond de Belamy, the artwork had been expected to sell for around $10,000. The art collective Obvious created the piece using a Generative Adversarial Network (GAN), feeding the system 15,000 portraits painted between the 14th and 20th centuries.
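A GAN works by pitting two models against each other: a generator that produces samples and a discriminator that tries to tell generated samples from real ones, with each update making the other's job harder. The NumPy sketch below is a deliberately tiny illustration of that adversarial loop -- single linear units on one-dimensional data instead of neural networks on portraits. Everything here (the N(4, 1) "real" distribution, learning rate, step count) is invented for illustration and has no connection to Obvious's actual system.

```python
import numpy as np

rng = np.random.default_rng(0)

def real_batch(n):
    # "Real" samples from N(4, 1) stand in for the portrait dataset.
    return rng.normal(4.0, 1.0, size=(n, 1))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy "networks": a single linear unit each.
Wg, bg = rng.normal(size=(1, 1)), np.zeros(1)  # generator params
Wd, bd = rng.normal(size=(1, 1)), np.zeros(1)  # discriminator params

def generate(z):
    return z @ Wg + bg            # generator: latent noise -> sample

def discriminate(x):
    return sigmoid(x @ Wd + bd)   # discriminator: sample -> P(real)

lr, n = 0.05, 64
for _ in range(300):
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    z = rng.normal(size=(n, 1))
    real, fake = real_batch(n), generate(z)
    grad_real = discriminate(real) - 1.0   # dBCE/dlogit for label 1
    grad_fake = discriminate(fake)         # dBCE/dlogit for label 0
    Wd -= lr * (real * grad_real + fake * grad_fake).mean(axis=0, keepdims=True).T
    bd -= lr * (grad_real + grad_fake).mean(axis=0)
    # Generator step: push D(G(z)) toward 1 (non-saturating loss).
    z = rng.normal(size=(n, 1))
    g_logit = (discriminate(generate(z)) - 1.0) * Wd[0, 0]  # chain rule through D
    Wg -= lr * (z * g_logit).mean(axis=0, keepdims=True).T
    bg -= lr * g_logit.mean(axis=0)

samples = generate(rng.normal(size=(256, 1)))
```

The real system replaces these linear units with deep convolutional networks and the 1-D Gaussian with 15,000 portrait images, but the alternating two-player update is the same idea.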