 voice engine


21-year-old whose speech was impaired by tumor has voice replicated through AI smartphone app

FOX News

The voice Alexis "Lexi" Bogan had before last summer was exuberant. She loved to belt out Taylor Swift and Zach Bryan ballads in the car. She laughed all the time -- even while corralling misbehaving preschoolers or debating politics with friends over a backyard fire pit. In high school, she was a soprano in the chorus.


OpenAI debuts voice cloning tool, but deems it too risky for public release

Al Jazeera

OpenAI has unveiled a tool for cloning people's voices but is holding back on its public release due to concerns about possible misuse in a key election year. Voice Engine can replicate a person's voice based on a 15-second audio sample, according to an OpenAI blog post demonstrating the tool. But the ChatGPT creator is "taking a cautious and informed approach" to the technology and hopes to start a dialogue on "the responsible deployment of synthetic voices", the company said in the blog post published on Friday. "We recognize that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year," the San Francisco-based start-up said. "We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build."


OpenAI Can Re-Create Human Voices--but Won't Release the Tech Yet

WIRED

Voice synthesis has come a long way since 1978's Speak & Spell toy, which once wowed people with its state-of-the-art ability to read words aloud using an electronic voice. Now, using deep-learning AI models, software can create not only realistic-sounding voices but can also convincingly imitate existing voices using small samples of audio. Along those lines, OpenAI this week announced Voice Engine, a text-to-speech AI model for creating synthetic voices based on a 15-second segment of recorded audio. It has provided audio samples of the Voice Engine in action on its website. This story originally appeared on Ars Technica, a trusted source for technology news, tech policy analysis, reviews, and more.


OpenAI says it can clone a voice from just 15 seconds of audio

Engadget

OpenAI just announced that it recently conducted a small-scale preview of a new tool called Voice Engine. This is a voice cloning technology that can mimic any speaker by analyzing a 15-second audio sample. The company says it generates "natural-sounding speech" with "emotive and realistic voices." The technology is based on the company's pre-existing text-to-speech API and it has been in the works since 2022. OpenAI has already been using a version of the toolset to power the preset voices available in the current text-to-speech API and the Read Aloud feature. There are a bunch of samples on the company's official blog and they sound eerily close to the real thing.


Val Kilmer's Top Gun: Maverick dialog was all AI since he can no longer speak

#artificialintelligence

Top Gun: Maverick has proven to be a massive success for Tom Cruise, Paramount Pictures, and everyone involved. If you've watched the movie by now, you'll probably agree with most of us that it's an excellent follow-up to the original film from 1986. What you might not know is that Val Kilmer's voice in the movie was brought to life with voice AI. When the original Top Gun was released in 1986, Val Kilmer and Tom Cruise's chemistry on-screen as Iceman and Maverick was an instant hit. Revisiting that story without Kilmer's Iceman would have been disappointing for many fans and even for Kilmer himself.


Why Custom Language Models (CLMs) are Needed in Speech Recognition for Kids

#artificialintelligence

Welcome back to "Lessons from Our Voice Engine," where members of our Engineering and Speech Tech teams offer high-level insights into how our voice engine works. Lesson 2 is from Lora Lynn Asvos, a Computational Linguist on our Speech Tech team. CLM stands for "custom language model." As mentioned in Lesson 1, language models are statistical models of language that can predict the next word based on the context. CLMs are language models, as the name implies, but they have a little something extra.


AI voice actors sound more human than ever--and they're ready to hire

MIT Technology Review

WellSaid Labs describes what clients can expect from its "eight new digital voice actors!" Tobin is "energetic and insightful." Paige is "poised and expressive." Ava is "polished, self-assured, and professional." Each one is based on a real voice actor, whose likeness (with consent) has been preserved using AI.