pronounce
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech
Polyphone disambiguation aims to capture accurate pronunciation knowledge from natural text sequences for reliable text-to-speech (TTS) systems. However, previous approaches require substantial annotated training data and additional efforts from language experts, making it difficult to extend high-quality neural TTS systems to out-of-domain daily conversations and countless languages worldwide. This paper tackles the polyphone disambiguation problem from a concise and novel perspective: we propose Dict-TTS, a semantic-aware generative text-to-speech model with an online website dictionary (the existing prior information in the natural language). Specifically, we design a semantics-to-pronunciation attention (S2PA) module to match the semantic patterns between the input text sequence and the prior semantics in the dictionary and obtain the corresponding pronunciations. The S2PA module can be easily trained with the end-to-end TTS model without any annotated phoneme labels. Experimental results in three languages show that our model outperforms several strong baseline models in terms of pronunciation accuracy and improves the prosody modeling of TTS systems. Further extensive analyses demonstrate that each design in Dict-TTS is effective.
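As a rough illustration of the semantics-to-pronunciation idea described in the Dict-TTS abstract above, the following minimal Python sketch scores each dictionary sense of a word against a context vector with scaled dot-product attention and reads off the pronunciation of the best-matching sense. Everything in it is an assumption made for illustration: the `TOY_DICTIONARY` entries, the hand-written three-dimensional vectors, the ARPAbet-style pronunciation strings, and the function name `attend_pronunciation` are hypothetical and not taken from the paper, which trains its module end to end rather than relying on fixed vectors and a hard argmax.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical dictionary: each sense carries a toy "semantic" vector
# (hand-written, not learned) and a pronunciation string.
TOY_DICTIONARY = {
    "lead": [
        {"gloss": "to guide or direct", "vector": [1.0, 0.0, 0.2], "pron": "L IY D"},
        {"gloss": "a heavy metal", "vector": [0.0, 1.0, 0.1], "pron": "L EH D"},
    ]
}


def attend_pronunciation(word, context_vector, dictionary=TOY_DICTIONARY):
    """Return (pronunciation, attention_weights) for `word` in a given context.

    The context vector acts as the query; each sense vector acts as a key.
    The hard argmax at the end is a simplification for readability.
    """
    senses = dictionary[word]
    scale = math.sqrt(len(context_vector))
    scores = [dot(context_vector, s["vector"]) / scale for s in senses]
    weights = softmax(scores)
    best = max(range(len(senses)), key=lambda i: weights[i])
    return senses[best]["pron"], weights


# A context vector leaning toward the "metal" sense (assumed, not computed).
pron, weights = attend_pronunciation("lead", [0.1, 0.9, 0.0])
print(pron, [round(w, 3) for w in weights])
```

In a real system the context and sense vectors would come from a learned text encoder, and the attention weights would stay soft so that gradients from the speech loss can flow back through them.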
AI and the End of Accents
I sound Korean--because I am Korean. Can AI make me sound American? It all began, as these things often do, with an Instagram ad. "No one tells you this if you're an immigrant, but accent discrimination is a real thing," said a woman in the video. Her own accent is faintly Eastern European--so subtle it took me a few playbacks to notice.
- Asia > China (0.16)
- North America > United States > Ohio (0.05)
- North America > United States > New York (0.05)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
- Information Technology > Communications > Social Media (0.71)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)
Does 'scone' rhyme with 'gone' or 'cone'? MailOnline asks ChatGPT how to pronounce it
With the King's coronation happening tomorrow, millions of Britons across the UK will be getting their celebration picnics ready. No decent spread would be complete without scones slathered in clotted cream and jam, but the big question is - how do you pronounce 'scone'? While many people argue that the baked good should rhyme with 'cone', others are convinced that it should rhyme with 'gone'. To settle the debate once and for all, MailOnline turned to everyone's favourite AI bot, ChatGPT. But do you agree with its claims?
- Europe > United Kingdom > Scotland (0.07)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.07)
- Europe > United Kingdom > Northern Ireland (0.06)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.06)
Amazon Alexa settles the 'scone' debate for the Queen's Platinum Jubilee
With their crumbly texture and smeared with clotted cream and jam, scones are a favourite treat with Brits across the UK. But despite dating back to the early 1500s, one question remains – how do you pronounce the word 'scone'? Now, Amazon's smart assistant, Alexa, claims to have settled the debate, just in time for the Queen's Platinum Jubilee celebrations. Alexa claims 'scone' should rhyme with 'gone' rather than 'own' when speaking the Queen's English. Users just need to say 'Alexa, what's the correct way to pronounce scone?' to get the response: 'I pronounce it scone, to rhyme with gone, just like the Queen does.'
- Europe > United Kingdom > Scotland (0.06)
- Europe > United Kingdom > Northern Ireland (0.05)
- Europe > United Kingdom > England > Greater London > London (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Who Had the Best--and Worst--Italian Accent in House of Gucci? A Dialect Coach Dishes About Lady Gaga, Jared Leto, and More.
House of Gucci, in theaters this week, is ostensibly a drama about the family behind the Italian fashion house, but it is soon clear what the movie is really about: accents. It's a showcase for stars like Lady Gaga, Adam Driver, and Al Pacino to test-drive their Italian and Italian-accented English, and critical reactions have been mixed, to say the least: Lady Gaga was slammed by one expert for sounding more Russian than Italian, and Jared Leto earned comparisons to a certain cartoon plumber. How fair is all this grousing? To get an expert's perspective on the matter, Slate spoke to Garrett Strommen, who runs a Los Angeles company that offers language lessons and dialect coaching, among other services. Strommen has worked as a dialect coach and consultant for TV, movies, commercials, video games, and more, and agreed to explain exactly what is going on with Gaga and Leto in House of Gucci. Our conversation has been edited and condensed. Heather Schwedel: Can you tell me about your background with Italian?
- North America > United States > California > Los Angeles County > Los Angeles (0.24)
- Europe > Italy (0.05)
- North America > United States > New Jersey (0.04)
- Africa (0.04)
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Media > Music (0.82)
Fluent: An AI Augmented Writing Tool for People who Stutter - Technology Org
Stuttering is a disorder that negatively affects personal and professional life. One of the factors that may affect the likelihood of stuttering is phonological patterns. Some words are more prone to cause stuttering than others, and people who stutter (PWS) can identify which words they might struggle with and then think of a way to manage. Recent advancements in AI, such as phonetic embeddings, can help to simplify these processes. Therefore, a recent paper presents a novel machine-in-the-loop writing tool that helps PWS write scripts that minimize the number of stuttering events.
Fluent: An AI Augmented Writing Tool for People who Stutter
Stuttering is a speech disorder that impacts the personal and professional lives of millions of people worldwide. To save themselves from stigma and discrimination, people who stutter (PWS) may adopt different strategies to conceal their stuttering. One of the common strategies is word substitution, where an individual avoids saying a word they might stutter on and uses an alternative instead. This process itself can cause stress and add more burden. In this work, we present Fluent, an AI augmented writing tool that assists PWS in writing scripts they can speak more fluently. Fluent embodies a novel active learning based method of identifying words an individual might struggle to pronounce. Such words are highlighted in the interface. On hovering over any such word, Fluent presents a set of alternative words that have similar meaning but are easier to speak. The user is free to accept or ignore these suggestions. Based on such user interaction (feedback), Fluent continuously evolves its classifier to better suit the personalized needs of each user. We evaluated our tool by measuring its ability to identify difficult words for 10 simulated users. We found that our tool can identify difficult words with a mean accuracy of over 80% in under 20 interactions, and it keeps improving with more feedback. Our tool can be beneficial for certain important life situations, such as giving a talk or presentation. The source code for this tool has been made publicly accessible at github.com/bhavyaghai/Fluent.
- North America > United States > New York > Suffolk County > Stony Brook (0.05)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Information Technology > Human Computer Interaction (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Communications (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)
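The feedback loop described in the Fluent abstract above (flag words a user may struggle with, collect the user's response, update the classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the tool's actual method: the first-letter feature, the small logistic-regression model, the uncertainty-sampling query rule, and the simulated user who struggles with words starting with "s" or "p" are all hypothetical choices made here. The names `DifficultyClassifier`, `simulated_user`, and `run_feedback_loop` do not come from the Fluent code base.

```python
import math


def features(word):
    # Crude assumption: the first letter stands in for the word's leading sound.
    return {"first=" + word[0].lower(): 1.0, "bias": 1.0}


class DifficultyClassifier:
    """Tiny logistic-regression model trained with stochastic gradient steps."""

    def __init__(self, learning_rate=0.5):
        self.weights = {}
        self.learning_rate = learning_rate

    def predict_proba(self, word):
        """Estimated probability that the user finds `word` difficult."""
        z = sum(self.weights.get(k, 0.0) * v for k, v in features(word).items())
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, labeled, epochs=50):
        """Run several passes over all (word, label) feedback collected so far."""
        for _ in range(epochs):
            for word, label in labeled:
                error = label - self.predict_proba(word)
                for k, v in features(word).items():
                    step = self.learning_rate * error * v
                    self.weights[k] = self.weights.get(k, 0.0) + step


def simulated_user(word):
    # Hypothetical user: struggles with words that start with "s" or "p".
    return 1 if word[0].lower() in {"s", "p"} else 0


def run_feedback_loop(pool, model, oracle, rounds):
    """Active learning: query the word the model is least sure about, then refit."""
    unlabeled, labeled = list(pool), []
    for _ in range(min(rounds, len(unlabeled))):
        query = min(unlabeled, key=lambda w: abs(model.predict_proba(w) - 0.5))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
        model.fit(labeled)
    return model


pool = ["speech", "paper", "talk", "model", "story", "people", "data", "label"]
model = run_feedback_loop(pool, DifficultyClassifier(), simulated_user, rounds=len(pool))
for word in ["sample", "table"]:
    print(word, "flagged as difficult:", model.predict_proba(word) > 0.5)
```

Querying the most uncertain word first is one common active-learning heuristic; a phonetic representation richer than the first letter would be needed for anything beyond a toy demonstration.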
The Best Sci-Fi Comedy Is Existential
Tom Gerencer's book Intergalactic Refrigerator Repairmen Seldom Carry Cash features 19 pieces of humorous science fiction. Gerencer selected the stories out of literally hundreds that he's written over the past two decades. "If you go to Walmart, and you go into the section with the big Tupperware bins that you can put clothes and stuff in, I would just write and write and write, and fill a notebook with short stories--or fragments of short stories--and then I would put them into the bin, and then I would fill another notebook and put that in the bin, and fill another notebook, and now I have five or six bins in the basement, and there are several bins that I lost at some point," Gerencer says in Episode 473 of the Geek's Guide to the Galaxy podcast. "It is certainly an avalanche of words." With titles like "Trailer Trash Savior" and "Apocalyptic Nostrils of the Moon," you might expect the stories to be light-hearted, but Gerencer's work also contains a dark streak of existential angst, frequently dealing with questions such as: How can we be happy?
Does Google Assistant always say your name wrong? You can teach it to pronounce correctly
Does Google Assistant always say your name wrong, or maybe the names of people you know? You can soon teach the digital assistant how to pronounce them correctly. Google announced an update rolling out soon to Assistant, available on smartphones and Google Home speakers, that will allow users to teach it how to properly pronounce their names or those of their contacts. Google said the feature will initially be available in English but will roll out to offer more languages soon. "Names matter, and it's frustrating when you're trying to send a text or make a call and Google Assistant mispronounces or simply doesn't recognize a contact," said Yury Pinsky, director of product management at Google, in a blog post published Wednesday.