TL;DR: A lifetime subscription to TexTalky AI Text-to-Speech is on sale for £28.08, saving you 93% on list price. From marketing content and video narration to customer support and tutorials, there are many instances in today's marketplace when a professional human voice is needed. But due to time constraints, a lack of proper recording equipment, or simply the fact that you hate your voice, you may turn to text-to-speech software. The robotic voices from these apps can leave a lot to be desired. TexTalky AI Text-to-Speech aims to convert your text to lifelike human voices in just a few seconds.
TL;DR: As of Feb. 11, you can slash 37% off this NEWYES Scan Reader Pen 3 Text-to-Speech OCR Multilingual Translator and get it for $124.99 instead of $199. If you study a second language, take lots of notes for work or school, struggle with written text, or just want an easier way to get through the stack of books on your nightstand, there are tools that can help you out. One that's making its mark -- and happens to be on sale -- is the NEWYES Scan Reader Pen 3. The NEWYES Scan opens up new possibilities for learning. You can use it to read and retain information, translate words and phrases, look up words on the spot, capture quotes and transfer them to your computer, or even record audio to review later. This text-to-speech reader pen recognizes 3,000 characters per minute and translates in 0.3 seconds with 98 percent accuracy.
In cross-lingual speech synthesis, the speech in various languages can be synthesized for a monoglot speaker. Normally, only the data of monoglot speakers are available for model training, thus the speaker similarity is relatively low between the synthesized cross-lingual speech and the native language recordings. Based on the multilingual transformer text-to-speech model, this paper studies a multi-task learning framework to improve the cross-lingual speaker similarity. To further improve the speaker similarity, joint training with a speaker classifier is proposed. Here, a scheme similar to parallel scheduled sampling is proposed to train the transformer model efficiently to avoid breaking the parallel training mechanism when introducing joint training. By using multi-task learning and speaker classifier joint training, in subjective and objective evaluations, the cross-lingual speaker similarity can be consistently improved for both the seen and unseen speakers in the training set.
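The joint training described above can be pictured as one objective that adds a speaker-classification term to the synthesis loss. The sketch below is only an illustration of that general idea: the abstract does not give the loss form, so the L1 reconstruction term, the cross-entropy speaker term, the `speaker_weight` value, and all function names are assumptions, not the paper's method.

```python
import math

def cross_entropy(logits, target_index):
    # Numerically stable -log softmax(logits)[target_index].
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[target_index]

def joint_loss(predicted_frames, target_frames, speaker_logits, speaker_id,
               speaker_weight=0.1):
    # Hypothetical synthesis term: mean absolute error over acoustic frames.
    l1 = sum(abs(p - t) for p, t in zip(predicted_frames, target_frames))
    l1 /= len(target_frames)
    # Hypothetical auxiliary term: the speaker classifier should recover
    # the identity of the speaker the model was asked to synthesize.
    speaker_term = cross_entropy(speaker_logits, speaker_id)
    return l1 + speaker_weight * speaker_term
```

In this toy form, a larger `speaker_weight` pushes training harder toward outputs the classifier attributes to the intended speaker, at the cost of weighting reconstruction error less.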
OrCam MyEye PRO is a wearable assistive technology device for people who are blind, visually impaired or have reading challenges. It's lightweight, finger-size and magnetically mounts on eyeglass frames. The device instantly reads aloud any printed text (books, menus, signs) and digital screens (computer, smartphone), recognizes faces, and identifies products/bar codes, money notes and colors – all in real time and offline. The interactive Smart Reading feature enables users to tailor their assistive reading experience, and Orientation assists with guidance and identification of objects. Newly released "Hey OrCam" enables control of all device features and settings hands-free, using voice commands.
Neural text-to-speech (TTS) models are successfully used to generate high-quality, human-like speech. However, most TTS models can be trained only if transcribed data of the desired speaker is available. That means that long-form untranscribed data, such as podcasts, cannot be used to train existing models. A recent paper on arXiv proposes an unconditional diffusion-based generative model. It is trained on untranscribed data and leverages a phoneme classifier for text-to-speech synthesis.
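One common way to steer an unconditional generative model with a separately trained classifier is classifier guidance: the classifier's gradient is added to the model's score at each sampling step. The summary above does not say whether the paper does exactly this, so the following is a one-dimensional toy sketch of the general technique. The standard-normal prior, the sigmoid classifier, and all names and parameters are assumptions made for illustration.

```python
import math

def unconditional_score(x):
    # Score of a standard normal: d/dx log N(x; 0, 1) = -x.
    return -x

def classifier_log_prob_grad(x, w=2.0):
    # Toy classifier p(y=1 | x) = sigmoid(w * x).
    # d/dx log sigmoid(w * x) = w * (1 - sigmoid(w * x)).
    s = 1.0 / (1.0 + math.exp(-w * x))
    return w * (1.0 - s)

def guided_score(x, scale=1.0):
    # Guided score = unconditional score + scale * classifier gradient.
    return unconditional_score(x) + scale * classifier_log_prob_grad(x)
```

With `scale=0.0` the sampler follows the unconditional model alone. Raising `scale` pulls samples toward regions the classifier assigns to the target class, which in a TTS setting would correspond to the desired phoneme sequence.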
A text-to-speech TikTok voice made by Disney that made users sound like Rocket Raccoon did not allow users to 'say' words like "gay", "lesbian", or "queer". Numerous posts by users showed the feature failing to say the LGBTQ terms before it was quietly changed to allow the words. Words like "bisexual" and "transgender" were allowed by the feature. Originally, Rocket's voice would skip over the words when written normally but would pronounce them phonetically if a user wrote "qweer", for example. Attempts to make it read text that contained only the seemingly prohibited words resulted in an error message saying that text-to-speech was not supported for the language chosen.