voiceover
Google's new AI video generator is more HR than Hollywood
For most of us, creating documents, spreadsheets and slide decks is an inescapable part of work life in 2024. What's not is creating videos. That's something Google would like to change. On Tuesday, the company announced Google Vids, a video creation app for work that the company says can make everyone a "great storyteller" using the power of AI. Vids uses Gemini, Google's latest AI model, to quickly create videos for the workplace.
AXNav: Replaying Accessibility Tests from Natural Language
Taeb, Maryam, Swearngin, Amanda, Schoop, Eldon, Cheng, Ruijia, Jiang, Yue, Nichols, Jeffrey
Developers and quality assurance testers often rely on manual testing to test accessibility features throughout the product lifecycle. Unfortunately, manual testing can be tedious, often has an overwhelming scope, and can be difficult to schedule amongst other development milestones. Recently, Large Language Models (LLMs) have been used for a variety of tasks including automation of UIs; however, to our knowledge, no one has yet explored their use in controlling assistive technologies for the purposes of supporting accessibility testing. In this paper, we explore the requirements of a natural language based accessibility testing workflow, starting with a formative study. From this we build a system that takes as input a manual accessibility test (e.g., "Search for a show in VoiceOver") and uses an LLM combined with pixel-based UI Understanding models to execute the test and produce a chaptered, navigable video. In each video, to help QA testers, we apply heuristics to detect and flag accessibility issues (e.g., text size not increasing with Large Text enabled, VoiceOver navigation loops). We evaluate this system through a 10-participant user study with accessibility QA professionals, who indicated that the tool would be very useful in their current work and performed tests similarly to how they would manually test the features. The study also reveals insights for future work on using LLMs for accessibility testing.
Your next car salesperson could be an AI bot selling vehicles in just 18 months as ChatGPT technology advances
The next time you buy a car, it might not be from your standard dealership - it could be from an AI bot. The prediction comes from Johan Sundstrand, the CEO of the Swedish video-tech company Phyron - he believes the change could happen as soon as 2025. He said: 'It's only a matter of time before artificial intelligence (AI) is selling cars as effectively as a human salesperson. The speed at which self-learning software is developing and being embraced by retailers means that a fully competent AI-powered sales bot is as close as 18 months away.' Phyron is a Swedish video-tech company that has been developing the world's first fully automated AI-enhanced video solution for the automotive industry. The unique AI software and its algorithms enable Phyron to create videos for car advertisements which can be used on brand or retailer websites, across social media channels and targeted email distribution.
Improving TTS for Shanghainese: Addressing Tone Sandhi via Word Segmentation
Tone is a crucial component of the prosody of Shanghainese, a Wu Chinese variety spoken primarily in urban Shanghai. Tone sandhi, which applies to all multi-syllabic words in Shanghainese, then, is key to natural-sounding speech. Unfortunately, recent work on Shanghainese TTS (text-to-speech) such as Apple's VoiceOver has shown poor performance with tone sandhi, especially LD (left-dominant sandhi). Here I show that word segmentation during text preprocessing can improve the quality of tone sandhi production in TTS models. Syllables within the same word are annotated with a special symbol, which serves as a proxy for prosodic information of the domain of LD. Contrary to the common practice of using prosodic annotation mainly for static pauses, this paper demonstrates that prosodic annotation can also be applied to dynamic tonal phenomena. I anticipate this project to be a starting point for bringing formal linguistic accounts of Shanghainese into computational projects. Too long have we been using the Mandarin models to approximate Shanghainese, but it is a different language with its own linguistic features, and its digitisation and revitalisation should be treated as such.
best-ai-video-generators
Video content is a must-have for businesses and content creators wanting to compete in this highly visual environment. Reports have shown that more than 80% of online traffic is video traffic, and an increasing number of people prefer it over other forms of online content like text and images. Most online publishers rely on social networks to reach audiences, and video content provides more organic reach than other types. At the same time, it has traditionally been both time-consuming and costly to produce and disseminate video content. Artificial intelligence (AI) is changing this outlook, making it easier than ever to generate video.
AI Voice Generator: Versatile Text to Speech Software
For years, creating good voiceovers meant investing hundreds if not thousands of dollars in hiring voice artists, renting a recording studio to get the script recorded, investing in expensive recording equipment (if you are recording from home), and recruiting or outsourcing the entire project to an audio editor to mix the audio and produce a high-quality voiceover. Not to mention the valuable hours dedicated to the entire process. Even after all this, the quality of the produced audio file may be subpar. What if there was an alternative for creating studio-quality voiceovers from the comfort of your own home? Introducing Murf AI voice generator, which eliminates the entire process of generating voiceovers manually and enables you to quickly produce human-like voiceovers without any specialized hardware or professionals. Leveraging advanced AI algorithms and deep learning, the realistic online voice generator tool allows you to convert text into natural-sounding speech in a matter of just a few minutes.
Verbyl - Text-to-Speech Converter
Since the dawn of humanity, people have gathered around the fire and listened to stories… Only in the last 100 years have we grown used to watching stories at the cinema, on TV and later on YouTube. VIDEOS without a good VOICEOVER will not convert, will not get you clicks, leads, traffic, or any sales! That's why a VIDEO is not efficient Without A GOOD VOICEOVER That Tells The Actual Story!
This realistic text-to-speech tool is 98% off today
Video and audio have become a necessity in our everyday lives, especially when it comes to marketing a product or brand. When you need to create video and audio content to promote your business, text-to-speech tools can be very useful. Unfortunately, most of these apps have really robotic voices. If you want something that sounds more natural, Speechnow is worth your attention. This AI-powered app lets you turn text into audio in seconds, with 800 different languages and realistic voices to choose from.
Making Mobile Applications Accessible with Machine Learning
At Apple we use machine learning to teach our products to understand the world more as humans do. Of course, understanding the world better means building great assistive experiences. Machine learning can help our products be intelligent and intuitive enough to improve the day-to-day experiences of people living with disabilities. We can build machine-learned features that support a wide range of users including those who are blind or have low vision, those who are deaf or are hard of hearing, those with physical motor limitations, and also support those with cognitive disabilities. Mobile devices and their apps have become ubiquitous.