gustafson
Wisconsin lawmakers weigh crackdowns on AI-generated political ads, child porn
Wisconsin lawmakers were set to vote Thursday on proposals to regulate artificial intelligence, joining a growing number of states grappling with how to control the technology as November's elections loom. The Assembly was scheduled to vote on a bipartisan measure to require political candidates and groups to include disclaimers in ads that use AI technology. Violators would face a $1,000 fine.
- North America > United States > Texas (0.07)
- North America > United States > New Hampshire (0.07)
- North America > Puerto Rico (0.07)
A Comparative Study of Self-Supervised Speech Representations in Read and Spontaneous TTS
Wang, Siyang, Henter, Gustav Eje, Gustafson, Joakim, Székely, Éva
Recent work has explored using self-supervised learning (SSL) speech representations such as wav2vec2.0 as the representation medium in standard two-stage TTS, in place of conventionally used mel-spectrograms. It is, however, unclear which speech SSL is the better fit for TTS, and whether or not the performance differs between read and spontaneous TTS, the latter of which is arguably more challenging. This study aims at addressing these questions by testing several speech SSLs, including different layers of the same SSL, in two-stage TTS on both read and spontaneous corpora, while maintaining constant TTS model architecture and training settings. Results from listening tests show that the 9th layer of 12-layer wav2vec2.0 (ASR finetuned) outperforms other tested SSLs and mel-spectrogram, in both read and spontaneous TTS. Our work sheds light on both how speech SSL can readily improve current TTS systems, and how SSLs compare in the challenging generative task of TTS. Audio examples can be found at https://www.speech.kth.se/tts-demos/ssr_tts
Prosody-controllable spontaneous TTS with neural HMMs
Lameris, Harm, Mehta, Shivam, Henter, Gustav Eje, Gustafson, Joakim, Székely, Éva
Spontaneous speech has many affective and pragmatic functions that are interesting and challenging to model in TTS. However, the presence of reduced articulation, fillers, repetitions, and other disfluencies in spontaneous speech makes the text and acoustics less aligned than in read speech, which is problematic for attention-based TTS. We propose a TTS architecture that can rapidly learn to speak from small and irregular datasets, while also reproducing the diversity of expressive phenomena present in spontaneous speech. Specifically, we add utterance-level prosody control to an existing neural HMM-based TTS system that is capable of stable, monotonic alignments for spontaneous speech. We objectively evaluate control accuracy and perform perceptual tests that demonstrate that prosody control does not degrade synthesis quality. To exemplify the power of combining prosody control and ecologically valid data for reproducing intricate spontaneous speech phenomena, we evaluate the system's capability of synthesizing two types of creaky voice. Audio samples are available at https://www.speech.kth.se/tts-demos/prosodic-hmm/
Jingle Jangle Is the Holiday Hallucination This Season Demands
In a year where Radio City is shuttered by the pandemic, Netflix's Jingle Jangle: A Christmas Journey is the closest thing we have to a big Christmas spectacular. It's a cross between The Greatest Showman and Cats, bundled in shiny Christmas wrapping paper, with none other than John Legend as one of the minds behind the many songs packed into its two hours. At the center of that festive mishmash is an inventor named Jeronicus Jangle, a sentient doll, and a robot that, like Tinkerbell, is powered by belief. However, unlike Cats, David E. Talbert's movie is a coherent, compelling story that doesn't require booze or any other form of pre-gaming to be fully enjoyed. It's the filmic equivalent of the high that comes from eating way too many candy canes and drinking way too much hot chocolate.
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Media > Television (0.73)
Three Key Success Factors for Achieving Maximum Business Value with AI
Artificial intelligence (AI) is already making a tremendous impact in the world. Some of that impact is indirect and comes from the anticipation of things that may sound like science fiction now, which has led to massive investments in ideas and people. Much of the impact is direct and comes from applying existing AI capabilities to current processes to improve customer satisfaction, decision making, and the productivity of people and supply chains. In both cases, there is still a lot of confusion about what AI is and what it takes to use it. Firstly, to leverage AI successfully, you need to be able to measure the system or process you hope to improve or create with AI. Without good measurement, it will be difficult to understand the return on investment (ROI) from AI, and it is possible that your solution is not as good as it could be because that measurement is not driving improvements in the AI technology.
Consider Indirect Threats of AI, Too
Alan Bundy's Viewpoint "Smart Machines Are Not a Threat to Humanity" (Feb. Reducing the entire field of AI to four "successful AI systems"--DeepBlue, Tartan Racing, Watson, and AlphaGo--does not give the full picture of the impact of AI on humanity. Recent advances in pattern recognition for computer vision and speech recognition, due mainly to deep learning, have achieved benchmarks comparable to human performance [2]; consider that AI technologies power surveillance systems, as well as Apple's Siri and Amazon's Echo personal assistants. Looking at such AI algorithms, one can imagine AI general intelligence being possible throughout our communication networks, computer interfaces, and tens of millions of Internet of Things devices in the near future. Toward this end, DeepMind Technologies Ltd. (acquired by Google in 2014) created a game-playing program combining deep learning and reinforcement learning that sees the board, as well as moves the pieces on the board [1]. Recent advances in generative adversarial learning will reduce reliance on labeled data (and the humans who do the labeling), moving toward machine-learning software capable of self-improvement.
- North America > United States > California > San Francisco County > San Francisco (0.15)
- North America > United States > New York (0.05)
- North America > United States > Virginia > Alexandria County > Alexandria (0.05)
- North America > United States > Rhode Island > Providence County > Providence (0.05)
- Leisure & Entertainment > Games (0.55)
- Information Technology > Security & Privacy (0.49)
- Information Technology > Smart Houses & Appliances (0.35)
Profile of a Winner: Kansas State University
Second, team's software was able to find, recognize, Because the camera and the arm are on about 200 pounds. An edge-detection algorithm equipped with 2 rings of 16 sonar sensors. The camera was calibrated system is used on board the robot. Positioned to Pick Up every time it was moved. When the robot was trying edge in 3D space relative to the robot.
The Find-the-Remote Event
In real life, such functions might be useful for in-home care of the [...] The rules specified a fixed course and a fixed set of objects that would populate it. The course consisted of typical household furniture and Lexan partitions arranged to produce a simplified [...] The objects were typical household objects, such as a television remote, a pill bottle, and fruits and vegetables.
[...] range of objects along with the perceptual capabilities required to support it. This event was extremely difficult because it forced teams to implement both manipulation (the grasping and moving of objects) and visual object recognition. [...] required teams to implement them for a wide range of objects. It therefore eliminated a [...]
- North America > United States > Kansas (0.06)
- North America > United States > Rhode Island (0.04)
- North America > United States > New York (0.04)
- North America > United States > Massachusetts (0.04)