ume
Unifying Diarization, Separation, and ASR with Multi-Speaker Encoder
Shakeel, Muhammad, Sudo, Yui, Peng, Yifan, Lin, Chyi-Jiunn, Watanabe, Shinji
This paper presents a unified multi-speaker encoder (UME), a novel architecture that jointly learns representations for speaker diarization (SD), speech separation (SS), and multi-speaker automatic speech recognition (ASR) tasks using a shared speech foundational encoder. We leverage the hidden representations from multiple layers of UME as a residual weighted-sum encoding (RWSE) to effectively use information from different semantic levels, contributing to bottom-up alignment between tasks. Our evaluations demonstrate that UME substantially improves over the single-task baselines dedicated to SD, SS, and multi-speaker ASR on LibriMix evaluation sets. Notably, for SD, UME outperforms the previous studies, achieving diarization error rates of 1.37% and 2.29% on Libri2Mix and Libri3Mix evaluation sets, respectively. Speaker diarization (SD), speech separation (SS), and multi-speaker automatic speech recognition (ASR) are tasks of great importance that aim to comprehend and answer the question "who spoke what and when," with applications to transcribing meetings and interviews, among others.
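The abstract does not spell out how RWSE is computed, so the following is only a minimal sketch of the general idea it builds on: combining hidden representations from several encoder layers with learnable, softmax-normalized weights. The function names (`softmax`, `weighted_layer_sum`) and the nested-list data layout are assumptions made for illustration, and the residual component of RWSE is deliberately left out because the abstract does not describe it.

```python
import math


def softmax(scores):
    """Normalize raw per-layer scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_sum(layer_outputs, layer_scores):
    """Mix per-layer hidden states into one sequence of frame vectors.

    layer_outputs: list of layers; each layer is a list of frames;
                   each frame is a list of floats (hypothetical layout).
    layer_scores:  one raw score per layer (learnable in a real model).
    """
    if len(layer_outputs) != len(layer_scores):
        raise ValueError("need exactly one score per layer")

    weights = softmax(layer_scores)
    num_frames = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])

    mixed = [[0.0] * dim for _ in range(num_frames)]
    for weight, layer in zip(weights, layer_outputs):
        for t in range(num_frames):
            for d in range(dim):
                mixed[t][d] += weight * layer[t][d]
    return mixed


if __name__ == "__main__":
    # Two layers, one frame, two feature dimensions; equal scores
    # give each layer the same weight.
    layers = [[[1.0, 2.0]], [[3.0, 4.0]]]
    print(weighted_layer_sum(layers, [0.0, 0.0]))
```

In a trained system the per-layer scores would be parameters updated by gradient descent, which lets each downstream task emphasize the layers most useful to it.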
AI Deepfakes and the Future of Truth
When several life-like Tom Cruise deepfakes went viral on TikTok, many saw the future of truth through a glass, darkly -- out of concern for a world where acquiring deepfakes of major celebrities or political figures would become a "one-click" feature of daily life. Like it or not, we live in a world where anyone can interact with deepfake technology. But curating high-end specialized AI drivers -- whether for mischief or raising awareness -- is harder than it looks. The creator of the video -- a Belgian VFX specialist named Chris Ume -- thinks this is unlikely, emphasizing the impractically long timespans and substantial effort required to build every deepfake, in addition to finding an ace Tom Cruise impersonator (Miles Fisher). "You can't do it by just pressing a button," said Ume in a report from The Verge.
Slick Tom Cruise Deepfakes Signal That Near Flawless Forgeries May Be Here
Hany Farid, a digital forensics expert at UC Berkeley, says the dangers in sophisticated phony videos called "deepfakes" are amplified in their potential to travel rapidly across social media. The videos, uploaded to TikTok in recent weeks by the account @deeptomcruise, have raised new fears over the proliferation of convincing deepfakes -- the nickname for media generated by artificial intelligence technology showing phony events that often seem realistic enough to dupe an audience. Hany Farid, a professor at the University of California, Berkeley, told NPR's All Things Considered that the Cruise videos demonstrate a step up in the technology's evolving sophistication. "This is clearly a new category of deepfake that we have not seen before," said Farid, who researches digital forensics and misinformation.
Be (Very) Worried about the Tom Cruise Deepfakes - Shelly Palmer
After creating a convincing viral series of Tom Cruise deepfakes on TikTok, VFX specialist Chris Ume told The Verge, "You can't do it by just pressing a button. That's important, that's a message I want to tell people." He went on to say that each clip took weeks of work using the open-source DeepFaceLab algorithm as well as established video editing tools. The key takeaway, and the title of the article, was Tom Cruise deepfake creator says public shouldn't be worried about 'one-click fakes'. You should be extremely worried about deepfakes, the technology that empowers their creation, and the exponential speed of innovation.
'I don't want to upset people': Tom Cruise deepfake creator speaks out
Joining TikTok has become something of a trend for Hollywood celebrities stuck at home like everyone else. So it wasn't necessarily surprising to see Tom Cruise on the app, sharing videos of himself playing golf and pratfalling around the house. But the strange thing is that Cruise never actually made the videos. And the account that posted them, DeepTomCruise, wore that on its sleeve: it was openly the work of a talented creator of "deepfakes", AI-generated video clips that use a variety of techniques to create situations that have never happened in the real world. Despite being open about its falseness, the account's videos are so realistic that they still prompted wild speculation.