DiM-Gestor: Co-Speech Gesture Generation with Adaptive Layer Normalization Mamba-2
Zhang, Fan, Zhao, Siyuan, Ji, Naye, Wang, Zhaohan, Wu, Jingmei, Gao, Fuxing, Ye, Zhenqing, Yan, Leyao, Dai, Lanxin, Geng, Weidong, Lyu, Xin, Zhao, Bozuo, Yu, Dingguo, Du, Hui, Hu, Bin
Speech-driven gesture generation using transformer-based generative models represents a rapidly advancing area within virtual human creation. However, existing models face significant challenges due to their quadratic time and space complexities, limiting scalability and efficiency. To address these limitations, we introduce DiM-Gestor, an innovative end-to-end generative model leveraging the Mamba-2 architecture. DiM-Gestor features a dual-component framework: (1) a fuzzy feature extractor and (2) a speech-to-gesture mapping module, both built on Mamba-2. The fuzzy feature extractor, integrating a Chinese pre-trained speech model with Mamba-2, autonomously extracts implicit, continuous speech features. These features are synthesized into a unified latent representation and then processed by the speech-to-gesture mapping module. This module employs an Adaptive Layer Normalization (AdaLN)-enhanced Mamba-2 mechanism to apply the same conditioning transformation uniformly across all sequence tokens, enabling precise modeling of the nuanced interplay between speech features and gesture dynamics. We use a diffusion model to train the mapping and to sample diverse gesture outputs at inference. Extensive subjective and objective evaluations on the newly released Chinese Co-Speech Gestures (CCG) dataset corroborate the efficacy of the proposed model. Compared with Transformer-based architectures, our approach delivers competitive results while reducing memory usage by approximately 2.4 times and increasing inference speed by 2 to 4 times. Additionally, we release the CCG dataset, comprising 15.97 hours (six styles across five scenarios) of 3D full-body skeletal gesture motion performed by professional Chinese TV broadcasters.
- Asia > China > Zhejiang Province (0.04)
- Asia > Macao (0.04)
- Europe > Norway > Western Norway > Vestland > Bergen (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
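The AdaLN mechanism described in the abstract above, a single scale-and-shift pair derived from the conditioning signal and applied uniformly to every sequence token, can be sketched in a few lines. This is an illustrative NumPy sketch under assumed names and shapes (`adaln`, `W_mod`, `b_mod` are placeholders), not the authors' implementation, and it omits the Mamba-2 state-space block that the modulation wraps:

```python
import numpy as np

def adaln(h, cond, W_mod, b_mod, eps=1e-5):
    """Adaptive Layer Normalization sketch.

    h:    (seq_len, d) token hidden states
    cond: (d_c,) conditioning vector (e.g. pooled speech features)
    W_mod, b_mod: a linear layer mapping cond -> 2*d (scale and shift),
                  applied identically to all tokens.
    """
    d = h.shape[-1]
    # Per-token layer norm without a learned affine; modulation replaces it.
    mu = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    h_norm = (h - mu) / np.sqrt(var + eps)
    # One (scale, shift) pair from the condition, shared by every token.
    mod = cond @ W_mod + b_mod          # (2*d,)
    scale, shift = mod[:d], mod[d:]
    return h_norm * (1.0 + scale) + shift

rng = np.random.default_rng(0)
seq_len, d, d_c = 4, 8, 6
h = rng.normal(size=(seq_len, d))
cond = rng.normal(size=(d_c,))
W_mod = rng.normal(size=(d_c, 2 * d)) * 0.1
b_mod = np.zeros(2 * d)
out = adaln(h, cond, W_mod, b_mod)
print(out.shape)  # (4, 8)
```

With an all-zero modulation layer the function reduces to plain layer normalization, which makes the role of the conditioning signal easy to isolate.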
Analyzing Transformers in Embedding Space
Dar, Guy, Geva, Mor, Gupta, Ankit, Berant, Jonathan
Understanding Transformer-based models has attracted significant attention, as they lie at the heart of recent technological advances across machine learning. While most interpretability methods rely on running models over inputs, recent work has shown that a zero-pass approach, where parameters are interpreted directly without a forward/backward pass, is feasible for some Transformer parameters and for two-layer attention networks. In this work, we present a theoretical analysis in which all parameters of a trained Transformer are interpreted by projecting them into the embedding space, that is, the space of vocabulary items they operate on. We derive a simple theoretical framework to support our arguments and provide ample evidence for its validity. First, we present an empirical analysis showing that parameters of both pretrained and fine-tuned models can be interpreted in embedding space. Second, we present two applications of our framework: (a) aligning the parameters of different models that share a vocabulary, and (b) constructing a classifier without training by "translating" the parameters of a fine-tuned classifier into parameters of a different model that was only pretrained. Overall, our findings open the door to interpretation methods that, at least in part, abstract away from model specifics and operate in the embedding space only.
- Oceania > Australia > Australian Capital Territory > Canberra (0.05)
- Asia > India > Maharashtra > Mumbai (0.05)
- North America > Canada > Manitoba > Winnipeg Metropolitan Region > Winnipeg (0.05)
- (47 more...)
- Media > Film (1.00)
- Government (1.00)
- Health & Medicine (0.93)
- (2 more...)
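The zero-pass interpretation described in the abstract above boils down to a single operation: multiply a parameter vector by the embedding matrix and read off the highest-scoring vocabulary items. A minimal sketch, with a toy vocabulary and random weights standing in for a real pretrained model (`project_to_vocab` and all shapes are illustrative assumptions):

```python
import numpy as np

# Toy stand-ins for a real model: a 10-word vocabulary and d=16 embeddings.
vocab = ["the", "cat", "dog", "sat", "ran", "on", "mat", "fast", "slow", "big"]
rng = np.random.default_rng(1)
d = 16
E = rng.normal(size=(len(vocab), d))    # embedding matrix (vocab_size, d)

def project_to_vocab(param_vec, E, vocab, k=3):
    """Interpret a parameter vector by projecting it into embedding space:
    score every vocabulary item by its dot product with the vector and
    return the top-k items (no forward or backward pass required)."""
    scores = E @ param_vec               # (vocab_size,)
    top = np.argsort(scores)[::-1][:k]
    return [(vocab[i], float(scores[i])) for i in top]

# e.g. a feed-forward "value" vector from some layer (random here)
w_value = rng.normal(size=(d,))
print(project_to_vocab(w_value, E, vocab))
```

In practice `E` would be the model's actual token-embedding matrix and `param_vec` a row of, say, a feed-forward output matrix; the returned tokens then suggest which vocabulary items that parameter promotes.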
Wimbledon's AI Announcer Was Inevitable
The Wimbledon announcer sounds a little like Helen Mirren if she'd just been hit with a polo mallet. I'm watching match highlights between Ons Jabeur and Magdalena Fręch on the tournament's website when a voice says, "Jabeur, from Tunisia, will play Fręch, from Poland, on the renowned No. 1 court in the first round." Fręch is mispronounced, as is Tunisia, and the word renowned is delivered oddly dispassionately, as if it were being repeated for a competitor at a spelling bee. This is a commentary chatbot, introduced with considerable fanfare at the All England Club this year. Another version, a "male" voice, sounds like your uncle from Queens trying to do a Hugh Grant impression.
- Europe > United Kingdom > England > Greater London > London > Wimbledon (0.66)
- Africa > Middle East > Tunisia (0.45)
- Europe > Poland (0.25)
- (2 more...)
- Leisure & Entertainment > Sports > Tennis (1.00)
- Leisure & Entertainment > Sports > Football (0.71)
World's first AI news anchor unveiled in China
China's state news agency Xinhua this week introduced the newest members of its newsroom: AI anchors who will report "tirelessly" all day every day, from anywhere in the country. Chinese viewers were greeted with a digital version of a regular Xinhua news anchor named Qiu Hao. The anchor, wearing a red tie and pin-striped suit, nods his head in emphasis, blinking and raising his eyebrows slightly. "Not only can I accompany you 24 hours a day, 365 days a year. I can be endlessly copied and present at different scenes to bring you the news," he says.
John Madden returning to cover of Madden NFL 23 video game
For the first time in two decades, late football legend John Madden will grace the cover of a Madden NFL video game. EA Sports on Wednesday announced that the Hall of Fame coach, who died in December, will appear on the cover of all three editions of this year's Madden NFL 23 video game. The covers will include him in different parts of his life, including as a coach and as a broadcaster.
- North America > United States > Minnesota (0.07)
- North America > United States > Michigan > Wayne County > Detroit (0.05)
- North America > United States > California > Los Angeles County > Pasadena (0.05)
- Leisure & Entertainment > Sports > Football (1.00)
- Leisure & Entertainment > Games > Computer Games (1.00)
The Future of Robot Nannies
Childcare is the most intimate of activities. Evolution has generated drives so powerful that we will risk our lives to protect not only our own children, but quite often any child, and even the young of other species. Robots, by contrast, are products created by commercial entities with commercial goals, which may--and should--include the well-being of their customers, but will never be limited to such. Robots, corporations, and other legal or non-legal entities do not possess the instinctual nature of humans to care for the young--even if our anthropomorphic tendencies may prompt some children and adults to overlook this fact.
Five Leading Sports Analytics Software Programs
Sports analytics is one of the biggest sectors booming in the sports world. Though sports have captured the attention of the public and investors alike, analytics is a behind-the-scenes industry that combines the latest in machine-learning algorithms and data crunching. Some programs, like those that rely on AI, are designed to make predictions by studying huge amounts of historical data. Others, such as analytics software, are designed to make immediate conclusions from live data points. Teams rely on sports analytics to make leaner decisions related to recruitment, training regimens, and more.
'Deepfake' Queen delivers alternative Christmas speech in warning about misinformation
London (CNN) A fake Queen Elizabeth danced across TV screens on Christmas as part of a "deepfake" speech aired by a British broadcaster. The real British monarch traditionally delivers a Christmas Day speech aired around the world. But her speech on Friday at 3 p.m. was followed by a digitally-created fake of the Queen, aired on Channel 4 and voiced by an actor, warning viewers to question "whether what we see and hear is always what it seems." Channel 4 said the video was created as "a stark warning about the advanced technology that is enabling the proliferation of misinformation and fake news in a digital age."
- Europe > United Kingdom > England > Greater London > London (0.26)
- Asia > Middle East > Iran (0.18)
- Media > News (0.83)
- Information Technology > Security & Privacy (0.74)
- Government > Regional Government > Europe Government > United Kingdom Government (0.43)
Deepfake Queen Elizabeth II will deliver 'alternative' Christmas message
Just about every year since 1952, Queen Elizabeth II of the United Kingdom has delivered a Christmas address to the masses, and 2020 will be no different. Shortly after she gives her remarks, however, British broadcaster Channel 4 will air an "alternative message" from the Queen, brought to life by deepfake software and an actress with a pseudo-regal affect. "On the BBC, I haven't always been able to speak plainly and from the heart," the "Queen" said in a promo posted to the broadcaster's Twitter. "So I'm grateful to Channel 4 for giving me the opportunity to say whatever I like without anyone putting words in my mouth." There's relatively little risk that anyone would look at Channel 4's deepfake and regard it as a genuine message from the Queen.
AI, Cloud Aim to Enhance the U.S. Open Fan Experience
The decision to hold the tournament without fans led to sweeping changes to almost every aspect of the competition, from playing matches with electronic line calling to having athletes use food-ordering apps for meal deliveries to their hospitality suites at the Billie Jean King National Tennis Center. However, the absence of fans immediately presented a problem for some of the USTA's recent AI projects. Last year, for instance, the USTA worked with International Business Machines Corp. to introduce a number of AI-powered additions to the tournament, including machine learning algorithms that rapidly compose broadcast highlight reels based on crowd reaction. "June 17th was really a pivotal moment, a lot of the solutions that we had in the pipeline were no longer going to be viable," said Kristi Kolski, marketing program director for IBM's sports and entertainment partnerships unit. "No crowd, no roar, no AI highlights."
- Leisure & Entertainment > Sports (1.00)
- Information Technology (1.00)