clair
The mother of one of Elon Musk's children is suing xAI over nonconsensual deepfake images
As the standalone Grok app continues to produce sexualized images of real people, Apple and Google continue to host it in their stores. US author Ashley St. Clair delivers a speech during a convention of the European Parliament leaders of the Identity and Democracy group (ID) on December 3, 2023. Although X removed Grok's ability to create nonconsensual digitally undressed images on the social platform, the standalone Grok app is another story. It reportedly continues to produce "nudified" deepfakes of real people. And now, Ashley St. Clair, a conservative political strategist and mother of one of Elon Musk's 14 children, has sued xAI over nonconsensual sexualized images of her that Grok allegedly produced.
- Europe > United Kingdom (0.16)
- North America > United States > California (0.05)
- Asia > Malaysia (0.05)
- Asia > Indonesia (0.05)
- Information Technology > Security & Privacy (0.67)
- Law > Litigation (0.41)
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions
Wu, Tsung-Han, Gonzalez, Joseph E., Darrell, Trevor, Chan, David M.
The Automated Audio Captioning (AAC) task asks models to generate natural language descriptions of an audio input. Evaluating these machine-generated audio captions is a complex task that requires considering diverse factors, among them, auditory scene understanding, sound-object inference, temporal coherence, and the environmental context of the scene. While current methods focus on specific aspects, they often fail to provide an overall score that aligns well with human judgment. In this work, we propose CLAIR-A, a simple and flexible method that leverages the zero-shot capabilities of large language models (LLMs) to evaluate candidate audio captions by directly asking LLMs for a semantic distance score. In our evaluations, CLAIR-A better predicts human judgements of quality compared to traditional metrics, with a 5.8% relative accuracy improvement compared to the domain-specific FENSE metric and up to 11% over the best general-purpose measure on the Clotho-Eval dataset. Moreover, CLAIR-A offers more transparency by allowing the language model to explain the reasoning behind its scores, with these explanations rated up to 30% better by human evaluators than those provided by baseline methods. CLAIR-A is made publicly available at https://github.com/DavidMChan/clair-a.
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
D'Oosterlinck, Karel, Xu, Winnie, Develder, Chris, Demeester, Thomas, Singh, Amanpreet, Potts, Christopher, Kiela, Douwe, Mehri, Shikib
Large Language Models (LLMs) are often aligned using contrastive alignment objectives and preference pair datasets. The interaction between model, paired data, and objective makes alignment a complicated procedure, sometimes producing subpar results. We study this and find that (i) preference data gives a better learning signal when the underlying responses are contrastive, and (ii) alignment objectives lead to better performance when they specify more control over the model during training. Based on these insights, we introduce Contrastive Learning from AI Revisions (CLAIR), a data-creation method which leads to more contrastive preference pairs, and Anchored Preference Optimization (APO), a controllable and more stable alignment objective. We align Llama-3-8B-Instruct using various comparable datasets and alignment objectives and measure MixEval-Hard scores, which correlate highly with human judgments. The CLAIR preferences lead to the strongest performance out of all datasets, and APO consistently outperforms less controllable objectives. Our best model, trained on 32K CLAIR preferences with APO, improves Llama-3-8B-Instruct by 7.65%, closing the gap with GPT4-turbo by 45%. Our code is available at https://github.com/ContextualAI/CLAIR_and_APO.
CLAIR: Evaluating Image Captions with Large Language Models
Chan, David, Petryk, Suzanne, Gonzalez, Joseph E., Darrell, Trevor, Canny, John
The evaluation of machine-generated image captions poses an interesting yet persistent challenge. Effective evaluation measures must consider numerous dimensions of similarity, including semantic relevance, visual structure, object interactions, caption diversity, and specificity. Existing highly-engineered measures attempt to capture specific aspects, but fall short in providing a holistic score that aligns closely with human judgments. Here, we propose CLAIR, a novel method that leverages the zero-shot language modeling capabilities of large language models (LLMs) to evaluate candidate captions. In our evaluations, CLAIR demonstrates a stronger correlation with human judgments of caption quality compared to existing measures. Notably, on Flickr8K-Expert, CLAIR achieves relative correlation improvements over SPICE of 39.6% and over image-augmented methods such as RefCLIP-S of 18.3%. Moreover, CLAIR provides noisily interpretable results by allowing the language model to identify the underlying reasoning behind its assigned score. Code is available at https://davidmchan.github.io/clair/
How A.I. 'hallucinations' helped make Westworld's main titles
Do androids dream of electric sheep? Sure, if the sheep graze in the Westworld theme park. This season, the HBO show's main titles designer, Patrick Clair, recruited an A.I. expert, A.I. Fiction's Dr. Pinar Yanardag, to connect him with that dreaming android, in this case, with a generative adversarial network (GAN). Yanardag and her team had used neural networks for all sorts of purposes -- to create nightmares, horror, graffiti, music, fashion, perfume, cocktails, pizza, and even chocolate. "My mind was blown by the kinds of things she's doing, combining creativity with A.I.," Clair told SYFY WIRE, so he thought, why not let a neural network try television next?
- Leisure & Entertainment (1.00)
- Media > Television (0.69)
Explainable Artificial Intelligence
In the 1980s, Bloom County was arguably the most popular comic strip on the planet. In one sequence, the penguin Opus is running for elected office, and the local computer nerd, Oliver Wendell Jones, uses AI to analyze polling data and determines that the ideal image that voters want in a candidate is "chocolate éclair." The root problem here is that of AI explainability. Had Opus demanded an explanation on why chocolate éclair was the suggestion of the AI, the error would have been discovered. But Opus didn't, for he trusted the system, and the system was wrong.
'America's Got Talent' Season 12: Sara & Hero Reveal New Skills
The "America's Got Talent" Season 12 contestants have certainly learned a thing or two since joining the hit NBC reality TV competition. A few days before the semifinals, some of the contestants shared one important lesson learned from their fellow contestants on the show. Dog act Sara & Hero has wowed the crowd since the judge cuts round. Despite not receiving rave reviews during their first audition, Sara & Hero have made it to the semifinals with flying colors. In the clip released by the network, Sara jokingly says that she and her dog, Hero, learned how to become escape artists from Demian Aditya.
- Media > Television (1.00)
- Leisure & Entertainment (1.00)
'Westworld' Spoilers: Show Designer Explains The Opening Title Sequence
The opening title sequence for HBO's new sci-fi series "Westworld" is both mesmerizing and haunting, and it will definitely draw in audiences to the production of robotic hosts. Show designer Patrick Clair told Vulture that opening title sequences are distinct for each show. For "Westworld," he wanted to take an explicit approach and condense the show's own design elements. "By the time I came on the show, they had already created the most beautiful and poetic version of creating robots that I could imagine," he said. "I could have abstracted that, but when I looked at the hosts inside the show, and the beautiful white translucent liquid, I thought the process itself seemed very poetic as it was, so instead of trying to represent that in an abstract way, I wanted to use the same design elements, the same robot arms, the same way the people turn in the circle."
Building Blocks of Social Intelligence: Enabling Autonomy for Socially Intelligent and Assistive Robots
Mead, Ross Alan (University of Southern California) | Atrash, Amin (University of Southern California) | Kaszubski, Edward (University of Southern California) | St. Clair, Aaron (University of Southern California) | Greczek, Jillian (University of Southern California) | Clabaugh, Caitlyn (University of Southern California) | Kohan, Brian (University of Southern California) | Mataric, Maja J. (University of Southern California)
We present an overview of the control, recognition, decision-making, and learning techniques utilized by the Interaction Lab (robotics.usc.edu/interaction) at the University of Southern California (USC) to enable autonomy in sociable and socially assistive robots. These techniques are implemented with two software libraries: 1) the Social Behavior Library (SBL) provides autonomous social behavior ...
Vocalics is the study of the nonverbal aspects of speech, such as volume, pitch, and rate. Our contribution is a parametric vocalic behavior controller that autonomously adjusts the robot speaker volume based on models of how a human user will hear speech produced by the robot. These models vary with distance, orientation, and perceived environmental interference (Mead & Matarić 2014). Our future work will investigate adapting the pitch and rate of speech produced by a robot to improve user speech perception.
- North America > United States > California > Santa Clara County > Stanford (0.05)
- North America > United States > Washington > Whatcom County > Bellingham (0.05)
- North America > United States > Oklahoma > Cleveland County > Norman (0.05)
- Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.72)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)