 Media


Billy Joel Is One of History's Most Popular Musicians. So Why Do So Many of Us Hate Him?

Slate

I've long believed that the first hugely popular music you realize you hate is in many ways as important a discovery as the first music you realize you love. There's something crucial and formative about the recognition that an artist whose music is beloved by millions makes your skin crawl, not simply in the realization that said music "isn't for you," but in the fierce and irrational conviction that those millions of people are wrong, that sometimes art that's enormously successful is not, in fact, correspondingly good. As misanthropic as that sounds, it's a significant milestone in coming to learn that everyone's taste is (or at least should be) individuated and distinct, and that those distinct tastes are a large part of what makes people attractive, maddening, and above all else interesting to one another. I don't remember exactly when I discovered I hated Billy Joel's music, but it was sometime in middle school, when as a relatively proficient young piano player, I was asked, for the 10th or 100th time, to play "Piano Man." At that age I only vaguely knew the song and hadn't learned how to play it, and for reasons I probably couldn't have articulated, I promptly resolved that I never would.


Netflix uses generative AI in one of its shows for first time

The Guardian

Netflix has used artificial intelligence in one of its TV shows for the first time, in a move the streaming company's boss said would make films and programmes cheaper and of better quality. Ted Sarandos, a co-chief executive of Netflix, said the Argentinian science fiction series El Eternauta (The Eternaut) was the first it had made that involved using generative AI footage. "We remain convinced that AI represents an incredible opportunity to help creators make films and series better, not just cheaper," he told analysts on Thursday after Netflix reported its second-quarter results. He said the series, which follows survivors of a rapid and devastating toxic snowfall, involved Netflix and visual effects (VFX) artists using AI to show a building collapsing in Buenos Aires. "Using AI-powered tools, they were able to achieve an amazing result with remarkable speed and, in fact, that VFX sequence was completed 10 times faster than it could have been completed with traditional VFX tools and workflows," he said.


Netflix boss says AI effects used in show for first time

BBC News

Netflix says it has used visual effects created by generative artificial intelligence (AI) on screen for the first time in one of its original TV shows. The streaming giant's co-CEO Ted Sarandos said AI, which produces videos and images based on prompts, was used to create a scene of a building collapsing in the Argentine science fiction show, The Eternaut. He praised the technology as an "incredible opportunity to help creators make films and series better, not just cheaper." The use of generative AI is controversial in the entertainment industry and has sparked fears that it will replace the work of humans.


KEN: Knowledge Augmentation and Emotion Guidance Network for Multimodal Fake News Detection

arXiv.org Artificial Intelligence

In recent years, the rampant spread of misinformation on social media has made accurate detection of multimodal fake news a critical research focus. However, previous research has not adequately understood the semantics of images, and models struggle to discern news authenticity with limited textual information. Meanwhile, treating all emotional types of news uniformly without tailored approaches further leads to performance degradation. Therefore, we propose a novel Knowledge Augmentation and Emotion Guidance Network (KEN). On the one hand, we leverage the powerful semantic understanding and extensive world knowledge of large vision-language models (LVLMs). For images, the generated captions provide a comprehensive understanding of image content and scenes, while for text, the retrieved evidence helps break the information silos caused by the closed and limited text and context. On the other hand, we consider inter-class differences between different emotional types of news through balanced learning, achieving fine-grained modeling of the relationship between emotional types and authenticity. Extensive experiments on two real-world datasets demonstrate the superiority of our KEN.


Generating Synthetic Data via Augmentations for Improved Facial Resemblance in DreamBooth and InstantID

arXiv.org Artificial Intelligence

Personalizing Stable Diffusion for professional portrait generation from amateur photos faces challenges in maintaining facial resemblance. This paper evaluates the impact of augmentation strategies on two personalization methods: DreamBooth and InstantID. We compare classical augmentations (flipping, cropping, color adjustments) with generative augmentation using InstantID's synthetic images to enrich training data. Using SDXL and a new FaceDistance metric based on FaceNet, we quantitatively assess facial similarity. Results show classical augmentations can cause artifacts harming identity retention, while InstantID improves fidelity when balanced with real images to avoid overfitting. A user study with 97 participants confirms high photorealism and preferences for InstantID's polished look versus DreamBooth's identity accuracy. Our findings inform effective augmentation strategies for personalized text-to-image generation.


crowd-hpo: Realistic Hyperparameter Optimization and Benchmarking for Learning from Crowds with Noisy Labels

arXiv.org Artificial Intelligence

Crowdworking is a cost-efficient solution for acquiring class labels. Since these labels are subject to noise, various approaches to learning from crowds have been proposed. Typically, these approaches are evaluated with default hyperparameter configurations, resulting in unfair and suboptimal performance, or with hyperparameter configurations tuned via a validation set with ground truth class labels, representing an often unrealistic scenario. Moreover, both setups can produce different approach rankings, complicating study comparisons. Therefore, we introduce crowd-hpo as a framework for evaluating approaches to learning from crowds in combination with criteria to select well-performing hyperparameter configurations with access only to noisy crowd-labeled validation data. Extensive experiments with neural networks demonstrate that these criteria select hyperparameter configurations that improve the generalization performance of learning-from-crowds approaches, measured on separate test sets with ground truth labels. Hence, incorporating such criteria into experimental studies is essential for enabling fairer and more realistic benchmarking.


Large Language Models' Internal Perception of Symbolic Music

arXiv.org Artificial Intelligence

Large language models (LLMs) excel at modeling relationships between strings in natural language and have shown promise in extending to other symbolic domains like coding or mathematics. However, the extent to which they implicitly model symbolic music remains underexplored. This paper investigates how LLMs represent musical concepts by generating symbolic music data from textual prompts describing combinations of genres and styles, and evaluating their utility through recognition and generation tasks. We produce a dataset of LLM-generated MIDI files without relying on explicit musical training. We then train neural networks entirely on this LLM-generated MIDI dataset and perform genre and style classification as well as melody completion, benchmarking their performance against established models. Our results demonstrate that LLMs can infer rudimentary musical structures and temporal relationships from text, highlighting both their potential to implicitly encode musical patterns and their limitations due to a lack of explicit musical context, shedding light on their generative capabilities for symbolic music.


TransEvalnia: Reasoning-based Evaluation and Ranking of Translations

arXiv.org Artificial Intelligence

We present TransEvalnia, a prompting-based translation evaluation and ranking system that uses reasoning in performing its evaluations and ranking. This system presents fine-grained evaluations based on a subset of the Multidimensional Quality Metrics (https://themqm.org/), returns an assessment of which translation it deems the best, and provides numerical scores for the various dimensions and for the overall translation. We show that TransEvalnia performs as well as or better than the state-of-the-art MT-Ranker (Moosa et al. 2024) on our own English-Japanese data as well as several language pairs from various WMT shared tasks. Using Anthropic's Claude-3.5-Sonnet and Qwen-2.5-72B-Instruct as the evaluation LLMs, we show that the evaluations returned are deemed highly acceptable to human raters, and that the scores assigned to the translations by Sonnet, as well as other LLMs, correlate well with scores assigned by the human raters. We also note the sensitivity of our system -- as well as MT-Ranker -- to the order in which the translations are presented, and we propose methods to address this position bias. All data, including the system's evaluation and reasoning, human assessments, as well as code, are released.


Social and Political Framing in Search Engine Results

arXiv.org Artificial Intelligence

Search engines play a crucial role in shaping public discourse by influencing how information is accessed and framed. While prior research has extensively examined various dimensions of search bias -- such as content prioritization, indexical bias, political polarization, and sources of bias -- an important question remains underexplored: how do search engines and ideologically-motivated user queries contribute to bias in search results. This study analyzes the outputs of major search engines using a dataset of political and social topics. The findings reveal that search engines not only prioritize content in ways that reflect underlying biases but also that ideologically-driven user queries exacerbate these biases, resulting in the amplification of specific narratives. Moreover, significant differences were observed across search engines in terms of the sources they prioritize. These results suggest that search engines may play a pivotal role in shaping public perceptions by reinforcing ideological divides, thereby contributing to the broader issue of information polarization.


How I'd set up a Roku for a 90-year-old

PCWorld

A couple weeks ago, a reader asked me about the best streaming TV setup for a 90-year-old neighbor who is not tech-savvy. My mind immediately jumped to Roku, whose smart TVs and streaming players have always emphasized simplicity. But I also know that Roku's streaming platform has become more complicated in recent years, and its once-basic menu system is not what it used to be. While I'd still recommend Roku to someone who's on the lower end of the tech learning curve, our neighbor in this scenario would benefit from some out-of-the-box settings tweaks. Whether you're setting up a Roku for yourself or someone else, here's how to make the streamer as easy to use as possible: Roku is now requiring new users to put a payment method on file during setup.