lyric
Ranking the ten best Billy Joel songs of all time in honor of The Piano Man's 77th birthday
- North America > United States > New York (0.31)
- Asia > Middle East > Iran (0.24)
- Media > Film (1.00)
- Government > Military (1.00)
- Government > Regional Government > North America Government > United States Government (0.89)
- Leisure & Entertainment > Sports > Tennis (0.55)
- Information Technology > Communications > Social Media (0.71)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.54)
Pop music has gotten sadder over the last 50 years
Analysis of 20,186 songs from the Billboard Top 100 indicates that the lyrics have also become simpler. Debating the merits of today's popular music versus the hits of the past is largely a matter of taste. But regardless of your opinion on the subject, one thing is clear: pop music is objectively darker and more stressful than ever. The compelling statistics are laid out in a study by University of Vienna psychologists, recently published in an academic journal.
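The study's exact simplicity measure isn't named here; one common proxy for lyrical repetitiveness is compressibility. A minimal sketch (the sample lyrics are illustrative, not from the study):

```python
import zlib

def compression_ratio(text: str) -> float:
    """Compressed size over raw size; lower means more repetitive text."""
    raw = text.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)

# Illustrative comparison: a highly repetitive chorus vs. a varied verse.
repetitive = "na na na na na na na na hey hey hey goodbye " * 8
varied = ("Yesterday I wandered past the harbor lights alone, "
          "counting every stranger's face and wishing I were home.")

print(compression_ratio(repetitive))  # noticeably lower than the varied verse
print(compression_ratio(varied))
```

Lower ratios flag the kind of repetitive, lexically simple lyrics the study describes as increasingly common.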
- Europe > Austria > Vienna (0.25)
- North America > United States (0.05)
- Europe > Sweden (0.05)
- (2 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Who Will Top the Charts? Multimodal Music Popularity Prediction via Adaptive Fusion of Modality Experts and Temporal Engagement Modeling
Choudhary, Yash, Rao, Preeti, Bhattacharyya, Pushpak
Predicting a song's commercial success prior to its release remains an open and critical research challenge for the music industry. Early prediction of music popularity informs strategic decisions, creative planning, and marketing. Existing methods suffer from four limitations: (i) temporal dynamics in audio and lyrics are averaged away; (ii) lyrics are represented as a bag of words, disregarding compositional structure and affective semantics; (iii) artist- and song-level historical performance is ignored; and (iv) multimodal fusion approaches rely on simple feature concatenation, resulting in poorly aligned shared representations. To address these limitations, we introduce GAMENet, an end-to-end multimodal deep learning architecture for music popularity prediction. GAMENet integrates modality-specific experts for audio, lyrics, and social metadata through an adaptive gating mechanism. We use audio features from Music4AllOnion processed via OnionEnsembleAENet, a network of autoencoders designed for robust feature extraction; lyric embeddings derived through a large language model pipeline; and newly introduced Career Trajectory Dynamics (CTD) features that capture multi-year artist career momentum and song-level trajectory statistics. Using the Music4All dataset (113k tracks), previously explored in MIR tasks but not popularity prediction, GAMENet achieves a 12% improvement in R^2 over direct multimodal feature concatenation. Spotify audio descriptors alone yield an R^2 of 0.13. Integrating aggregate CTD features increases this to 0.69, with an additional 7% gain from temporal CTD features. We further validate robustness using the SpotGenTrack Popularity Dataset (100k tracks), achieving a 16% improvement over the previous baseline. Extensive ablations confirm the model's effectiveness and the distinct contribution of each modality.
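The abstract doesn't spell out GAMENet's gating mechanism; the general idea of adaptive gated fusion of modality experts can be sketched as a softmax-weighted sum (all names, embeddings, and gate logits below are illustrative, not the paper's values):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def gated_fusion(expert_outputs, gate_logits):
    """Weighted sum of per-modality expert embeddings.

    expert_outputs: dict modality -> embedding (equal-length float lists)
    gate_logits:    dict modality -> raw gate score (in a trained model these
                    come from a small gating network, not constants)
    """
    mods = list(expert_outputs)
    weights = softmax([gate_logits[m] for m in mods])
    dim = len(next(iter(expert_outputs.values())))
    fused = [0.0] * dim
    for w, m in zip(weights, mods):
        for i, v in enumerate(expert_outputs[m]):
            fused[i] += w * v
    return fused, dict(zip(mods, weights))

experts = {"audio": [0.2, 0.8], "lyrics": [0.9, 0.1], "social": [0.5, 0.5]}
logits = {"audio": 1.0, "lyrics": 2.0, "social": 0.5}
fused, weights = gated_fusion(experts, logits)
```

Because the gate weights sum to one, the fused vector is a convex combination of the expert outputs, letting the model lean on whichever modality is most informative per song.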
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Lyrics Matter: Exploiting the Power of Learnt Representations for Music Popularity Prediction
Choudhary, Yash, Rao, Preeti, Bhattacharyya, Pushpak
Accurately predicting music popularity is a critical challenge in the music industry, offering benefits to artists, producers, and streaming platforms. Prior research has largely focused on audio features, social metadata, or model architectures. This work addresses the under-explored role of lyrics in predicting popularity. We present an automated pipeline that uses an LLM to extract high-dimensional lyric embeddings, capturing semantic, syntactic, and sequential information. These features are integrated into HitMusicLyricNet, a multimodal architecture that combines audio, lyrics, and social metadata for popularity score prediction in the range 0-100. Our method outperforms existing baselines on the SpotGenTrack dataset, which contains over 100,000 tracks, achieving 9% and 20% improvements in MAE and MSE, respectively. Ablation confirms that gains arise from our LLM-driven lyrics feature pipeline (LyricsAENet), underscoring the value of dense lyric representations.
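The abstract doesn't describe HitMusicLyricNet's output head; a common way to constrain a regression output to the 0-100 popularity range, together with the two reported metrics, sketched on toy values (none of these numbers come from the paper):

```python
import math

def to_popularity(raw: float) -> float:
    """Squash an unbounded regression output into the 0-100 range."""
    return 100.0 / (1.0 + math.exp(-raw))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy ground-truth popularity scores and raw model outputs.
truth = [72.0, 15.0, 48.0]
preds = [to_popularity(1.0), to_popularity(-1.8), to_popularity(0.1)]
print(mae(truth, preds), mse(truth, preds))
```

MAE penalizes all misses linearly, while MSE punishes large outliers more heavily, which is why the paper reports both.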
- Asia (0.68)
- North America > United States > Minnesota (0.28)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
YingMusic-Singer: Zero-shot Singing Voice Synthesis and Editing with Annotation-free Melody Guidance
Zheng, Junjie, Hao, Chunbo, Ma, Guobin, Zhang, Xiaoyu, Chen, Gongyu, Ding, Chaofan, Chen, Zihao, Xie, Lei
Singing Voice Synthesis (SVS) remains constrained in practical deployment due to its strong dependence on accurate phoneme-level alignment and manually annotated melody contours, requirements that are resource-intensive and hinder scalability. To overcome these limitations, we propose a melody-driven SVS framework capable of synthesizing arbitrary lyrics following any reference melody, without relying on phoneme-level alignment. Our method builds on a Diffusion Transformer (DiT) architecture, enhanced with a dedicated melody extraction module that derives melody representations directly from reference audio. To ensure robust melody encoding, we employ a teacher model to guide the optimization of the melody extractor, alongside an implicit alignment mechanism that enforces similarity distribution constraints for improved melodic stability and coherence. Additionally, we refine duration modeling using weakly annotated song data and introduce a Flow-GRPO reinforcement learning strategy with a multi-objective reward function to jointly enhance pronunciation clarity and melodic fidelity. Experiments show that our model achieves superior performance over existing approaches in both objective measures and subjective listening tests, especially in zero-shot and lyric adaptation settings, while maintaining high audio quality without manual annotation. This work offers a practical and scalable solution for advancing data-efficient singing voice synthesis. To support reproducibility, we release our inference code and model checkpoints.
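Flow-GRPO's details are beyond an abstract, but the group-relative advantage at the heart of GRPO-style training, combined with a two-term multi-objective reward, can be sketched as follows (the reward weights and scores are illustrative assumptions, not the paper's):

```python
import statistics

def combined_reward(pronunciation: float, melody: float,
                    w_pron: float = 0.5, w_mel: float = 0.5) -> float:
    """Weighted multi-objective reward; weights are illustrative."""
    return w_pron * pronunciation + w_mel * melody

def group_relative_advantages(rewards):
    """GRPO-style: standardize each sample's reward against its group."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0
    return [(r - mu) / sigma for r in rewards]

# One prompt, four sampled renditions scored on pronunciation and melody.
scores = [(0.9, 0.7), (0.6, 0.8), (0.4, 0.4), (0.8, 0.9)]
rewards = [combined_reward(p, m) for p, m in scores]
advs = group_relative_advantages(rewards)
```

Samples above the group mean get positive advantages and are reinforced; below-average samples are pushed down, which jointly shapes both reward terms without a learned critic.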
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
A Comparative Study of LLM Prompting and Fine-Tuning for Cross-genre Authorship Attribution on Chinese Lyrics
Li, Yuxin, Xu, Lorraine, Wang, Meng Fan
We propose a novel study on authorship attribution for Chinese lyrics, a domain where clean, public datasets are sorely lacking. Our contributions are twofold: (1) we create a new, balanced dataset of Chinese lyrics spanning multiple genres, and (2) we develop and fine-tune a domain-specific model, comparing its performance against zero-shot inference using the DeepSeek LLM. We test two central hypotheses. First, we hypothesize that a fine-tuned model will outperform a zero-shot LLM baseline. Second, we hypothesize that performance is genre-dependent. Our experiments strongly confirm Hypothesis 2: structured genres (e.g. Folklore & Tradition) yield significantly higher attribution accuracy than more abstract genres (e.g. Love & Romance). Hypothesis 1 receives only partial support: fine-tuning improves robustness and generalization in Test1 (real-world data and difficult genres), but offers limited or ambiguous gains in Test2, a smaller, synthetically-augmented set. We show that the design limitations of Test2 (e.g., label imbalance, shallow lexical differences, and narrow genre sampling) can obscure the true effectiveness of fine-tuning. Our work establishes the first benchmark for cross-genre Chinese lyric attribution, highlights the importance of genre-sensitive evaluation, and provides a public dataset and analytical framework for future research. We conclude with recommendations: enlarge and diversify test sets, reduce reliance on token-level data augmentation, balance author representation across genres, and investigate domain-adaptive pretraining as a pathway for improved attribution performance.
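As a point of reference (not the paper's method), a classic character n-gram baseline for authorship attribution can be sketched in a few lines; the toy romanized lyric profiles below stand in for the paper's Chinese dataset and fine-tuned model:

```python
from collections import Counter
import math

def char_ngrams(text, n=2):
    """Character bigram counts, a standard stylometric feature."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def attribute(query, profiles):
    """Pick the author whose n-gram profile is closest to the query."""
    q = char_ngrams(query)
    return max(profiles,
               key=lambda author: cosine(q, char_ngrams(profiles[author])))

profiles = {
    "author_a": "moonlight on the river, the river carries moonlight home",
    "author_b": "engines roar and neon burns, the city never sleeps",
}
print(attribute("the river under moonlight", profiles))  # author_a
```

Such surface-level baselines work best exactly where the paper finds attribution easiest: structured genres with distinctive recurring vocabulary.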
MusicAIR: A Multimodal AI Music Generation Framework Powered by an Algorithm-Driven Core
Liao, Callie C., Liao, Duoduo, Zhang, Ellie L.
Recent advances in generative AI have made music generation a prominent research focus. However, many neural-based models rely on large datasets, raising concerns about copyright infringement and high-performance costs. In contrast, we propose MusicAIR, an innovative multimodal AI music generation framework powered by a novel algorithm-driven symbolic music core, effectively mitigating copyright infringement risks. The music core algorithms connect critical lyrical and rhythmic information to automatically derive musical features, creating a complete, coherent melodic score solely from the lyrics. The MusicAIR framework facilitates music generation from lyrics, text, and images. The generated score adheres to established principles of music theory, lyrical structure, and rhythmic conventions. We developed Generate AI Music (GenAIM), a web tool using MusicAIR for lyric-to-song, text-to-music, and image-to-music generation. In our experiments, we evaluated AI-generated music scores produced by the system using both standard music metrics and innovative analysis that compares these compositions with original works. The system achieves an average key confidence of 85%, outperforming human composers at 79%, and aligns closely with established music theory standards, demonstrating its ability to generate diverse, human-like compositions. As a co-pilot tool, GenAIM can serve as a reliable music composition assistant and a possible educational composition tutor, while lowering the entry barrier for aspiring musicians and contributing meaningfully to AI-driven music generation.
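The abstract does not define its "key confidence" metric; a standard way to score how clearly a score expresses a key is Krumhansl-Schmuckler profile correlation, sketched here for major keys only (a symbolic-score system would feed in real note counts; the histogram below is a toy):

```python
import statistics

# Krumhansl-Kessler major key profile, indexed by pitch class from the tonic.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pearson(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def best_major_key(pc_histogram):
    """Correlate the pitch-class histogram against each rotation of the
    major profile; the peak correlation serves as a key-confidence proxy."""
    scores = {}
    for tonic in range(12):
        rotated = [MAJOR[(p - tonic) % 12] for p in range(12)]
        scores[NAMES[tonic]] = pearson(pc_histogram, rotated)
    key = max(scores, key=scores.get)
    return key, scores[key]

# Toy pitch-class histogram dominated by C, E, and G (C-major-ish melody).
hist = [5, 0, 2, 0, 4, 2, 0, 4, 0, 1, 0, 1]
key, conf = best_major_key(hist)
print(key, round(conf, 3))
```

A melody that sticks to one diatonic collection yields a high peak correlation; chromatic or ambiguous writing flattens the profile and lowers the confidence.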
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
That New Hit Song on Spotify? It Was Made by A.I.
Aspiring musicians are churning out tracks using generative artificial intelligence. Some are topping the charts. Nick Arter, a thirty-five-year-old in Washington, D.C., never quite managed to become a professional musician the old-fashioned way. He grew up in Harrisburg, Pennsylvania, in a music-loving family.
- North America > United States > Pennsylvania > Dauphin County > Harrisburg (0.24)
- North America > United States > District of Columbia > Washington (0.24)
- North America > United States > New York (0.05)
- (7 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Bob's Confetti: Phonetic Memorization Attacks in Music and Video Generation
Roh, Jaechul, Novack, Zachary, Peng, Yuefeng, Mireshghallah, Niloofar, Berg-Kirkpatrick, Taylor, Houmansadr, Amir
Generative AI systems for music and video commonly use text-based filters to prevent the regurgitation of copyrighted material. We expose a fundamental flaw in this approach by introducing Adversarial PhoneTic Prompting (APT), a novel attack that bypasses these safeguards by exploiting phonetic memorization. The APT attack replaces iconic lyrics with homophonic but semantically unrelated alternatives (e.g., "mom's spaghetti" becomes "Bob's confetti"), preserving acoustic structure while altering meaning; we identify high-fidelity phonetic matches using the CMU Pronouncing Dictionary. We demonstrate that leading Lyrics-to-Song (L2S) models like SUNO and YuE regenerate songs with striking melodic and rhythmic similarity to their copyrighted originals when prompted with these altered lyrics. More surprisingly, this vulnerability extends across modalities. When prompted with phonetically modified lyrics from a song, a Text-to-Video (T2V) model like Veo 3 reconstructs visual scenes from the original music video, including specific settings and character archetypes, despite the absence of any visual cues in the prompt. Our findings reveal that models memorize deep, structural patterns tied to acoustics, not just verbatim text. This phonetic-to-visual leakage represents a critical vulnerability in transcript-conditioned generative models, rendering simple copyright filters ineffective and raising urgent concerns about the secure deployment of multimodal AI systems. Demo examples are available at our project page (https://jrohsc.github.io/music_attack/).
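The attack's core step, finding homophonic substitutes, can be sketched as edit distance over phoneme sequences; the tiny hand-coded ARPAbet-style table below is a stand-in for the CMU Pronouncing Dictionary lookups the paper uses:

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences (rolling row)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[len(b)]

# Hand-coded pronunciations standing in for cmudict entries.
PHONES = {
    "mom's":     ["M", "AA", "M", "Z"],
    "bob's":     ["B", "AA", "B", "Z"],
    "spaghetti": ["S", "P", "AH", "G", "EH", "T", "IY"],
    "confetti":  ["K", "AH", "N", "F", "EH", "T", "IY"],
    "sunshine":  ["S", "AH", "N", "SH", "AY", "N"],
}

def phonetic_similarity(w1, w2):
    """1.0 means identical pronunciations, 0.0 means fully dissimilar."""
    p1, p2 = PHONES[w1], PHONES[w2]
    return 1.0 - edit_distance(p1, p2) / max(len(p1), len(p2))

print(phonetic_similarity("mom's", "bob's"))
print(phonetic_similarity("spaghetti", "confetti"))   # high overlap
print(phonetic_similarity("spaghetti", "sunshine"))   # low overlap
```

Substitutes that score high on this metric sound like the original lyric to a model's acoustic memory while slipping past verbatim-text copyright filters.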
- North America > United States > New York (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Leveraging Whisper Embeddings for Audio-based Lyrics Matching
Mancini, Eleonora, Serrà, Joan, Torroni, Paolo, Mitsufuji, Yuki
Audio-based lyrics matching can be an appealing alternative to other content-based retrieval approaches, but existing methods often suffer from limited reproducibility and inconsistent baselines. In this work, we introduce WEALY, a fully reproducible pipeline that leverages Whisper decoder embeddings for lyrics matching tasks. WEALY establishes robust and transparent baselines, while also exploring multimodal extensions that integrate textual and acoustic features. Through extensive experiments on standard datasets, we demonstrate that WEALY achieves a performance comparable to state-of-the-art methods that lack reproducibility. In addition, we provide ablation studies and analyses on language robustness, loss functions, and embedding strategies. This work contributes a reliable benchmark for future research, and underscores the potential of speech technologies for music information retrieval tasks.
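At inference time, embedding-based lyrics matching reduces to nearest-neighbor search over a catalog; a minimal sketch, with seeded random vectors standing in for Whisper decoder embeddings:

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two dense embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def rank_candidates(query_emb, catalog):
    """Catalog track ids sorted by similarity to the query, best first."""
    return sorted(catalog,
                  key=lambda tid: cosine(query_emb, catalog[tid]),
                  reverse=True)

random.seed(0)
dim = 8
catalog = {f"track_{i}": [random.gauss(0, 1) for _ in range(dim)]
           for i in range(5)}
# A query embedding near track_3's should retrieve track_3 first.
query = [x + random.gauss(0, 0.01) for x in catalog["track_3"]]
top = rank_candidates(query, catalog)[0]
print(top)
```

In a real WEALY-style pipeline the catalog would hold precomputed embeddings for every song, and the query would be embedded from the incoming audio or lyric text.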
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.47)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)