Adaptive Reduced Rank Regression

Neural Information Processing Systems

This setting frequently arises in practice because it is often straightforward to perform feature engineering and produce a large number of potentially useful features in many machine learning problems. For example, in a typical equity forecasting model, n is around 3,000 (i.e., using 10 years of market data), whereas the number of potentially relevant features can be on the order of thousands [36, 24, 26, 12].
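In this many-features, few-samples regime, constraining the rank of the coefficient matrix is what makes multi-response regression tractable. As a rough illustration only (a classical reduced rank regression, not the paper's adaptive method; all variable names are hypothetical), the standard estimator can be sketched as an ordinary least squares fit followed by an SVD-based rank truncation:

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical reduced rank regression: OLS fit, then project the fitted
    values onto the top-`rank` right singular directions."""
    # Multi-response OLS estimate via least squares (numerically stable)
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # SVD of the fitted values; Vt rows span the response-space directions
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    P = Vt[:rank].T @ Vt[:rank]   # rank-constrained projector
    return B_ols @ P              # coefficient matrix with rank <= `rank`

# Small synthetic check: a coefficient matrix of true rank 2
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
B_true = rng.normal(size=(10, 2)) @ rng.normal(size=(2, 6))  # rank-2 target
Y = X @ B_true + 0.01 * rng.normal(size=(200, 6))
B_hat = reduced_rank_regression(X, Y, rank=2)
print(np.linalg.matrix_rank(B_hat))  # 2
```

The rank constraint trades a small amount of bias for a large variance reduction, which is exactly what helps when the feature count rivals or exceeds n.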


No, the Freecash App Won't Pay You to Scroll TikTok

WIRED

Freecash will actually pay money out to users, but not for watching videos. This misleading marketing coincides with the app's rising popularity. I first encountered the Freecash app after clicking on a sponsored TikTok video with dubious claims. The advertisement didn't promote the app by name; rather, it showed a young woman expressing her excitement about seemingly getting hired by TikTok at $35 an hour to watch videos on her "For You" page. When I tapped the link to "order now," it sent me to a website with TikTok and Freecash logos, featuring a download link for the Freecash app.


Elon Musk's stubborn spin on Grok's sexualized images controversy

The Guardian

Elon Musk has been promoting Grok's popularity as if it were a piece of productivity software. Today, we discuss Elon Musk's rosy depiction of Grok's image generation controversy; the seven-figure panic among Silicon Valley billionaires over a proposed wealth tax in California, though with one notable exception; and how AI and robotics have revitalized the Consumer Electronics Showcase. The firestorm over the Grok AI tool has been raging for more than a week now, and it shows no signs of dying down. Last week, I wrote about the rising backlash against Elon Musk's Grok AI tool, which in recent weeks has allowed users to generate thousands of sexualized images of women.


CRAG - Comprehensive RAG Benchmark

Neural Information Processing Systems

Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Models' (LLMs') lack of knowledge. Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search. CRAG is designed to encapsulate a diverse array of questions across five domains and eight question categories, reflecting varied entity popularity from popular to long-tail, and temporal dynamisms ranging from years to seconds. Our evaluation on this benchmark highlights the gap to fully trustworthy QA.


Source Coverage and Citation Bias in LLM-based vs. Traditional Search Engines

Zhang, Peixian, Ye, Qiming, Peng, Zifan, Garimella, Kiran, Tyson, Gareth

arXiv.org Artificial Intelligence

LLM-based Search Engines (LLM-SEs) introduce a new paradigm for information seeking. Unlike Traditional Search Engines (TSEs) (e.g., Google), these systems summarize results, often providing limited citation transparency. The implications of this shift remain largely unexplored, yet it raises key questions regarding trust and transparency. In this paper, we present a large-scale empirical study of LLM-SEs, analyzing 55,936 queries and the corresponding search results across six LLM-SEs and two TSEs. We confirm that LLM-SEs cite domain resources with greater diversity than TSEs; indeed, 37% of domains are unique to LLM-SEs. However, certain risks persist: LLM-SEs do not outperform TSEs on credibility, political neutrality, and safety metrics. Finally, to understand the selection criteria of LLM-SEs, we perform a feature-based analysis to identify key factors influencing source choice. Our findings provide actionable insights for end users, website owners, and developers.


Lyrics Matter: Exploiting the Power of Learnt Representations for Music Popularity Prediction

Choudhary, Yash, Rao, Preeti, Bhattacharyya, Pushpak

arXiv.org Artificial Intelligence

Accurately predicting music popularity is a critical challenge in the music industry, offering benefits to artists, producers, and streaming platforms. Prior research has largely focused on audio features, social metadata, or model architectures. This work addresses the under-explored role of lyrics in predicting popularity. We present an automated pipeline that uses an LLM to extract high-dimensional lyric embeddings, capturing semantic, syntactic, and sequential information. These features are integrated into HitMusicLyricNet, a multimodal architecture that combines audio, lyrics, and social metadata for popularity score prediction in the range 0-100. Our method outperforms existing baselines on the SpotGenTrack dataset, which contains over 100,000 tracks, achieving 9% and 20% improvements in MAE and MSE, respectively. Ablation confirms that gains arise from our LLM-driven lyrics feature pipeline (LyricsAENet), underscoring the value of dense lyric representations.
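The general shape of such a multimodal pipeline, concatenating per-modality feature vectors and regressing a bounded 0-100 score, can be sketched in a few lines. This is a minimal illustration with synthetic stand-in features, not the HitMusicLyricNet or LyricsAENet architectures; all dimensions and names below are assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the three modalities (dimensions are illustrative)
rng = np.random.default_rng(0)
n = 500
lyric_emb = rng.normal(size=(n, 64))    # stand-in for LLM-derived lyric embeddings
audio_feat = rng.normal(size=(n, 16))   # stand-in for audio descriptors
social_meta = rng.normal(size=(n, 8))   # stand-in for social metadata

# Early fusion by concatenation, then a small MLP regressor
X = np.hstack([lyric_emb, audio_feat, social_meta])
y = np.clip(50 + 10 * lyric_emb[:, 0] + 5 * audio_feat[:, 0], 0, 100)  # toy score

Xs = StandardScaler().fit_transform(X)
model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=1000, random_state=0)
model.fit(Xs, y)
pred = np.clip(model.predict(Xs), 0, 100)  # keep predictions in the 0-100 range
```

Concatenation is the simplest fusion strategy; the paper's gains from dense lyric representations suggest the lyric block of the feature vector carries signal the other modalities lack.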