Music Recommendation with Large Language Models: Challenges, Opportunities, and Evaluation
Epure, Elena V., Deldjoo, Yashar, Sguerra, Bruno, Schedl, Markus, Moussallam, Manuel
arXiv.org Artificial Intelligence
Music Recommender Systems (MRS) have long relied on an information-retrieval framing, where progress is measured mainly through accuracy on retrieval-oriented subtasks. While effective, this reductionist paradigm struggles to address the deeper question of what makes a good recommendation, and attempts to broaden evaluation, through user studies or fairness analyses, have had limited impact. The emergence of Large Language Models (LLMs) disrupts this framework: LLMs are generative rather than ranking-based, making standard accuracy metrics questionable. They also introduce challenges such as hallucinations, knowledge cutoffs, non-determinism, and opaque training data, rendering traditional train/test protocols difficult to interpret. At the same time, LLMs create new opportunities, enabling natural-language interaction and even allowing models to act as evaluators. This work argues that the shift toward LLM-driven MRS requires rethinking evaluation. We first review how LLMs reshape user modeling, item modeling, and natural-language recommendation in music. We then examine evaluation practices from NLP, highlighting methodologies and open challenges relevant to MRS. Finally, we synthesize insights, focusing on how LLM prompting applies to MRS, to outline a structured set of success and risk dimensions. Our goal is to provide the MRS community with an updated, pedagogical, and cross-disciplinary perspective on evaluation.
Nov-21-2025
- Country:
  - Asia
    - China > Heilongjiang Province
      - Daqing (0.04)
    - Indonesia > Bali (0.04)
    - Middle East
      - Jordan (0.04)
      - UAE > Abu Dhabi Emirate
        - Abu Dhabi (0.14)
    - Myanmar > Tanintharyi Region
      - Dawei (0.04)
    - Singapore > Central Region
      - Singapore (0.04)
    - Thailand > Bangkok
      - Bangkok (0.04)
  - Europe
    - Czechia > Prague (0.04)
    - France > Île-de-France
    - Croatia > Dubrovnik-Neretva County
      - Dubrovnik (0.04)
    - Germany > Lower Saxony
      - Hanover (0.04)
    - Italy
    - United Kingdom > England
      - Greater London > London (0.04)
    - Netherlands > Utrecht (0.04)
    - Middle East > Malta (0.04)
    - Austria
      - Upper Austria > Linz (0.04)
      - Vienna (0.14)
  - North America
    - Canada
      - British Columbia > Metro Vancouver Regional District
        - Vancouver (0.04)
      - Ontario > Toronto (0.04)
    - Dominican Republic (0.04)
    - Mexico > Mexico City
      - Mexico City (0.04)
    - United States
      - Florida > Miami-Dade County
        - Miami (0.04)
      - Illinois > Cook County
        - Chicago (0.04)
      - New Mexico > Bernalillo County
        - Albuquerque (0.04)
      - New York > New York County
        - New York City (0.04)
  - Oceania > Australia
    - New South Wales > Sydney (0.04)
  - South America > Brazil
    - Rio de Janeiro > Rio de Janeiro (0.04)
- Genre:
  - Overview (1.00)
  - Questionnaire & Opinion Survey (0.86)
  - Research Report > New Finding (0.92)
- Industry:
  - Leisure & Entertainment (1.00)
  - Media > Music (1.00)