abc notation
HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages
Wang, Zhilin, Zeng, Jiaqi, Delalleau, Olivier, Shin, Hoo-Chang, Soares, Felipe, Bukharin, Alexander, Evans, Ellie, Dong, Yi, Kuchaiev, Oleksii
Preference datasets are essential for training general-domain, instruction-following language models with Reinforcement Learning from Human Feedback (RLHF). Each subsequent data release raises expectations for future data collection, meaning there is a constant need to advance the quality and diversity of openly available preference data. To address this need, we introduce HelpSteer3-Preference, a permissively licensed (CC-BY-4.0), high-quality, human-annotated preference dataset comprising over 40,000 samples. These samples span diverse real-world applications of large language models (LLMs), including tasks relating to STEM, coding, and multilingual scenarios. Using HelpSteer3-Preference, we train Reward Models (RMs) that achieve top performance on RM-Bench (82.4%) and JudgeBench (73.7%). This represents a substantial improvement (~10% absolute) over the previously best-reported results from existing RMs. We also demonstrate how HelpSteer3-Preference can be applied to train Generative RMs, and how policy models can be aligned with RLHF using our RMs. Dataset (CC-BY-4.0): https://huggingface.co/datasets/nvidia/HelpSteer3#preference Models (NVIDIA Open Model): https://huggingface.co/collections/nvidia/reward-models-68377c5955575f71fcc7a2a3
- Europe > Spain (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > Middle East > Jordan (0.04)
- (11 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Information Technology (1.00)
- Education > Educational Setting > K-12 Education (0.45)
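Preference pairs like those in HelpSteer3-Preference are commonly turned into a reward model with a Bradley-Terry pairwise objective, which pushes the reward of the chosen response above that of the rejected one. A minimal sketch (the reward scores below are made up for illustration; this is not the paper's training code):

```python
import numpy as np

def bradley_terry_loss(r_chosen, r_rejected):
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected),
    averaged over the batch. Written via log1p(exp(-margin)) for stability."""
    margin = np.asarray(r_chosen) - np.asarray(r_rejected)
    return float(np.mean(np.log1p(np.exp(-margin))))

# Rewards an RM might assign to (chosen, rejected) responses for three prompts.
loss = bradley_terry_loss([2.0, 1.5, 0.3], [1.0, 1.8, -0.2])
```

When chosen and rejected scores are equal, the loss is exactly log 2; it shrinks as the RM learns to separate the pair in the preferred direction.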
HNote: Extending YNote with Hexadecimal Encoding for Fine-Tuning LLMs in Music Modeling
Chu, Hung-Ying, Wei, Shao-Yu, Chen, Guan-Wei, Hung, Tzu-Wei, Tsai, ChengYang, Lin, Yu-Cheng
Recent advances in large language models (LLMs) have created new opportunities for symbolic music generation. However, existing formats such as MIDI, ABC, and MusicXML are either overly complex or structurally inconsistent, limiting their suitability for token-based learning architectures. To address these challenges, we propose HNote, a novel hexadecimal-based notation system extended from YNote, which encodes both pitch and duration within a fixed 32-unit measure framework. This design ensures alignment, reduces ambiguity, and is directly compatible with LLM architectures. We converted 12,300 Jiangnan-style songs, derived from traditional folk pieces, from YNote into HNote, and fine-tuned LLaMA-3.1 (8B) using parameter-efficient LoRA. Experimental results show that HNote achieves a syntactic correctness rate of 82.5%, and BLEU and ROUGE evaluations demonstrate strong symbolic and structural similarity, producing stylistically coherent compositions. This study establishes HNote as an effective framework for integrating LLMs with cultural music modeling.
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
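The abstract specifies only that HNote encodes pitch and duration on a fixed 32-unit grid per measure; the scheme below (one hex digit per pitch class, repeated once per time unit, with `0` for rests) is our own illustrative assumption, not the published HNote format:

```python
# Hypothetical fixed-grid hex encoding in the spirit of HNote.
# The pitch-to-hex mapping and the rest symbol are illustrative assumptions.
PITCH_TO_HEX = {"C": "1", "D": "2", "E": "3", "F": "4",
                "G": "5", "A": "6", "B": "7", "rest": "0"}

def encode_measure(notes, units_per_measure=32):
    """notes: list of (pitch_name, duration_in_units) tuples.
    Each note's hex digit is repeated for its duration, so every measure
    serializes to exactly 32 characters and stays aligned across samples."""
    cells = []
    for pitch, dur in notes:
        cells.extend(PITCH_TO_HEX[pitch] * dur)
    if len(cells) != units_per_measure:
        raise ValueError("measure must fill exactly 32 units")
    return "".join(cells)

# Two half notes (16 units each) fill one 32-unit measure.
measure = encode_measure([("C", 16), ("G", 16)])
```

The fixed-length property is what makes such a grid attractive for token-based models: every measure is the same width, so no length disambiguation is needed.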
Towards an AI Musician: Synthesizing Sheet Music Problems for Musical Reasoning
Wang, Zhilin, Yang, Zhe, Luo, Yun, Li, Yafu, Qu, Xiaoye, Qiao, Ziqian, Zhang, Haoran, Zhan, Runzhe, Wong, Derek F., Zhou, Jizhe, Cheng, Yu
Enhancing the ability of Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) to interpret sheet music is a crucial step toward building AI musicians. However, current research lacks both evaluation benchmarks and training data for sheet music reasoning. Inspired by mathematics, where simple operations yield infinite verifiable problems, we introduce a novel approach that treats core music theory rules, such as those governing beats and intervals, as programmatic functions to systematically synthesize a vast and diverse corpus of sheet music reasoning problems. This approach allows us to introduce a data synthesis framework that generates verifiable sheet music questions in both textual and visual modalities, leading to the Synthetic Sheet Music Reasoning Benchmark (SSMR-Bench) and a complementary training set. Evaluation results on SSMR-Bench highlight the key role reasoning plays in interpreting sheet music, while also pointing out the ongoing challenges in understanding sheet music in a visual format. By leveraging synthetic data for RLVR (Reinforcement Learning with Verifiable Rewards), all models show significant improvements on SSMR-Bench. They also demonstrate considerable advances on previously established human-crafted benchmarks, such as MusicTheoryBench and the music subset of MMMU. Finally, our results show that the enhanced reasoning ability can also facilitate music composition.
"The sheet music is the language of musicians." Recent advancements in Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have inspired researchers to explore the potential of developing AI musicians (Qu et al., 2025; Bradshaw & Colton, 2025; Wang et al., 2024). Given that sheet music is the universal language of musicians, the ability to read and interpret it is an essential step for AI musicians (Yuan et al., 2024; Wang et al., 2025).
As illustrated in Figure 1, sheet music reasoning differs fundamentally from Music Knowledge QA (Li et al., 2024), which evaluates memorized knowledge, and from sheet music recognition (Chen et al., 2025a), which focuses on identifying notation from images.
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > Monaco (0.04)
- (4 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
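The core idea above, treating a music-theory rule as a programmatic function that emits verifiable Q&A pairs, can be sketched for intervals. The semitone arithmetic is standard theory; the question template and the choice to skip the tritone are our own illustrative simplifications, not the paper's framework:

```python
import random

SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
INTERVAL = {1: "minor 2nd", 2: "major 2nd", 3: "minor 3rd", 4: "major 3rd",
            5: "perfect 4th", 7: "perfect 5th", 8: "minor 6th",
            9: "major 6th", 10: "minor 7th", 11: "major 7th"}

def interval_name(low, high):
    """Name the ascending interval from `low` up to `high`, reduced to one
    octave. Returns None for the tritone (6 semitones), which we skip."""
    dist = (SEMITONE[high] - SEMITONE[low]) % 12
    return INTERVAL.get(dist)

def make_question(rng):
    """One rule, arbitrarily many verifiable problems: sample two note
    names and pair the templated question with its provable answer."""
    a, b = rng.sample(list(SEMITONE), 2)
    answer = interval_name(a, b)
    if answer is None:                       # resample on a tritone
        return make_question(rng)
    return f"What is the interval from {a} up to {b}?", answer

question, answer = make_question(random.Random(0))
```

Because the answer is computed rather than annotated, every generated item is verifiable by construction, which is exactly what RL with verifiable rewards needs.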
Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach
Zhao, Zijian, Jin, Dian, Zhou, Zijing
Recently, Image-to-Music (I2M) generation has garnered significant attention, with potential applications in fields such as gaming, advertising, and multi-modal art creation. However, due to the ambiguous and subjective nature of I2M tasks, most end-to-end methods lack interpretability, leaving users puzzled about the generation results. Even methods based on emotion mapping face controversy, as emotion represents only a singular aspect of art. Additionally, most learning-based methods require substantial computational resources and large datasets for training, hindering accessibility for common users. To address these challenges, we propose the first Vision Language Model (VLM)-based I2M framework that offers high interpretability and low computational cost. Specifically, we utilize ABC notation to bridge the text and music modalities, enabling the VLM to generate music using natural language. We then apply multi-modal Retrieval-Augmented Generation (RAG) and self-refinement techniques to allow the VLM to produce high-quality music without external training. Furthermore, we leverage the generated motivations in text and the attention maps from the VLM to provide explanations for the generated results in both text and image modalities. To validate our method, we conduct both human studies and machine evaluations, where our method outperforms others in terms of music quality and music-image consistency, indicating promising results. Our code is available at https://github.com/RS2002/Image2Music .
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
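The framework above hinges on ABC notation being plain text a VLM can emit and refine in natural language. Below is a minimal valid ABC tune plus a crude header sanity check of our own; the check is an illustration, not part of the paper's pipeline:

```python
# A minimal ABC tune: X (index), T (title), M (meter), L (default note
# length) headers, then K (key) immediately before the note body.
abc_tune = """X:1
T:Example
M:4/4
L:1/8
K:C
CDEF GABc | cBAG FEDC |]"""

def looks_like_abc(text):
    """Very rough plausibility check for model output: the required X: and
    K: headers are present, and the tune does not end on a header line."""
    lines = text.strip().splitlines()
    headers = [ln[:2] for ln in lines if len(ln) > 1 and ln[1] == ":"]
    return "X:" in headers and "K:" in headers and lines[-1][1:2] != ":"
```

A lightweight check like this is one way a self-refinement loop could reject malformed generations before handing them to a renderer or synthesizer.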
CoComposer: LLM Multi-agent Collaborative Music Composition
Xing, Peiwen, Plaat, Aske, van Stein, Niki
Existing AI music composition tools are limited in generation duration, musical quality, and controllability. We introduce CoComposer, a multi-agent system that consists of five collaborating agents, each with a task based on the traditional music composition workflow. Using the AudioBox-Aesthetics system, we experimentally evaluate CoComposer on four compositional criteria. We test with three LLMs (GPT-4o, DeepSeek-V3-0324, Gemini-2.5-Flash), and find (1) that CoComposer outperforms existing multi-agent LLM-based systems in music quality, and (2) that it outperforms a single-agent system in production complexity. Compared to the non-LLM MusicLM, CoComposer offers better interpretability and editability, although MusicLM still produces better music.
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Via Score to Performance: Efficient Human-Controllable Long Song Generation with Bar-Level Symbolic Notation
Wang, Tongxi, Yu, Yang, Wang, Qing, Qian, Junlang
Song generation is regarded as the most challenging problem in music AIGC; nonetheless, existing approaches have yet to fully overcome four persistent limitations: controllability, generalizability, perceptual quality, and duration. We argue that these shortcomings stem primarily from the prevailing paradigm of attempting to learn music theory directly from raw audio, a task that remains prohibitively difficult for current models. To address this, we present Bar-level AI Composing Helper (BACH), the first model explicitly designed for song generation through human-editable symbolic scores. BACH introduces a tokenization strategy and a symbolic generative procedure tailored to hierarchical song structure. Consequently, it achieves substantial gains in the efficiency, duration, and perceptual quality of song generation. Experiments demonstrate that BACH, with a small model size, establishes a new SOTA among all publicly reported song generation systems, even surpassing commercial solutions such as Suno. Human evaluations further confirm its superiority across multiple subjective metrics.
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- Asia > Singapore (0.04)
- Asia > China (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Exploring Tokenization Methods for Multitrack Sheet Music Generation
Wang, Yashan, Wu, Shangda, Du, Xingjian, Sun, Maosong
This study explores the tokenization of multitrack sheet music in ABC notation, introducing two methods: bar-stream and line-stream patching. We compare these methods against existing techniques, including bar patching, byte patching, and Byte Pair Encoding (BPE). Experimental results show that bar-stream patching performs best overall in both computational efficiency and the musicality of the generated compositions, making it a promising tokenization strategy for sheet music generation.
- North America > United States > California > San Francisco County > San Francisco (0.15)
- Asia > China (0.05)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
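Bar patching generally treats barlines as natural patch boundaries; a sketch of that idea for ABC text follows. The `|` delimiter is standard ABC, but the fixed patch length, padding byte, and truncation policy here are our own assumptions rather than the paper's bar-stream layout:

```python
def bar_patches(abc_body, patch_len=16, pad=0):
    """Split an ABC tune body at '|' barlines into fixed-length byte patches.
    Each bar is UTF-8 encoded, truncated to patch_len bytes, and right-padded
    so every patch has the same width for a patch-level model."""
    patches = []
    for bar in abc_body.split("|"):
        bar = bar.strip()
        if not bar:                       # skip empty segments (e.g. trailing '|')
            continue
        data = list(bar.encode("utf-8"))[:patch_len]
        data += [pad] * (patch_len - len(data))
        patches.append(data)
    return patches

patches = bar_patches("CDEF GABc | cBAG FEDC |")
```

Grouping bytes by bar keeps musically meaningful units intact, which is the intuition behind bar-level patching outperforming raw byte patching or BPE.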
Audio-to-Score Conversion Model Based on Whisper Methodology
This thesis develops a Transformer model based on Whisper, which extracts melodies and chords from music audio and records them in ABC notation. A comprehensive data processing workflow is customized for ABC notation, including data cleansing, formatting, and conversion, and a mutation mechanism is implemented to increase the diversity and quality of training data. The thesis introduces the "Orpheus' Score", a custom notation system that converts music information into tokens, along with a custom vocabulary and a correspondingly trained tokenizer. Experiments show that the model significantly improves accuracy and performance compared to traditional algorithms. While providing a convenient audio-to-score tool for music enthusiasts, this work also provides new ideas and tools for research in music information processing.
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models
Wu, Shangda, Wang, Yashan, Yuan, Ruibin, Guo, Zhancheng, Tan, Xu, Zhang, Ge, Zhou, Monan, Chen, Jing, Mu, Xuefeng, Gao, Yuejie, Dong, Yuanliang, Liu, Jiafeng, Li, Xiaobing, Yu, Feng, Sun, Maosong
Current music information retrieval systems face challenges in managing linguistic diversity and integrating various musical modalities. These limitations reduce their effectiveness in a global, multimodal music environment. To address these issues, we introduce CLaMP 2, a system compatible with 101 languages that supports both ABC notation (a text-based musical notation format) and MIDI (Musical Instrument Digital Interface) for music information retrieval. CLaMP 2, pre-trained on 1.5 million ABC-MIDI-text triplets, includes a multilingual text encoder and a multimodal music encoder aligned via contrastive learning. By leveraging large language models, we obtain refined and consistent multilingual descriptions at scale, significantly reducing textual noise and balancing language distribution. Our experiments show that CLaMP 2 achieves state-of-the-art results in both multilingual semantic search and music classification across modalities, thus establishing a new standard for inclusive and global music information retrieval.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Rhode Island (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (14 more...)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
- Education > Curriculum > Subject-Specific Education (0.34)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
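CLaMP 2 aligns its text and music encoders via contrastive learning. A CLIP-style symmetric InfoNCE objective over a batch of paired embeddings can be sketched as follows; the embedding dimensions, temperature, and NumPy formulation are illustrative, not the paper's implementation:

```python
import numpy as np

def clip_style_loss(text_emb, music_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired (text, music) embeddings.
    Matching pairs sit on the diagonal of the cosine-similarity matrix;
    the loss pulls them together and pushes mismatched pairs apart."""
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    m = music_emb / np.linalg.norm(music_emb, axis=1, keepdims=True)
    logits = t @ m.T / temperature                  # (batch, batch)
    labels = np.arange(len(logits))
    log_softmax = lambda x: x - np.log(np.exp(x).sum(axis=1, keepdims=True))
    loss_t2m = -log_softmax(logits)[labels, labels].mean()   # text -> music
    loss_m2t = -log_softmax(logits.T)[labels, labels].mean() # music -> text
    return (loss_t2m + loss_m2t) / 2

rng = np.random.default_rng(0)
loss = clip_style_loss(rng.normal(size=(4, 8)), rng.normal(size=(4, 8)))
```

After training with such an objective, either encoder's embeddings can be used alone, which is what enables cross-modal semantic search between text queries and ABC or MIDI entries.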
Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation
Zhou, Ziya, Wu, Yuhang, Wu, Zhiyue, Zhang, Xinyue, Yuan, Ruibin, Ma, Yinghao, Wang, Lu, Benetos, Emmanouil, Xue, Wei, Guo, Yike
Symbolic music, akin to language, can be encoded in discrete symbols. Recent research has extended the application of large language models (LLMs) such as GPT-4 and Llama2 to the symbolic music domain, including understanding and generation. Yet scant research explores the details of how these LLMs perform on advanced music understanding and conditioned generation, especially from the multi-step reasoning perspective, which is a critical aspect in the conditioned, editable, and interactive human-computer co-creation process. This study conducts a thorough investigation of LLMs' capabilities and limitations in symbolic music processing. We identify that current LLMs exhibit poor performance in song-level multi-step music reasoning, and typically fail to leverage learned music knowledge when addressing complex musical tasks. An analysis of the LLMs' responses distinctly highlights their pros and cons. Our findings suggest that advanced musical capability is not intrinsically acquired by LLMs, and future research should focus more on bridging the gap between music knowledge and reasoning, to improve the co-creation experience for musicians.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > China > Hong Kong (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)