

A Study on the Data Distribution Gap in Music Emotion Recognition

Ching, Joann, Widmer, Gerhard

arXiv.org Artificial Intelligence

Music Emotion Recognition (MER) is a task deeply connected to human perception, relying heavily on subjective annotations collected from contributors. Prior studies tend to focus on specific musical styles rather than incorporating a diverse range of genres, such as rock and classical, within a single framework. In this paper, we address the task of recognizing emotion from audio content by investigating five datasets with dimensional emotion annotations -- EmoMusic, DEAM, PMEmo, WTC, and WCMED -- which span various musical styles. We demonstrate the problem of out-of-distribution generalization in a systematic experiment. By closely looking at multiple data and feature sets, we provide insight into genre-emotion relationships in existing data and examine potential genre dominance and dataset biases in certain feature representations. Based on these experiments, we arrive at a simple yet effective framework that combines embeddings extracted from the Jukebox model with chroma features and demonstrate how, alongside a combination of several diverse training sets, this permits us to train models with substantially improved cross-dataset generalization capabilities.
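The framework described above combines Jukebox embeddings with chroma features. A minimal sketch of such a fusion, assuming mean-pooling over time followed by concatenation (the embedding dimensionality and pooling choice here are illustrative assumptions, not the paper's exact configuration):

```python
import numpy as np

# Hypothetical dimensions: a high-dimensional Jukebox embedding per frame
# and a 12-d chroma vector per frame. Real extraction would come from the
# pretrained Jukebox model and a chroma extractor such as librosa.
JUKEBOX_DIM, CHROMA_DIM = 4800, 12

def combine_features(jukebox_emb, chroma):
    """Mean-pool each (frames x dims) feature over time, then concatenate."""
    return np.concatenate([jukebox_emb.mean(axis=0), chroma.mean(axis=0)])

# Dummy stand-ins for real extracted features (frames x dims).
jukebox_emb = np.random.randn(30, JUKEBOX_DIM)
chroma = np.random.rand(30, CHROMA_DIM)

x = combine_features(jukebox_emb, chroma)
print(x.shape)  # (4812,)
```

The concatenated vector would then feed a regressor predicting the dimensional (valence/arousal) annotations.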


A Novel Audio Representation for Music Genre Identification in MIR

Kamuni, Navin, Jindal, Mayank, Soni, Arpita, Mallreddy, Sukender Reddy, Macha, Sharath Chandra

arXiv.org Artificial Intelligence

For Music Information Retrieval (MIR) downstream tasks, the most common audio representation is time-frequency based, such as Mel spectrograms. This study explores the potential of a new form of audio representation for one of the most common MIR downstream tasks: musical genre identification. To this end, a novel audio representation was created by discretely encoding music with deep vector quantization, as used in the generative music model Jukebox. The effectiveness of Jukebox's audio representation is compared to Mel spectrograms using a dataset that is nearly equivalent to the state of the art (SOTA) and a nearly identical transformer design. The results of this study imply that, at least when the transformers are pretrained on a modest dataset of 20k tracks, Jukebox's audio representation is not superior to Mel spectrograms. This may be because Jukebox's audio representation does not sufficiently account for the peculiarities of human auditory perception, whereas Mel spectrograms are created specifically with the human auditory sense in mind.
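The perceptual motivation behind Mel spectrograms can be seen in the mel scale itself, which compresses high frequencies the way human pitch perception does. A small illustration using the standard HTK formula:

```python
import numpy as np

def hz_to_mel(f):
    # Standard (HTK) mel scale: roughly linear below 1 kHz, logarithmic
    # above, mirroring how human pitch perception compresses highs.
    return 2595.0 * np.log10(1.0 + f / 700.0)

# By construction, 1000 Hz maps to roughly 1000 mel, and an equal
# 1 kHz step in Hz shrinks on the mel axis as frequency rises.
for f in (1000, 2000, 7000, 8000):
    print(f, round(float(hz_to_mel(f)), 1))
```

A Mel spectrogram pools an FFT's linear frequency bins into bands spaced evenly on this scale, which is exactly the perceptual prior that a learned discrete representation like Jukebox's lacks.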


Jukebox

#artificialintelligence

This has led to impressive results like producing Bach chorales and polyphonic music with multiple instruments, as well as minute-long musical pieces. But symbolic generators have limitations: they cannot capture human voices or many of the subtler timbres, dynamics, and expressivity that are essential to music. A different approach is to model music directly as raw audio. For comparison, GPT-2 had 1,000 timesteps and OpenAI Five took tens of thousands of timesteps per game. Thus, to learn the high-level semantics of music, a model would have to deal with extremely long-range dependencies.
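The scale gap mentioned above is easy to make concrete: a few minutes of CD-quality raw audio already contains millions of timesteps, orders of magnitude more than GPT-2's text context.

```python
# How many timesteps a model must relate when working on raw audio.
sample_rate = 44_100          # CD-quality samples per second
seconds = 4 * 60              # a typical ~4-minute song
timesteps = sample_rate * seconds

# Roughly 10^7 steps for one song, versus ~10^3 for GPT-2's context
# and tens of thousands per game for OpenAI Five.
print(timesteps)
```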


Unleashing the Power of AI in Music: A Deep Dive into Jukebox by OpenAI

#artificialintelligence

Jukebox, an innovative AI system created by OpenAI, leverages the power of deep learning to generate music, complete with lyrics and vocals, in a variety of genres and styles. By training on a dataset of 1.2 million songs, Jukebox showcases an unparalleled level of sophistication in music generation, pushing the boundaries of what AI can achieve in the creative arts. At the core of Jukebox lies a neural network architecture known as a Vector Quantized Variational Autoencoder (VQ-VAE). The VQ-VAE's role is to encode and decode the complex musical information found within the training dataset. This encoding-decoding process enables Jukebox to generate novel and diverse musical compositions by sampling from the latent space, a mathematical representation of the underlying structure of the dataset.
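The quantization step at the heart of a VQ-VAE can be sketched in a few lines: each continuous encoder output is snapped to its nearest codebook entry, yielding discrete tokens. The sizes below are illustrative only, not Jukebox's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((512, 64))   # 512 codes, 64-d each
encoded = rng.standard_normal((10, 64))     # 10 encoder output frames

# Squared distance from every frame to every code, then argmin per frame.
d2 = ((encoded[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
tokens = d2.argmin(axis=1)                  # discrete token ids
quantized = codebook[tokens]                # what the decoder receives
print(tokens.shape, quantized.shape)
```

Sampling from the model then amounts to generating sequences of these token ids and decoding them back to audio.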


Melody transcription via generative pre-training

Donahue, Chris, Thickstun, John, Liang, Percy

arXiv.org Artificial Intelligence

Despite the central role that melody plays in music perception, it remains an open challenge in music information retrieval to reliably detect the notes of the melody present in an arbitrary music recording. A key challenge in melody transcription is building methods which can handle broad audio containing any number of instrument ensembles and musical styles - existing strategies work well for some melody instruments or styles but not all. To confront this challenge, we leverage representations from Jukebox (Dhariwal et al. 2020), a generative model of broad music audio, thereby improving performance on melody transcription by 20% relative to conventional spectrogram features. Another obstacle in melody transcription is a lack of training data - we derive a new dataset containing 50 hours of melody transcriptions from crowdsourced annotations of broad music. The combination of generative pre-training and a new dataset for this task results in 77% stronger performance on melody transcription relative to the strongest available baseline. By pairing our new melody transcription approach with solutions for beat detection, key estimation, and chord recognition, we build Sheet Sage, a system capable of transcribing human-readable lead sheets directly from music audio. Audio examples can be found at https://chrisdonahue.com/sheetsage and code at https://github.com/chrisdonahue/sheetsage.
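The lead-sheet system described combines several components. A hypothetical sketch of how such a pipeline might be composed (the function names and return shapes are placeholders, not the actual sheetsage API):

```python
# Placeholder composition of a lead-sheet transcription pipeline:
# melody transcription plus beat, key, and chord analysis.
def transcribe_lead_sheet(audio, melody_model, beat_fn, key_fn, chord_fn):
    beats = beat_fn(audio)            # beat positions in seconds
    key = key_fn(audio)               # estimated key
    chords = chord_fn(audio, beats)   # one chord label per beat
    notes = melody_model(audio)       # e.g. a Jukebox-feature melody model
    return {"key": key, "beats": beats, "chords": chords, "melody": notes}

# Demo with stub components standing in for real models.
demo = transcribe_lead_sheet(
    "song.wav",
    melody_model=lambda a: ["E4", "D4", "C4"],
    beat_fn=lambda a: [0.0, 0.5, 1.0],
    key_fn=lambda a: "C major",
    chord_fn=lambda a, beats: ["C", "G", "C"],
)
print(demo["key"])  # C major
```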


AI music generators could be a boon for artists -- but also problematic

#artificialintelligence

It was only five years ago that electronic punk band YACHT entered the recording studio with a daunting task: They would train an AI on 14 years of their music, then synthesize the results into the album "Chain Tripping." "I'm not interested in being a reactionary," YACHT member and tech writer Claire L. Evans said in a documentary about the album. "I don't want to return to my roots and play acoustic guitar because I'm so freaked out about the coming robot apocalypse, but I also don't want to jump into the trenches and welcome our new robot overlords either." But our new robot overlords are making a whole lot of progress in the space of AI music generation. Even though the Grammy-nominated "Chain Tripping" was released in 2019, the technology behind it is already becoming outdated.


Google's new AI can hear a snippet of song--and then keep on playing

#artificialintelligence

AI-generated audio is commonplace: voices on home assistants like Alexa use natural language processing. AI music systems like OpenAI's Jukebox have already generated impressive results, but most existing techniques need people to prepare transcriptions and label text-based training data, which takes a lot of time and human labor. Jukebox, for example, uses text-based data to generate song lyrics. AudioLM, described in a non-peer-reviewed paper last month, is different: it doesn't require transcription or labeling. Instead, sound databases are fed into the program, and machine learning is used to compress the audio files into sound snippets, called "tokens," without losing too much information.
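The idea of compressing audio into discrete "tokens" can be illustrated with a much simpler scheme than AudioLM's learned tokenizer (which this excerpt does not detail): classic 8-bit mu-law quantization, as used in older raw-audio models like WaveNet.

```python
import numpy as np

# Toy audio tokenization: map samples in [-1, 1] to 256 discrete ids
# via mu-law companding. AudioLM instead learns its tokenizer from data.
def mu_law_tokens(x, mu=255):
    y = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    return ((y + 1) / 2 * mu).astype(np.int64)  # ids in [0, 255]

audio = np.sin(np.linspace(0, 2 * np.pi, 100))  # 100 samples in [-1, 1]
tokens = mu_law_tokens(audio)
print(tokens.min(), tokens.max())
```

Once audio is a token sequence, a language-model-style network can be trained to continue it, which is the core of the "hear a snippet, keep playing" behavior.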


Transfer Learning with Jukebox for Music Source Separation

Amri, W. Zai El, Tautz, O., Ritter, H., Melnik, A.

arXiv.org Artificial Intelligence

In this work, we demonstrate how a publicly available, pre-trained Jukebox model can be adapted for the problem of audio source separation from a single mixed audio channel. Our neural network architecture, which uses transfer learning, is quick to train, and the results demonstrate performance comparable to other state-of-the-art approaches that require far more compute resources, training data, and time. We provide an open-source code implementation of our architecture (https://github.com/wzaielamri/unmix).
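The transfer-learning recipe behind this kind of result can be sketched abstractly: treat the pretrained model's activations as fixed features and fit only a small head on top. The shapes and the closed-form linear head below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(1)
features = rng.standard_normal((200, 64))   # frozen pretrained activations
targets = rng.standard_normal((200, 2))     # e.g. a separation target

# Train only the head; the pretrained weights that produced `features`
# are never updated. Here the head is fit in closed form by least squares.
head, *_ = np.linalg.lstsq(features, targets, rcond=None)
pred = features @ head
print(head.shape)  # (64, 2)
```

Because only the small head is optimized, training is fast and needs far less data than learning the full representation from scratch.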


Do We Rage Against the AI Machine?

#artificialintelligence

The Industrial Revolution was a time of great change. With the steam engine, industries shifted away from skilled human labour towards mechanisation and machinery. As a result, many specialised workers lost their jobs and were forced to adapt to their new reality. The Luddites, a radical organisation of textile workers made redundant by textile machines, retaliated by destroying these machines and attacking business owners. The Luddites gained public sympathy, as many feared that they, like the retrenched textile workers, would lose their jobs to automated machinery.