AITopics | long-range correlation

The challenge of realistic music generation: modelling raw audio at scale

Neural Information Processing Systems | Mar-16-2026, 19:55:25 GMT

Realistic music generation is a challenging task. When building generative models of music that are learnt from data, typically high-level representations such as scores or MIDI are used that abstract away the idiosyncrasies of a particular performance. But these nuances are very important for our perception of musicality and realism, so in this work we embark on modelling music in the raw audio domain. It has been shown that autoregressive models excel at generating raw audio waveforms of speech, but when applied to music, we find them biased towards capturing local signal structure at the expense of modelling long-range correlations. This is problematic because music exhibits structure at many different timescales. In this work, we explore autoregressive discrete autoencoders (ADAs) as a means to enable autoregressive models to capture long-range correlations in waveforms. We find that they allow us to unconditionally generate piano music directly in the raw audio domain, which shows stylistic consistency across tens of seconds.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

The challenge of realistic music generation: modelling raw audio at scale

Neural Information Processing Systems | Nov-20-2025, 22:06:20 GMT

Realistic music generation is a challenging task. When building generative models of music that are learnt from data, typically high-level representations such as scores or MIDI are used that abstract away the idiosyncrasies of a particular performance. But these nuances are very important for our perception of musicality and realism, so in this work we embark on modelling music in the raw audio domain. It has been shown that autoregressive models excel at generating raw audio waveforms of speech, but when applied to music, we find them biased towards capturing local signal structure at the expense of modelling long-range correlations. This is problematic because music exhibits structure at many different timescales. In this work, we explore autoregressive discrete autoencoders (ADAs) as a means to enable autoregressive models to capture long-range correlations in waveforms. We find that they allow us to unconditionally generate piano music directly in the raw audio domain, which shows stylistic consistency across tens of seconds.

name change, raw audio, realistic music generation, (6 more...)

Neural Information Processing Systems

Industry:

Media > Music (0.67)
Leisure & Entertainment (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

The machine learning victories at the 2024 Nobel Prize Awards and how to explain them

AIHub | Nov-8-2024, 11:10:07 GMT

Anna Demming reports on what the prizes were awarded for and how finding connections between the two approaches to machine learning may help towards explaining how "black box" algorithms reach their conclusions. Few saw it coming when on 8th October 2024 the Nobel Committee awarded the 2024 Nobel Prize for Physics to John Hopfield for his Hopfield networks and Geoffrey Hinton for his Boltzmann machines as seminal developments towards machine learning that have statistical physics at the heart of them. The next day, machine learning, albeit using a different architecture, bagged half of the Nobel Prize for Chemistry as well, with the award going to Demis Hassabis and John Jumper for the development of an algorithm that predicts protein folding conformations. The other half of the Chemistry Nobel was awarded to David Baker for successfully building new proteins. While the AI takeover at this year's Nobel announcements for Physics and Chemistry came as a surprise to most, there has been some keen interest in how these apparently different approaches to machine learning might actually reduce to the same thing, revealing new ways of extracting some fundamental explainability from the generative AI algorithms that have so far been considered effectively "black boxes".

artificial intelligence, hopfield network, machine learning, (15 more...)

AIHub

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Europe > United Kingdom > England > Bristol (0.05)
Europe > Norway (0.05)
Europe > Austria (0.05)

Genre: Personal > Honors (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

A Cosmic-Scale Benchmark for Symmetry-Preserving Data Processing

Balla, Julia; Mishra-Sharma, Siddharth; Cuesta-Lazaro, Carolina; Jaakkola, Tommi; Smidt, Tess

arXiv.org Artificial Intelligence | Oct-27-2024

Efficiently processing structured point cloud data while preserving multiscale information is a key challenge across domains, from graphics to atomistic modeling. Using a curated dataset of simulated galaxy positions and properties, represented as point clouds, we benchmark the ability of graph neural networks to simultaneously capture local clustering environments and long-range correlations. Given the homogeneous and isotropic nature of the Universe, the data exhibits a high degree of symmetry. We therefore focus on evaluating the performance of Euclidean symmetry-preserving ($E(3)$-equivariant) graph neural networks, showing that they can outperform non-equivariant counterparts and domain-specific information extraction techniques in downstream performance as well as simulation-efficiency. However, we find that current architectures fail to capture information from long-range correlations as effectively as domain-specific baselines, motivating future work on architectures better suited for extracting long-range information.

artificial intelligence, information, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2410.20516

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (0.68)
Information Technology (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Assessing the importance of long-range correlations for deep-learning-based sleep staging

Wang, Tiezhi; Strodthoff, Nils

arXiv.org Artificial Intelligence | Feb-22-2024

This study aims to elucidate the significance of long-range correlations for deep-learning-based sleep staging. It is centered around S4Sleep(TS), a recently proposed model for automated sleep staging. This model utilizes electroencephalography (EEG) as raw time series input and relies on structured state space sequence (S4) models as an essential model component. Although the model already surpasses state-of-the-art methods for a moderate number of 15 input epochs, recent literature results suggest potential benefits from incorporating very long correlations spanning hundreds of input epochs. In this submission, we explore the possibility of achieving further enhancements by systematically scaling up the model's input size, anticipating potential improvements in prediction accuracy. In contrast to findings in the literature, our results demonstrate that augmenting the input size does not yield a significant enhancement in the performance of S4Sleep(TS). These findings, coupled with the distinctive ability of S4 models to capture long-range dependencies in time series data, cast doubt on the diagnostic relevance of very long-range interactions for sleep staging.

s4sleep, sleep staging, staging, (16 more...)

arXiv.org Artificial Intelligence

2402.17779

Country:

Europe > Germany > Lower Saxony > Gottingen (0.05)
North America > United States > Illinois > Cook County > Westchester (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Dynamic Texture Synthesis by Incorporating Long-range Spatial and Temporal Correlations

Zhang, Kaitai; Wang, Bin; Chen, Hong-Shuo; Wang, Ye; Mou, Shiyu; Kuo, C.-C. Jay

arXiv.org Artificial Intelligence | Apr-14-2021

The main challenge of dynamic texture synthesis lies in how to maintain spatial and temporal consistency in synthesized videos. The major drawback of existing dynamic texture synthesis models comes from poor treatment of the long-range texture correlation and motion information. To address this problem, we incorporate a new loss term, called the Shifted Gram loss, to capture the structural and long-range correlation of the reference texture video. Furthermore, we introduce a frame sampling strategy to exploit long-period motion across multiple frames. With these two new techniques, the application scope of existing texture synthesis models can be extended. That is, they can synthesize not only homogeneous but also structured dynamic texture patterns. Thorough experimental results are provided to demonstrate that our proposed dynamic texture synthesis model offers state-of-the-art visual performance.

correlation, dynamic texture, texture, (14 more...)

arXiv.org Artificial Intelligence

2104.0594

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing (0.68)

The challenge of realistic music generation: modelling raw audio at scale

Dieleman, Sander; Oord, Aaron van den; Simonyan, Karen

Neural Information Processing Systems | Feb-14-2020, 19:58:28 GMT

Realistic music generation is a challenging task. When building generative models of music that are learnt from data, typically high-level representations such as scores or MIDI are used that abstract away the idiosyncrasies of a particular performance. But these nuances are very important for our perception of musicality and realism, so in this work we embark on modelling music in the raw audio domain. It has been shown that autoregressive models excel at generating raw audio waveforms of speech, but when applied to music, we find them biased towards capturing local signal structure at the expense of modelling long-range correlations. This is problematic because music exhibits structure at many different timescales. In this work, we explore autoregressive discrete autoencoders (ADAs) as a means to enable autoregressive models to capture long-range correlations in waveforms.

long-range correlation, raw audio, realistic music generation, (3 more...)

Neural Information Processing Systems

Industry:

Media > Music (0.65)
Leisure & Entertainment (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

Why the Brain Is So Noisy - Issue 68: Context

Nautilus | Jan-18-2019, 02:51:43 GMT

One of the core challenges of modern AI can be demonstrated with a rotating yellow school bus. When viewed head-on on a country road, a deep-learning neural network confidently and correctly identifies the bus. When it is laid on its side across the road, though, the algorithm believes--again, with high confidence--that it's a snowplow. Seen from underneath and at an angle, it is definitely a garbage truck. The problem is one of context. When a new image is sufficiently different from the set of training images, deep learning visual recognition stumbles, even if the difference comes down to a simple rotation or obstruction.

artificial intelligence, brain, machine learning, (15 more...)

Nautilus

Industry:

Transportation > Ground > Road (0.54)
Health & Medicine > Therapeutic Area > Neurology (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)
