AITopics | Défossez, Alexandre

Collaborating Authors

Défossez, Alexandre

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Hybrid Spectrogram and Waveform Source Separation

Défossez, Alexandre

arXiv.org Machine LearningNov-5-2021

Source separation models either work on the spectrogram or waveform domain. In this work, we show how to perform end-to-end hybrid source separation, letting the model decide which domain is best suited for each source, and even combining both. The proposed hybrid version of the Demucs architecture won the Music Demixing Challenge 2021 organized by Sony. This architecture also comes with additional improvements, such as compressed residual branches, local attention or singular value regularization. Overall, a 1.4 dB improvement of the Signal-To-Distortion (SDR) was observed across all sources as measured on the MusDB HQ dataset, an improvement confirmed by human subjective evaluation, with an overall quality rated at 2.83 out of 5 (2.36 for the non hybrid Demucs), and absence of contamination at 3.04 (against 2.37 for the non hybrid Demucs and 2.44 for the second ranking model submitted at the competition).

artificial intelligence, convolution, machine learning, (13 more...)

arXiv.org Machine Learning

2111.036

Genre: Research Report (0.40)

Industry:

Media > Music (0.46)
Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Differentiable Model Compression via Pseudo Quantization Noise

Défossez, Alexandre, Adi, Yossi, Synnaeve, Gabriel

arXiv.org Artificial IntelligenceApr-20-2021

We propose to add independent pseudo quantization noise to model parameters during training to approximate the effect of a quantization operator. This method, DiffQ, is differentiable both with respect to the unquantized parameters, and the number of bits used. Given a single hyper-parameter expressing the desired balance between the quantized model size and accuracy, DiffQ can optimize the number of bits used per individual weight or groups of weights, in a single training. We experimentally verify that our method outperforms state-of-the-art quantization techniques on several benchmarks and architectures for image classification, language modeling, and audio source separation. For instance, on the Wikitext-103 language modeling benchmark, DiffQ compresses a 16 layers transformer model by a factor of 8, equivalent to 4 bits precision, while losing only 0.5 points of perplexity. Code is available at: https://github.com/facebookresearch/diffq

deep learning, neural network, quantization, (15 more...)

arXiv.org Artificial Intelligence

2104.09987

Genre: Research Report (0.50)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Vision (0.67)

Add feedback

On the Convergence of Adam and Adagrad

Défossez, Alexandre, Bottou, Léon, Bach, Francis, Usunier, Nicolas

arXiv.org Machine LearningMar-4-2020

We provide a simple proof of the convergence of the optimization algorithms Adam and Adagrad with the assumptions of smooth gradients and almost sure uniform bound on the $\ell_\infty$ norm of the gradients. This work builds on the techniques introduced by Ward et al. (2019) and extends them to the Adam optimizer. We show that in expectation, the squared norm of the objective gradient averaged over the trajectory has an upper-bound which is explicit in the constants of the problem, parameters of the optimizer and the total number of iterations N. This bound can be made arbitrarily small. In particular, Adam with a learning rate $\alpha=1/\sqrt{N}$ and a momentum parameter on squared gradients $\beta_2=1 - 1/N$ achieves the same rate of convergence $O(\ln(N)/\sqrt{N})$ as Adagrad. Thus, it is possible to use Adam as a finite horizon version of Adagrad, much like constant step size SGD can be used instead of its asymptotically converging decaying step size version.

convergence, neural network, optimization problem, (20 more...)

arXiv.org Machine Learning

2003.02395

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Demucs: Deep Extractor for Music Sources with extra unlabeled data remixed

Défossez, Alexandre, Usunier, Nicolas, Bottou, Léon, Bach, Francis

arXiv.org Machine LearningSep-3-2019

We study the problem of source separation for music using deep learning with four known sources: drums, bass, vocals and other accompaniments. State-of-the-art approaches predict soft masks over mixture spectrograms while methods working on the waveform are lagging behind as measured on the standard MusDB benchmark. Our contribution is two fold. (i) We introduce a simple convolutional and recurrent model that outperforms the state-of-the-art model on waveforms, that is, Wave-U-Net, by 1.6 points of SDR (signal to distortion ratio). (ii) We propose a new scheme to leverage unlabeled music. We train a first model to extract parts with at least one source silent in unlabeled tracks, for instance without bass. We remix this extract with a bass line taken from the supervised dataset to form a new weakly supervised training example. Combining our architecture and scheme, we show that waveform methods can play in the same ballpark as spectrogram ones.

deep learning, neural network, source separation, (20 more...)

arXiv.org Machine Learning

1909.01174

Country: North America > United States (0.28)

Genre: Research Report > Promising Solution (0.54)

Industry:

Leisure & Entertainment (1.00)
Media > Music (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SING: Symbol-to-Instrument Neural Generator

Défossez, Alexandre, Zeghidour, Neil, Usunier, Nicolas, Bottou, Léon, Bach, Francis

arXiv.org Machine LearningOct-23-2018

Recent progress in deep learning for audio synthesis opens the way to models that directly produce the waveform, shifting away from the traditional paradigm of relying on vocoders or MIDI synthesizers for speech or music generation. Despite their successes, current state-of-the-art neural audio synthesizers such as WaveNet and SampleRNN suffer from prohibitive training and inference times because they are based on autoregressive models that generate audio samples one at a time at a rate of 16kHz. In this work, we study the more computationally efficient alternative of generating the waveform frame-by-frame with large strides. We present SING, a lightweight neural audio synthesizer for the original task of generating musical notes given desired instrument, pitch and velocity. Our model is trained end-to-end to generate notes from nearly 1000 instruments with a single decoder, thanks to a new loss function that minimizes the distances between the log spectrograms of the generated and target waveforms. On the generalization task of synthesizing notes for pairs of pitch and instrument not seen during training, SING produces audio with significantly improved perceptual quality compared to a state-of-the-art autoencoder based on WaveNet as measured by a Mean Opinion Score (MOS), and is about 32 times faster for training and 2, 500 times faster for inference.

deep learning, neural network, waveform, (19 more...)

arXiv.org Machine Learning

1810.09785

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)

Add feedback

AdaBatch: Efficient Gradient Aggregation Rules for Sequential and Parallel Stochastic Gradient Methods

Défossez, Alexandre, Bach, Francis

arXiv.org Machine LearningNov-6-2017

We study a new aggregation operator for gradients coming from a mini-batch for stochastic gradient (SG) methods that allows a significant speed-up in the case of sparse optimization problems. We call this method AdaBatch and it only requires a few lines of code change compared to regular mini-batch SGD algorithms. We provide a theoretical insight to understand how this new class of algorithms is performing and show that it is equivalent to an implicit per-coordinate rescaling of the gradients, similarly to what Adagrad methods can do. In theory and in practice, this new aggregation allows to keep the same sample efficiency of SG methods while increasing the batch size. Experimentally, we also show that in the case of smooth convex optimization, our procedure can even obtain a better loss when increasing the batch size for a fixed number of samples. We then apply this new algorithm to obtain a parallelizable stochastic gradient method that is synchronous but allows speed-up on par with Hogwild! methods as convergence does not deteriorate with the increase of the batch size. The same approach can be used to make mini-batch provably efficient for variance-reduced SG methods such as SVRG.

adabatch, artificial intelligence, machine learning, (18 more...)

arXiv.org Machine Learning

1711.01761

Country:

Europe > France (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Constant Step Size Least-Mean-Square: Bias-Variance Trade-offs and Optimal Sampling Distributions

Défossez, Alexandre, Bach, Francis

arXiv.org Machine LearningNov-29-2014

We consider the least-squares regression problem and provide a detailed asymptotic analysis of the performance of averaged constant-step-size stochastic gradient descent (a.k.a. least-mean-squares). In the strongly-convex case, we provide an asymptotic expansion up to explicit exponentially decaying terms. Our analysis leads to new insights into stochastic approximation algorithms: (a) it gives a tighter bound on the allowed step-size; (b) the generalization error may be divided into a variance term which is decaying as O(1/n), independently of the step-size $\gamma$, and a bias term that decays as O(1/$\gamma$ 2 n 2); (c) when allowing non-uniform sampling, the choice of a good sampling density depends on whether the variance or bias terms dominate. In particular, when the variance term dominates, optimal sampling densities do not lead to much gain, while when the bias term dominates, we can choose larger step-sizes that leads to significant improvements.

artificial intelligence, optimization problem, variance term, (18 more...)

arXiv.org Machine Learning

1412.0156

Country: Europe > France (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.57)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.36)

Add feedback