Gutowski, Nicolas
A Transformer Model for Predicting Chemical Reaction Products from Generic Templates
Ozer, Derin, Lamprier, Sylvain, Cauchy, Thomas, Gutowski, Nicolas, Da Mota, Benoit
The accurate prediction of chemical reaction outcomes is a major challenge in computational chemistry. Current models rely heavily on either highly specific reaction templates or template-free methods, both of which present limitations. To address these limitations, this work proposes the Broad Reaction Set (BRS), a dataset featuring 20 generic reaction templates that allow for efficient exploration of the chemical space. Additionally, ProPreT5 is introduced, a T5 model tailored to chemistry that strikes a balance between rigid templates and template-free methods. ProPreT5 demonstrates its capability to generate accurate, valid, and realistic reaction products, making it a promising solution that goes beyond the current state of the art on the complex task of reaction product prediction.
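As a hedged illustration of the inference pattern such a model implies, the sketch below runs a T5-style sequence-to-sequence model over SMILES strings with the Hugging Face transformers API; the checkpoint path and the exact input format (reactants plus a generic template) are assumptions for illustration, not the paper's released artifacts.

    # Sketch: predicting a reaction product with a T5-style seq2seq model
    # over SMILES strings. The checkpoint path is hypothetical; this only
    # illustrates the general inference pattern, not ProPreT5 itself.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    model_name = "path/to/propret5-checkpoint"  # hypothetical checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Input: reactant SMILES (the paper additionally conditions on a
    # generic reaction template; the exact format is an assumption here).
    reactants = "CC(=O)O.OCC"  # acetic acid + ethanol (esterification)
    inputs = tokenizer(reactants, return_tensors="pt")

    # Decode the product SMILES autoregressively with beam search.
    output_ids = model.generate(**inputs, max_new_tokens=128, num_beams=5)
    product = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print(product)  # e.g. an ester SMILES such as "CCOC(C)=O"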
Byte Pair Encoding for Symbolic Music
Fradet, Nathan, Gutowski, Nicolas, Chhel, Fabien, Briot, Jean-Pierre
When used with deep learning, the symbolic music modality is often coupled with language model architectures. To do so, the music needs to be tokenized, i.e., converted into a sequence of discrete tokens. This can be achieved in different ways, as music can be composed of simultaneous tracks and of simultaneous notes with several attributes. Until now, the proposed tokenizations have relied on small vocabularies of tokens describing note attributes and time events, resulting in fairly long token sequences and a sub-optimal use of the embedding space of language models. Recent research has focused on reducing the overall sequence length by merging embeddings or combining tokens. In this paper, we show that Byte Pair Encoding (BPE), a compression technique widely used for natural language, significantly decreases the sequence length while increasing the vocabulary size. By doing so, we leverage the embedding capabilities of such models with more expressive tokens, resulting in both better results and faster inference in generation and classification tasks. The source code is shared on GitHub, along with a companion website. Finally, BPE is directly implemented in MidiTok, allowing the reader to easily benefit from this method.
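To make the mechanism concrete, here is a minimal, self-contained sketch of BPE applied to symbolic-music-like token sequences: the most frequent adjacent token pair is repeatedly merged into a new token, shortening sequences while growing the vocabulary. Token names are schematic, and this is a didactic sketch rather than MidiTok's actual implementation.

    # Didactic BPE: merge the most frequent adjacent pair of tokens into
    # a new token, repeat. Sequences shrink, the vocabulary grows.
    from collections import Counter

    def merge_pair(seq, pair, new_token):
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(new_token)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        return out

    def bpe_train(sequences, num_merges):
        merges = []
        for _ in range(num_merges):
            pairs = Counter()
            for seq in sequences:
                pairs.update(zip(seq, seq[1:]))  # count adjacent pairs
            if not pairs:
                break
            best = pairs.most_common(1)[0][0]    # most frequent pair
            merges.append(best)
            new_token = best[0] + "+" + best[1]
            sequences = [merge_pair(s, best, new_token) for s in sequences]
        return merges, sequences

    corpus = [["Pitch_60", "Velocity_96", "Duration_0.5",
               "Pitch_64", "Velocity_96", "Duration_0.5"]]
    merges, compressed = bpe_train(corpus, num_merges=3)
    print(compressed[0])  # fewer, more expressive tokens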
miditok: A Python package for MIDI file tokenization
Fradet, Nathan, Briot, Jean-Pierre, Chhel, Fabien, Seghrouchni, Amal El Fallah, Gutowski, Nicolas
Recent progress in natural language processing has been adapted to the symbolic music modality. Language models, such as Transformers, have been used with symbolic music for a variety of tasks, including music generation, modeling, and transcription, with state-of-the-art performance. These models are beginning to be used in production systems. To encode and decode music for the backbone model, they need to rely on tokenizers, whose role is to serialize music into sequences of distinct elements called tokens. MidiTok is an open-source library for tokenizing symbolic music with great flexibility and extended features. It implements the most popular music tokenizations under a unified API. It is designed to be easy to use and extensible for everyone.
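A short usage sketch, assuming a recent (v3-style) MidiTok API; exact configuration fields and the MIDI-loading backend vary across versions, and the file path is a placeholder.

    # Usage sketch for MidiTok (v3-style API; details vary by version).
    from miditok import REMI, TokenizerConfig
    from symusic import Score  # MIDI backend used by recent versions

    # Create a REMI tokenizer with a partly customized configuration.
    config = TokenizerConfig(num_velocities=32, use_chords=True)
    tokenizer = REMI(config)

    # Load a MIDI file, convert it to tokens, then decode it back.
    score = Score("path/to/file.mid")  # placeholder path
    tokens = tokenizer(score)          # tokenize
    decoded = tokenizer(tokens)        # detokenize back to a Score

Other tokenizations provided by the library follow the same unified API, so swapping the tokenization only changes the class being instantiated.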
Impact of time and note duration tokenizations on deep learning symbolic music modeling
Fradet, Nathan, Gutowski, Nicolas, Chhel, Fabien, Briot, Jean-Pierre
Symbolic music is widely used in various deep learning tasks, including generation, transcription, synthesis, and Music Information Retrieval (MIR). It is mostly employed with discrete models like Transformers, which require music to be tokenized, i.e., formatted into sequences of distinct elements called tokens. Tokenization can be performed in different ways. As Transformers can struggle with reasoning but more easily capture explicit information, it is important to study how the way information is represented for such models impacts their performance. In this work, we analyze the common tokenization methods and experiment with time and note duration representations. We compare the performance of these two impactful criteria on several tasks, including composer and emotion classification, music generation, and sequence representation learning. We demonstrate that explicit information leads to better results, depending on the task.
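For illustration, the schematic token sequences below express the same note under the two representation axes studied: explicit Duration tokens versus NoteOn/NoteOff events, and Position versus TimeShift time tokens. The token names are generic placeholders, not those of one specific tokenizer.

    # The same quarter note (pitch 60, one beat long, starting on beat 2)
    # under two schematic representations. With explicit durations, note
    # length is a single token; with NoteOff events, the model must pair
    # on/off events across intervening time tokens.
    note_with_duration = [
        "Position_2", "Pitch_60", "Velocity_96", "Duration_1.0",
    ]
    note_with_noteoff = [
        "TimeShift_2.0", "NoteOn_60", "Velocity_96",
        "TimeShift_1.0", "NoteOff_60",
    ]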
Partial Bandit and Semi-Bandit: Making the Most Out of Scarce Users' Feedback
Letard, Alexandre, Amghar, Tassadit, Camp, Olivier, Gutowski, Nicolas
Machine learning-based recommendation systems are commonly used in various fields of activity [1]. Among the methods used for recommendation, those relying on Multi-Armed Bandit (MAB) approaches obtain interesting results in terms of global accuracy [1, 2]. This is more specifically the case with Combinatorial Multi-Armed Bandits (COM-MAB) [3]. From an industrial perspective, some fields of activity, such as sailing and yachting [4], are initiating a digital transformation in order to provide such intelligent recommendations to their customers. Issues relating to both smart homes and smart vehicles can be encountered in the field of mobile housing, to which seafaring belongs [4]. Indeed, there are multiple ways of using a recreational vehicle such as a boat: as a main or secondary residence, as a means of transport, or even as a way of pushing one's limits. These scenarios depend on each user and context of use. In recent years, several works have been carried out to promote the digitization of boating [5, 6]. However, those studies essentially deal with navigation automation, usually disregarding the other facets of such recreational vehicles.
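As a generic illustration of combinatorial semi-bandit feedback with scarce user responses (a plain epsilon-greedy sketch, not the paper's specific partial-bandit algorithm), the code below recommends k items at once and updates only the arms the user actually rated.

    # Epsilon-greedy combinatorial semi-bandit sketch: recommend k arms,
    # update only the arms that received feedback (feedback is scarce).
    import random

    class EpsilonGreedySemiBandit:
        def __init__(self, n_arms, k, epsilon=0.1):
            self.k, self.epsilon = k, epsilon
            self.counts = [0] * n_arms
            self.values = [0.0] * n_arms  # running mean reward per arm

        def select(self):
            arms = list(range(len(self.counts)))
            if random.random() < self.epsilon:
                return random.sample(arms, self.k)  # explore
            arms.sort(key=lambda a: self.values[a], reverse=True)
            return arms[: self.k]  # exploit the top-k estimated arms

        def update(self, feedback):
            # feedback: {arm: reward} for the subset of recommended arms
            # the user actually rated; unrated arms stay untouched.
            for arm, reward in feedback.items():
                self.counts[arm] += 1
                self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

    bandit = EpsilonGreedySemiBandit(n_arms=10, k=3)
    recommended = bandit.select()
    bandit.update({recommended[0]: 1.0})  # only one item got feedback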