Galtier, Mathieu
Industry-Scale Orchestrated Federated Learning for Drug Discovery
Oldenhof, Martijn, Ács, Gergely, Pejó, Balázs, Schuffenhauer, Ansgar, Holway, Nicholas, Sturm, Noé, Dieckmann, Arne, Fortmeier, Oliver, Boniface, Eric, Mayer, Clément, Gohier, Arnaud, Schmidtke, Peter, Niwayama, Ritsuya, Kopecky, Dieter, Mervin, Lewis, Rathi, Prakash Chandra, Friedrich, Lukas, Formanek, András, Antal, Peter, Rahaman, Jordon, Zalewski, Adam, Heyndrickx, Wouter, Oluoch, Ezron, Stößel, Manuel, Vančo, Michal, Endico, David, Gelus, Fabien, de Boisfossé, Thaïs, Darbier, Adrien, Nicollet, Ashley, Blottière, Matthieu, Telenczuk, Maria, Nguyen, Van Tien, Martinez, Thibaud, Boillet, Camille, Moutet, Kelvin, Picosson, Alexandre, Gasser, Aurélien, Djafar, Inal, Simon, Antoine, Arany, Ádám, Simm, Jaak, Moreau, Yves, Engkvist, Ola, Ceulemans, Hugo, Marini, Camille, Galtier, Mathieu
To apply federated learning to drug discovery we developed a novel platform in the context of the European Innovative Medicines Initiative (IMI) project MELLODDY (grant n°831472), which comprised 10 pharmaceutical companies, academic research labs, large industrial companies and startups. The MELLODDY platform was the first industry-scale platform to enable the creation of a global federated model for drug discovery without sharing the confidential data sets of the individual partners. The federated model was trained on the platform by aggregating the gradients of all contributing partners in a cryptographically secure way following each training iteration. The platform was deployed on an Amazon Web Services (AWS) multi-account architecture running Kubernetes clusters in private subnets. Organisationally, the roles of the different partners were codified as different rights and permissions on the platform and administered in a decentralized way. The MELLODDY platform generated new scientific discoveries which are described in a companion paper.
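The abstract says that partner gradients are aggregated securely after each training iteration, but it does not name the protocol. As a minimal sketch of one common technique, pairwise additive masking, and not necessarily the scheme used in MELLODDY: each pair of partners shares a random mask that one adds and the other subtracts, so the aggregator only sees masked uploads while the masks cancel in the sum. The function names, the seed, and the use of a single random generator in place of real pairwise key agreement are all assumptions made for illustration.

```python
import numpy as np


def pairwise_masks(n_partners, shape, seed=0):
    """Return one mask per partner; the masks sum to zero across partners.

    Assumption: a single seeded generator stands in for the secrets that
    each pair of partners would agree on privately in a real protocol.
    """
    rng = np.random.default_rng(seed)
    masks = [np.zeros(shape) for _ in range(n_partners)]
    for i in range(n_partners):
        for j in range(i + 1, n_partners):
            shared = rng.normal(size=shape)  # secret known only to i and j
            masks[i] += shared
            masks[j] -= shared
    return masks


def secure_mean(gradients, seed=0):
    """Average gradients from masked uploads only."""
    masks = pairwise_masks(len(gradients), gradients[0].shape, seed)
    # Each partner uploads gradient + mask; an individual upload looks random.
    masked = [g + m for g, m in zip(gradients, masks)]
    # The pairwise masks cancel in the sum, leaving the true aggregate.
    return sum(masked) / len(masked)


# Hypothetical example: 10 partners, one 4-dimensional gradient each.
rng = np.random.default_rng(42)
grads = [rng.normal(size=4) for _ in range(10)]
aggregated = secure_mean(grads)
plain_mean = np.mean(grads, axis=0)
```

Here `aggregated` matches `plain_mean` up to floating-point error, even though no unmasked gradient is ever summed directly. A production protocol would also need to handle partners dropping out mid-round, which this sketch ignores.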
A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series
Chambon, Stanislas, Galtier, Mathieu, Arnal, Pierrick, Wainrib, Gilles, Gramfort, Alexandre
Sleep stage classification constitutes an important preliminary exam in the diagnosis of sleep disorders. It is traditionally performed by a sleep expert who assigns a sleep stage to each 30 s of signal, based on the visual inspection of signals such as electroencephalograms (EEG), electrooculograms (EOG), electrocardiograms (ECG) and electromyograms (EMG). We introduce here the first deep learning approach for sleep stage classification that learns end-to-end without computing spectrograms or extracting hand-crafted features, that exploits all multivariate and multimodal polysomnography (PSG) signals (EEG, EMG and EOG), and that can exploit the temporal context of each 30 s window of data. For each modality the first layer learns linear spatial filters that exploit the array of sensors to increase the signal-to-noise ratio, and the last layer feeds the learnt representation to a softmax classifier. Our model is compared to alternative automatic approaches based on convolutional networks or decision trees. Results obtained on 61 publicly available PSG records with up to 20 EEG channels demonstrate that our network architecture yields state-of-the-art performance. Our study reveals a number of insights on the spatio-temporal distribution of the signal of interest: a good trade-off for optimal classification performance measured with balanced accuracy is to use 6 EEG channels with 2 EOG channels (left and right) and 3 EMG chin channels. Also, exploiting one minute of data before and after each data segment offers the strongest improvement when a limited number of channels is available. Like sleep experts, our system exploits the multivariate and multimodal nature of PSG signals in order to deliver state-of-the-art classification performance with a small computational cost.
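The data flow described in the abstract (a linear spatial filter over the sensor array, a learnt representation, temporal context, then a softmax classifier) can be sketched in a few lines of NumPy. This is an illustration under stated assumptions rather than the authors' network: the weights are random instead of learnt, a log-variance feature stands in for the convolutional feature extractor, the window length is shortened for brevity, and five output classes are assumed to match the usual five-stage scoring (W, N1, N2, N3, REM). With 30 s windows, one minute of context on each side corresponds to `k = 2` neighbouring windows.

```python
import numpy as np


def spatial_filter(epoch, weights):
    """Linear spatial filter: mix sensors into virtual channels.

    epoch: (n_channels, n_times), weights: (n_virtual, n_channels).
    """
    return weights @ epoch


def add_context(features, k):
    """Concatenate each window's features with its k neighbours on each side.

    Edges are padded by repeating the first or last window (an assumption).
    """
    n = features.shape[0]
    offsets = np.arange(-k, k + 1)
    idx = np.clip(np.arange(n)[:, None] + offsets[None, :], 0, n - 1)
    return features[idx].reshape(n, -1)


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


# Hypothetical sizes: 8 windows, 6 EEG channels, 300 samples per window.
rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 8, 6, 300
n_virtual, n_stages, k = 6, 5, 2

epochs = rng.normal(size=(n_epochs, n_channels, n_times))
w_spatial = rng.normal(size=(n_virtual, n_channels))  # learnt in practice

filtered = np.stack([spatial_filter(e, w_spatial) for e in epochs])
features = np.log(filtered.var(axis=-1))  # placeholder feature extractor
contextual = add_context(features, k)     # shape (8, 6 * 5)

w_out = rng.normal(size=(contextual.shape[1], n_stages))  # learnt in practice
stage_probs = softmax(contextual @ w_out)                 # shape (8, 5)
```

Each row of `stage_probs` is a probability distribution over the assumed five stages for one window. In a multimodal setting, one such spatial filter per modality (EEG, EOG, EMG) would be applied before the representations are combined.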
Morpheo: Traceable Machine Learning on Hidden data
Galtier, Mathieu, Marini, Camille
Morpheo is a transparent and secure machine learning platform for collecting and analysing large datasets. It aims at building state-of-the-art prediction models in various fields where data are sensitive. It offers strong privacy of data and algorithms by preventing anyone from reading the data, apart from the owner and the chosen algorithms. Computations in Morpheo are orchestrated by a blockchain infrastructure, thus offering total traceability of operations. Morpheo aims at building an attractive economic ecosystem around data prediction by channelling crypto-money from prediction requests to providers of useful data and algorithms. Morpheo is designed to handle multiple data sources in a transfer learning approach in order to pool knowledge acquired from large datasets for applications with smaller but similar datasets.
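The traceability claim rests on a general property of blockchain-style ledgers: each record embeds the hash of the previous one, so any later change to a recorded operation is detectable. The sketch below illustrates that property with a single-process, hash-chained log. It is not Morpheo's implementation; the record fields, function names, and example operations are hypothetical, and consensus between multiple parties is left out entirely.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first record


def record_hash(body):
    """SHA-256 of a record body, serialized deterministically."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_operation(ledger, operation):
    """Append an operation, chaining it to the previous record's hash."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    body = {"index": len(ledger), "operation": operation, "prev_hash": prev_hash}
    ledger.append({**body, "hash": record_hash(body)})


def verify(ledger):
    """Return True if no record has been altered, removed, or reordered."""
    prev_hash = GENESIS_HASH
    for position, entry in enumerate(ledger):
        body = {
            "index": entry["index"],
            "operation": entry["operation"],
            "prev_hash": entry["prev_hash"],
        }
        if (
            entry["index"] != position
            or entry["prev_hash"] != prev_hash
            or entry["hash"] != record_hash(body)
        ):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical operations on hypothetical dataset and algorithm identifiers.
ledger = []
append_operation(ledger, {"type": "train", "algorithm": "algo-1", "data": "set-A"})
append_operation(ledger, {"type": "predict", "algorithm": "algo-1", "data": "set-B"})
intact = verify(ledger)

ledger[0]["operation"]["data"] = "set-C"  # tamper with a past record
tampered_detected = not verify(ledger)
```

After the edit to the first record, `verify` fails because the stored hash no longer matches the record body, which is what makes the history of computations auditable.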