Zhou, Tongxue, Ruan, Su, Canu, Stéphane

Multi-modality is widely used in medical imaging, because it can provide multiinformation about a target (tumor, organ or tissue). Segmentation using multimodality consists of fusing multi-information to improve the segmentation. Recently, deep learning-based approaches have presented the state-of-the-art performance in image classification, segmentation, object detection and tracking tasks. Due to their self-learning and generalization ability over large amounts of data, deep learning recently has also gained great interest in multi-modal medical image segmentation. In this paper, we give an overview of deep learning-based approaches for multi-modal medical image segmentation task. Firstly, we introduce the general principle of deep learning and multi-modal medical image segmentation. Secondly, we present different deep learning network architectures, then analyze their fusion strategies and compare their results. The earlier fusion is commonly used, since it's simple and it focuses on the subsequent segmentation network architecture. However, the later fusion gives more attention on fusion strategy to learn the complex relationship between different modalities. In general, compared to the earlier fusion, the later fusion can give more accurate result if the fusion method is effective enough. We also discuss some common problems in medical image segmentation. Finally, we summarize and provide some perspectives on the future research.

Grandvalet, Yves, Rakotomamonjy, Alain, Keshet, Joseph, Canu, Stéphane

We consider the problem of binary classification where the classifier may abstain instead of classifying each observation. The Bayes decision rule for this setup, known as Chow's rule, is defined by two thresholds on posterior probabilities. From simple desiderata, namely the consistency and the sparsity of the classifier, we derive the double hinge loss function that focuses on estimating conditional probabilities only in the vicinity of the threshold points of the optimal decision rule. We show that, for suitable kernel machines, our approach is universally consistent. We cast the problem of minimizing the double hinge loss as a quadratic program akin to the standard SVM optimization problem and propose an active set method to solve it efficiently.

Blin, Rachel, Ainouz, Samia, Canu, Stéphane, Meriaudeau, Fabrice

Road scenes analysis in adverse weather conditions by polarization-encoded images and adapted deep learning Rachel Blin 1, Samia Ainouz 1, St ephane Canu 1 and Fabrice Meriaudeau 2 Abstract -- Object detection in road scenes is necessary to develop both autonomous vehicles and driving assistance systems. Even if deep neural networks for recognition task have shown great performances using conventional images, they fail to detect objects in road scenes in complex acquisition situations. In contrast, polarization images, characterizing the light wave, can robustly describe important physical properties of the object even under poor illumination or strong reflections. This paper shows how non-conventional polarimetric imaging modality overcomes the classical methods for object detection especially in adverse weather conditions. The efficiency of the proposed method is mostly due to the high power of the polarimetry to discriminate any object by its reflective properties and on the use of deep neural networks for object detection. Our goal by this work, is to prove that polarimetry brings a real added value compared with RGB images for object detection. Experimental results on our own dataset composed of road scene images taken during adverse weather conditions show that polarimetry together with deep learning can improve the state-of-the-art by about 20% to 50% on different detection tasks.

Guevara, Jorge, Hirata, Roberto Jr, Canu, Stéphane

This paper introduces the concept of kernels on fuzzy sets as a similarity measure for $[0,1]$-valued functions, a.k.a. \emph{membership functions of fuzzy sets}. We defined the following classes of kernels: the cross product, the intersection, the non-singleton and the distance-based kernels on fuzzy sets. Applicability of those kernels are on machine learning and data science tasks where uncertainty in data has an ontic or epistemistic interpretation.

Debard, Quentin, Dibangoye, Jilles Steeve, Canu, Stéphane, Wolf, Christian

Using touch devices to navigate in virtual 3D environments such as computer assisted design (CAD) models or geographical information systems (GIS) is inherently difficult for humans, as the 3D operations have to be performed by the user on a 2D touch surface. This ill-posed problem is classically solved with a fixed and handcrafted interaction protocol, which must be learned by the user. We propose to automatically learn a new interaction protocol allowing to map a 2D user input to 3D actions in virtual environments using reinforcement learning (RL). A fundamental problem of RL methods is the vast amount of interactions often required, which are difficult to come by when humans are involved. To overcome this limitation, we make use of two collaborative agents. The first agent models the human by learning to perform the 2D finger trajectories. The second agent acts as the interaction protocol, interpreting and translating to 3D operations the 2D finger trajectories from the first agent. We restrict the learned 2D trajectories to be similar to a training set of collected human gestures by first performing state representation learning, prior to reinforcement learning. This state representation learning is addressed by projecting the gestures into a latent space learned by a variational auto encoder (VAE).

Debard, Quentin, Wolf, Christian, Canu, Stéphane, Arné, Julien

We propose a fully automatic method for learning gestures on big touch devices in a potentially multi-user context. The goal is to learn general models capable of adapting to different gestures, user styles and hardware variations (e.g. device sizes, sampling frequencies and regularities). Based on deep neural networks, our method features a novel dynamic sampling and temporal normalization component, transforming variable length gestures into fixed length representations while preserving finger/surface contact transitions, that is, the topology of the signal. This sequential representation is then processed with a convolutional model capable, unlike recurrent networks, of learning hierarchical representations with different levels of abstraction. To demonstrate the interest of the proposed method, we introduce a new touch gestures dataset with 6591 gestures performed by 27 people, which is, up to our knowledge, the first of its kind: a publicly available multi-touch gesture dataset for interaction. We also tested our method on a standard dataset of symbolic touch gesture recognition, the MMG dataset, outperforming the state of the art and reporting close to perfect performance.

Kadri, Hachem, Duflos, Emmanuel, Preux, Philippe, Canu, Stéphane, Rakotomamonjy, Alain, Audiffren, Julien

In this paper we consider the problems of supervised classification and regression in the case where attributes and labels are functions: a data is represented by a set of functions, and the label is also a function. We focus on the use of reproducing kernel Hilbert space theory to learn from such functional data. Basic concepts and properties of kernel-based learning are extended to include the estimation of function-valued functions. In this setting, the representer theorem is restated, a set of rigorously defined infinite-dimensional operator-valued kernels that can be valuably applied when the data are functions is described, and a learning algorithm for nonlinear functional data analysis is introduced. The methodology is illustrated through speech and audio signal processing experiments.

Kadri, Hachem, Preux, Philippe, Duflos, Emmanuel, Canu, Stéphane

In this paper we present a nonparametric method for extending functional regression methodology to the situation where more than one functional covariate is used to predict a functional response. Borrowing the idea from Kadri et al. (2010a), the method, which support mixed discrete and continuous explanatory variables, is based on estimating a function-valued function in reproducing kernel Hilbert spaces by virtue of positive operator-valued kernels.

Grandvalet, Yves, Rakotomamonjy, Alain, Keshet, Joseph, Canu, Stéphane

We consider the problem of binary classification where the classifier may abstain instead of classifying each observation. The Bayes decision rule for this setup, known as Chow's rule, is defined by two thresholds on posterior probabilities. From simple desiderata, namely the consistency and the sparsity of the classifier, we derive the double hinge loss function that focuses on estimating conditional probabilities only in the vicinity of the threshold points of the optimal decision rule. We show that, for suitable kernel machines, our approach is universally consistent. We cast the problem of minimizing the double hinge loss as a quadratic program akin to the standard SVM optimization problem and propose an active set method to solve it efficiently. We finally provide preliminary experimental results illustrating the interest of our constructive approach to devising loss functions.

Grandvalet, Yves, Canu, Stéphane

This paper introduces an algorithm for the automatic relevance determination ofinput variables in kernelized Support Vector Machines. Relevance is measured by scale factors defining the input space metric, and feature selection is performed by assigning zero weights to irrelevant variables. The metric is automatically tuned by the minimization of the standard SVM empirical risk, where scale factors are added to the usual set of parameters defining the classifier. Feature selection is achieved by constraints encouraging the sparsity of scale factors. The resulting algorithm compares favorably to state-of-the-art feature selection procedures anddemonstrates its effectiveness on a demanding facial expression recognition problem.