Regression
A Fused Elastic Net Logistic Regression Model for Multi-Task Binary Classification
Mitov, Venelin, Claassen, Manfred
Multi-task learning has shown to significantly enhance the performance of multiple related learning tasks in a variety of situations. We present the fused logistic regression, a sparse multi-task learning approach for binary classification. Specifically, we introduce sparsity inducing penalties over parameter differences of related logistic regression models to encode similarity across related tasks. The resulting joint learning task is cast into a form that lends itself to be efficiently optimized with a recursive variant of the alternating direction method of multipliers. We show results on synthetic data and describe the regime of settings where our multi-task approach achieves significant improvements over the single task learning approach and discuss the implications on applying the fused logistic regression in different real world settings.
A Comprehensive Approach to Universal Piecewise Nonlinear Regression Based on Trees
Vanli, N. Denizcan, Kozat, Suleyman S.
Abstract--In this paper, we investigate adaptive nonlinear regression and introduce tree based piecewise linear regression algorithms that are highly efficient and provide significantly improved performance with guaranteed upper bounds in an individual sequence manner. We use a tree notion in order to partition the space of regressors in a nested structure. The introduced algorithms adapt not only their regression functions but also the complete tree structure while achieving the performance of the "best" linear mixture of a doubly exponential number of partitions, with a computational complexity only polynomial in the number of nodes of the tree. While constructing these algorithms, we also avoid using any artificial "weighting" of models (with highly data dependent parameters) and, instead, directly minimize the final regression error, which is the ultimate performance goal. The introduced methods are generic such that they can readily incorporate different tree construction methods such as random trees in their framework and can use different regressor or partitioning functions as demonstrated in the paper. ONLINEAR adaptive filtering and regression are extensively investigated in the signal processing [1]-[19] and machine learning literatures [20]-[23], especially for applications where linear modeling [24], [25] is inadequate, hence, does not provide satisfactory results due to the structural constraint on linearity. Although nonlinear approaches can be more powerful than linear methods in modeling, they usually suffer from overfitting, stability and convergence issues [1], [26]-[28], which considerably limit their application to signal processing problems. These issues are especially exacerbated in adaptive filtering due to the presence of feedback, which is even hard to control for linear models [26], [27], [29]. Furthermore, for applications involving big data, which require to process input vectors with considerably large dimensions, nonlinear models are usually avoided due to unmanageable computational complexity increase [30].
A regression model with a hidden logistic process for feature extraction from time series
Chamroukhi, Faicel, Samé, Allou, Govaert, Gérard, Aknin, Patrice
A new approach for feature extraction from time series is proposed in this paper. This approach consists of a specific regression model incorporating a discrete hidden logistic process. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. A piecewise regression algorithm and its iterative variant have also been considered for comparisons. An experimental study using simulated and real data reveals good performances of the proposed approach.
A hidden process regression model for functional data description. Application to curve discrimination
Chamroukhi, Faicel, Samé, Allou, Govaert, Gérard, Aknin, Patrice
A new approach for functional data description is proposed in this paper. It consists of a regression model with a discrete hidden logistic process which is adapted for modeling curves with abrupt or smooth regime changes. The model parameters are estimated in a maximum likelihood framework through a dedicated Expectation Maximization (EM) algorithm. From the proposed generative model, a curve discrimination rule is derived using the Maximum A Posteriori rule. The proposed model is evaluated using simulated curves and real world curves acquired during railway switch operations, by performing comparisons with the piecewise regression approach in terms of curve modeling and classification.
Mixture model-based functional discriminant analysis for curve classification
Chamroukhi, Faicel, Glotin, Hervé
Statistical approaches for Functional Data Analysis concern the paradigm for which the individuals are functions or curves rather than finite dimensional vectors. In this paper, we particularly focus on the modeling and the classification of functional data which are temporal curves presenting regime changes over time. More specifically, we propose a new mixture model-based discriminant analysis approach for functional data using a specific hidden process regression model. Our approach is particularly adapted to both handle the problem of complex-shaped classes of curves, where each class is composed of several sub-classes, and to deal with the regime changes within each homogeneous sub-class. The model explicitly integrates the heterogeneity of each class of curves via a mixture model formulation, and the regime changes within each sub-class through a hidden logistic process. The approach allows therefore for fitting flexible curve-models to each class of complex-shaped curves presenting regime changes through an unsupervised learning scheme, to automatically summarize it into a finite number of homogeneous clusters, each of them is decomposed into several regimes. The model parameters are learned by maximizing the observed-data log-likelihood for each class by using a dedicated expectation-maximization (EM) algorithm. Comparisons on simulated data and real data with alternative approaches, including functional linear discriminant analysis and functional mixture discriminant analysis with polynomial regression mixtures and spline regression mixtures, show that the proposed approach provides better results regarding the discrimination results and significantly improves the curves approximation.
Supervised learning of a regression model based on latent process. Application to the estimation of fuel cell life time
Onanena, Raïssa, Chamroukhi, Faicel, Oukhellou, Latifa, Candusso, Denis, Aknin, Patrice, Hissel, Daniel
This paper describes a pattern recognition approach aiming to estimate fuel cell duration time from electrochemical impedance spectroscopy measurements. It consists in first extracting features from both real and imaginary parts of the impedance spectrum. A parametric model is considered in the case of the real part, whereas regression model with latent variables is used in the latter case. Then, a linear regression model using different subsets of extracted features is used fo r the estimation of fuel cell time duration. The performances of the proposed approach are evaluated on experimental data set to show its feasibility. This could lead to interesting perspectives for predictive maintenance policy of fuel cell.
A regression model with a hidden logistic process for signal parametrization
Chamroukhi, Faicel, Samé, Allou, Govaert, Gérard, Aknin, Patrice
A new approach for signal parametrization, which consists of a specific regression model incorporating a discrete hidden logistic process, is proposed. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The parameters of the hidden logistic process, in the inner loop of the EM algorithm, are estimated using a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm. An experimental study using simulated and real data reveals good performances of the proposed approach.
Mod\`ele \`a processus latent et algorithme EM pour la r\'egression non lin\'eaire
Chamroukhi, Faicel, Samé, Allou, Govaert, Gérard, Aknin, Patrice
A non linear regression approach which consists of a specific regression model incorporating a latent process, allowing various polynomial regression models to be activated preferentially and smoothly, is introduced in this paper. The model parameters are estimated by maximum likelihood performed via a dedicated expecation-maximization (EM) algorithm. An experimental study using simulated and real data sets reveals good performances of the proposed approach.
Time series modeling by a regression approach based on a latent process
Chamroukhi, Faicel, Samé, Allou, Govaert, Gérard, Aknin, Patrice
Time series are used in many domains including finance, engineering, economics and bioinformatics generally to represent the change of a measurement over time. Modeling techniques may then be used to give a synthetic representation of such data. A new approach for time series modeling is proposed in this paper. It consists of a regression model incorporating a discrete hidden logistic process allowing for activating smoothly or abruptly different polynomial regression models. The model parameters are estimated by the maximum likelihood method performed by a dedicated Expectation Maximization (EM) algorithm. The M step of the EM algorithm uses a multi-class Iterative Reweighted Least-Squares (IRLS) algorithm to estimate the hidden process parameters. To evaluate the proposed approach, an experimental study on simulated data and real world data was performed using two alternative approaches: a heteroskedastic piecewise regression model using a global optimization algorithm based on dynamic programming, and a Hidden Markov Regression Model whose parameters are estimated by the Baum-Welch algorithm. Finally, in the context of the remote monitoring of components of the French railway infrastructure, and more particularly the switch mechanism, the proposed approach has been applied to modeling and classifying time series representing the condition measurements acquired during switch operations.
Model-based functional mixture discriminant analysis with hidden process regression for curve classification
Chamroukhi, Faicel, Glotin, Hervé, Samé, Allou
In this paper, we study the modeling and the classification of functional data presenting regime changes over time. We propose a new model-based functional mixture discriminant analysis approach based on a specific hidden process regression model that governs the regime changes over time. Our approach is particularly adapted to handle the problem of complex-shaped classes of curves, where each class is potentially composed of several sub-classes, and to deal with the regime changes within each homogeneous sub-class. The proposed model explicitly integrates the heterogeneity of each class of curves via a mixture model formulation, and the regime changes within each sub-class through a hidden logistic process. Each class of complex-shaped curves is modeled by a finite number of homogeneous clusters, each of them being decomposed into several regimes. The model parameters of each class are learned by maximizing the observed-data log-likelihood by using a dedicated expectation-maximization (EM) algorithm. Comparisons are performed with alternative curve classification approaches, including functional linear discriminant analysis and functional mixture discriminant analysis with polynomial regression mixtures and spline regression mixtures. Results obtained on simulated data and real data show that the proposed approach outperforms the alternative approaches in terms of discrimination, and significantly improves the curves approximation.