Ziou, Djemel
Hierarchical mixture of discriminative Generalized Dirichlet classifiers
Togban, Elvis, Ziou, Djemel
This paper presents a discriminative classifier for compositional data. The classifier is based on the posterior distribution of the Generalized Dirichlet, which is the discriminative counterpart of the Generalized Dirichlet mixture model. Moreover, following the mixture-of-experts paradigm, we propose a hierarchical mixture of this classifier. To learn the model parameters, we use a variational approximation by deriving an upper bound for the Generalized Dirichlet mixture. To the best of our knowledge, this is the first time this bound has been proposed in the literature. Experimental results are presented for spam detection and color space identification.
Deriving Lehmer and Hölder means as maximum weighted likelihood estimates for the multivariate exponential family
Ziou, Djemel, Fakir, Issam
Consider numerical observations; it is common to calculate their mean and refer to it as a central tendency. There are, however, different measures of the mean [4]. These measures are sometimes grouped into families, such as the Lehmer and Hölder means. Distinguishing these measures and better understanding their use involves identifying the link between them and probability density functions (PDFs). For example, the arithmetic mean is the maximum likelihood estimator (MLE) of the location parameter for the normal PDF and of the scale parameter for the exponential PDF. For the families of Lehmer and Hölder means, such an interpretation has only recently been proposed for PDFs of the univariate exponential family.
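As an illustration of the two families named in this abstract, the sketch below uses the standard textbook definitions of the Lehmer mean, sum(x^p) / sum(x^(p-1)), and the Hölder (power) mean, (mean(x^p))^(1/p), for positive observations. It is not taken from the paper: the function names `lehmer_mean` and `holder_mean` and the sample data are assumptions made for this example, and the weighted-likelihood derivation itself is not reproduced.

```python
import math

def lehmer_mean(xs, p):
    """Lehmer mean of positive numbers: sum(x^p) / sum(x^(p-1))."""
    num = sum(x ** p for x in xs)
    den = sum(x ** (p - 1) for x in xs)
    return num / den

def holder_mean(xs, p):
    """Hölder (power) mean of positive numbers: (mean(x^p))^(1/p).

    The case p = 0 is handled as the limit, which is the geometric mean.
    """
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

# Hypothetical sample data, chosen only for illustration.
data = [1.0, 2.0, 4.0]
arithmetic = sum(data) / len(data)

# Both families contain the arithmetic mean at p = 1.
print(lehmer_mean(data, 1), holder_mean(data, 1), arithmetic)
# The harmonic mean appears at p = 0 (Lehmer) and p = -1 (Hölder).
print(lehmer_mean(data, 0), holder_mean(data, -1))
```

Both families are indexed by a single real parameter p and coincide with the arithmetic mean at p = 1, which is why the arithmetic mean's MLE interpretation is a natural starting point for the more general result.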
Centrality Estimators for Probability Density Functions
Ziou, Djemel
In this report, we explore the data selection leading to a family of estimators maximizing a centrality. The family has nice properties leading to accurate and robust probability density function fitting according to some criteria we define. We establish a link between the centrality estimator and the maximum likelihood, showing that the latter is a particular case. Therefore, a new probability interpretation of Fisher maximum likelihood is provided. We introduce and study two specific centralities that we have named Hölder and Lehmer estimators. A numerical simulation is provided showing the effectiveness of the proposed families of estimators, opening the door to the development of new concepts and algorithms in machine learning, data mining, statistics, and data analysis.
Prediction of rare events in the operation of household equipment using co-evolving time series
Mecheri, Hadia, Benamirouche, Islam, Fass, Feriel, Ziou, Djemel, Kadri, Nassima
In this study, we propose an approach for predicting rare events by exploiting co-evolving time series. Our approach involves a weighted autologistic regression model, where we leverage the temporal behavior of the data to enhance predictive capabilities. By addressing the issue of imbalanced datasets, we establish constraints leading to weight estimation and to improved performance. Evaluation on synthetic and real-world datasets confirms that our approach outperforms state-of-the-art methods for predicting home equipment failure.
Unsupervised Feature Selection for Accurate Recommendation of High-Dimensional Image Data
Boutemedjet, Sabri, Ziou, Djemel, Bouguila, Nizar
Content-based image suggestion (CBIS) targets the recommendation of products based on user preferences on the visual content of images. In this paper, we motivate both feature selection and model order identification as two key issues for a successful CBIS. We propose a generative model in which the visual features and users are clustered into separate classes. We identify the number of both user and image classes with the simultaneous selection of relevant visual features using the message length approach. The goal is to ensure an accurate prediction of ratings for multidimensional non-Gaussian and continuous image descriptors. Experiments on a collected dataset demonstrate the merits of our approach.