Elliptical Wishart distributions: information geometry, maximum likelihood estimator, performance analysis and statistical learning
Ayadi, Imen, Bouchard, Florent, Pascal, Frédéric
This paper deals with Elliptical Wishart distributions - which generalize the Wishart distribution - in the context of signal processing and machine learning. Two algorithms to compute the maximum likelihood estimator (MLE) are proposed: a fixed point algorithm and a Riemannian optimization method based on the derived information geometry of Elliptical Wishart distributions. The existence and uniqueness of the MLE are characterized, as well as the convergence of both estimation algorithms. Statistical properties of the MLE are also investigated, such as consistency, asymptotic normality and an intrinsic version of Fisher efficiency. On the statistical learning side, novel classification and clustering methods are designed. For the $t$-Wishart distribution, the performance of the MLE and of the statistical learning algorithms is evaluated on both simulated and real EEG and hyperspectral data, showcasing the interest of our proposed methods.
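As an illustration of the fixed-point approach mentioned above, here is a minimal sketch of a scatter-matrix MLE iteration for the $t$-Wishart case, assuming the usual weight function $w(t) = (\nu + np)/(\nu + t)$ arising from the Student-type density generator; the function name `t_wishart_mle` and the exact normalization conventions are assumptions of this sketch, not the paper's.

```python
import numpy as np

def t_wishart_mle(S, n, nu, n_iter=100, tol=1e-8):
    """Fixed-point iteration for the center matrix of a t-Wishart model.

    S : (K, p, p) array of SPD sample matrices, n : degrees of freedom,
    nu : Student shape parameter. The weight w(t) = (nu + n*p)/(nu + t)
    is an assumption matching the usual t-Wishart likelihood; check the
    paper for the exact convention.
    """
    K, p, _ = S.shape
    sigma = S.mean(axis=0) / n                    # moment-based initial guess
    for _ in range(n_iter):
        inv = np.linalg.inv(sigma)
        t = np.einsum('ij,kji->k', inv, S)        # tr(Sigma^{-1} S_k)
        w = (nu + n * p) / (nu + t)
        new = np.einsum('k,kij->ij', w, S) / (n * K)
        if np.linalg.norm(new - sigma) <= tol * np.linalg.norm(sigma):
            return new
        sigma = new
    return sigma
```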
Random matrix theory improved Fr\'echet mean of symmetric positive definite matrices
Bouchard, Florent, Mian, Ammar, Tiomoko, Malik, Ginolhac, Guillaume, Pascal, Frédéric
In this study, we consider the realm of covariance matrices in machine learning, particularly focusing on computing Fréchet means on the manifold of symmetric positive definite matrices, commonly referred to as Karcher or geometric means. Such means are leveraged in numerous machine learning tasks. Relying on advanced statistical tools, we introduce a random matrix theory based method that estimates Fréchet means, which is particularly beneficial when dealing with low sample support and a high number of matrices to average. This mean is used, for example, for nearest centroid (Tuzel et al., 2008), pooling in SPD deep learning networks (Brooks et al., 2019) and metric learning (Zadeh et al., 2016). The optimal solution is not available analytically, necessitating the use of iterative algorithms often based on deriving a Riemannian gradient (Boumal, 2023). These algorithms are grounded in Riemannian geometry, since matrices belong to specific manifolds depending on their specific properties (full SPD, low rank, etc.) and the chosen metric. The geometry is often the classical one given for SPD matrices, but alternative geometries are available, such as Bures-Wasserstein (Han et al., 2021) or log-Euclidean (Utpala et al., ...).
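For context, the classical iterative baseline the text alludes to is the Karcher mean under the affine-invariant metric, computed by Riemannian gradient descent with unit step. The sketch below shows this standard algorithm, not the random matrix theory improved estimator introduced in the paper; the function name `karcher_mean` is an assumption of this sketch.

```python
import numpy as np
from scipy.linalg import sqrtm, logm, expm

def karcher_mean(mats, n_iter=50, tol=1e-10):
    """Fréchet (Karcher) mean of SPD matrices under the affine-invariant
    metric, via Riemannian gradient descent initialized at the
    arithmetic mean."""
    G = np.mean(mats, axis=0)
    for _ in range(n_iter):
        G_half = np.real(sqrtm(G))
        G_ihalf = np.linalg.inv(G_half)
        # Riemannian gradient step: mean of the log-maps at the iterate
        T = np.mean([np.real(logm(G_ihalf @ M @ G_ihalf)) for M in mats],
                    axis=0)
        G = G_half @ expm(T) @ G_half
        if np.linalg.norm(T) < tol:
            break
    return G
```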
Sparse PCA with False Discovery Rate Controlled Variable Selection
Machkour, Jasin, Breloy, Arnaud, Muma, Michael, Palomar, Daniel P., Pascal, Frédéric
Sparse principal component analysis (PCA) aims at mapping large dimensional data to a linear subspace of lower dimension. By constraining the loading vectors to be sparse, it performs the double duty of dimension reduction and variable selection. Sparse PCA algorithms are usually expressed as a trade-off between explained variance and sparsity of the loading vectors (i.e., the number of selected variables). As a high explained variance is not necessarily synonymous with relevant information, these methods are prone to selecting irrelevant variables. To overcome this issue, we propose an alternative formulation of sparse PCA driven by the false discovery rate (FDR). We then leverage the Terminating-Random Experiments (T-Rex) selector to automatically determine an FDR-controlled support of the loading vectors. A major advantage of the resulting T-Rex PCA is that no sparsity parameter tuning is required. Numerical experiments and a stock market data example demonstrate a significant performance improvement.
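To make the variance-versus-sparsity trade-off concrete, here is a generic soft-thresholded power iteration for the first sparse loading vector. This is the kind of baseline the abstract criticizes: the penalty `lam` must be tuned by hand, in contrast with the tuning-free T-Rex PCA. The code is an illustrative sketch, not the paper's method, and the function name is an assumption.

```python
import numpy as np

def sparse_pc1(X, lam, n_iter=200):
    """First sparse loading vector via soft-thresholded power iteration.
    lam controls the explained-variance/sparsity trade-off and must be
    tuned manually in this generic baseline."""
    X = X - X.mean(axis=0)
    C = X.T @ X / len(X)
    v = np.linalg.eigh(C)[1][:, -1]        # dense PC1 as a starting point
    for _ in range(n_iter):
        u = C @ v
        u = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)  # soft threshold
        if np.linalg.norm(u) == 0:         # lam too large: all-zero loading
            break
        v = u / np.linalg.norm(u)
    return v
```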
Convex Parameter Estimation of Perturbed Multivariate Generalized Gaussian Distributions
Ouzir, Nora, Pascal, Frédéric, Pesquet, Jean-Christophe
The multivariate generalized Gaussian distribution (MGGD), also known as the multivariate exponential power (MEP) distribution, is widely used in signal and image processing. However, estimating MGGD parameters, which is required in practical applications, still faces specific theoretical challenges. In particular, establishing convergence properties for the standard fixed-point approach when both the distribution mean and the scatter (or the precision) matrix are unknown is still an open problem. In robust estimation, imposing classical constraints on the precision matrix, such as sparsity, has been limited by the non-convexity of the resulting cost function. This paper tackles these issues from an optimization viewpoint by proposing a convex formulation with well-established convergence properties. We embed our analysis in a noisy scenario where robustness is induced by modelling multiplicative perturbations. The resulting framework is flexible as it combines a variety of regularizations for the precision matrix, the mean and model perturbations. This paper presents proof of the desired theoretical properties, specifies the conditions preserving these properties for different regularization choices and designs a general proximal primal-dual optimization strategy. The experiments show a more accurate precision and covariance matrix estimation with similar performance for the mean vector parameter compared to Tyler's M-estimator. In a high-dimensional setting, the proposed method outperforms the classical GLASSO, one of its robust extensions, and the regularized Tyler's estimator.
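For reference, the baseline the abstract compares against, Tyler's M-estimator, is a simple fixed-point iteration on centered data; note that it assumes a known (zero) mean, which is part of the difficulty the paper's joint mean/scatter formulation addresses. A minimal sketch follows, with the function name and trace normalization being conventions of this sketch.

```python
import numpy as np

def tyler_m_estimator(X, n_iter=100, tol=1e-8):
    """Tyler's M-estimator of scatter for centered data X of shape (n, p),
    normalized so that trace(Sigma) = p (Tyler's estimator is defined up
    to a positive scale factor)."""
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        inv = np.linalg.inv(sigma)
        d = np.einsum('ni,ij,nj->n', X, inv, X)   # squared Mahalanobis terms
        new = (p / n) * (X / d[:, None]).T @ X
        new /= np.trace(new) / p                  # fix the scale ambiguity
        if np.linalg.norm(new - sigma) <= tol * np.linalg.norm(sigma):
            return new
        sigma = new
    return sigma
```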
Choosing the parameter of the Fermat distance: navigating geometry and noise
Chazal, Frédéric, Ferraris, Laure, Groisman, Pablo, Jonckheere, Matthieu, Pascal, Frédéric, Sapienza, Facundo
The Fermat distance has recently been established as a useful tool for machine learning tasks when a natural distance is not directly available to the practitioner, or to improve the results given by Euclidean distances by exploiting the geometrical and statistical properties of the dataset. This distance depends on a parameter $\alpha$ that greatly impacts the performance of subsequent tasks. Ideally, the value of $\alpha$ should be large enough to navigate the geometric intricacies inherent to the problem. At the same time, it should remain restrained enough to sidestep any deleterious ramifications stemming from noise during the process of distance estimation. We study both theoretically and through simulations how to select this parameter.
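The sample Fermat distance itself is easy to state: it is the shortest-path distance over the data graph with edge weights $\|x_i - x_j\|^\alpha$. A minimal SciPy sketch is given below (the function name is an assumption); $\alpha = 1$ recovers the Euclidean distance on the complete graph, while larger $\alpha$ concentrates paths in dense regions of the sample, which is why the choice of $\alpha$ matters.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist

def fermat_distances(X, alpha):
    """All-pairs sample Fermat distances: shortest paths on the complete
    graph over the points of X with edge weights ||x_i - x_j||**alpha."""
    W = cdist(X, X) ** alpha
    return shortest_path(W, method='D', directed=False)
```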
A Robust and Flexible EM Algorithm for Mixtures of Elliptical Distributions with Missing Data
Mouret, Florian, Hippert-Ferrer, Alexandre, Pascal, Frédéric, Tourneret, Jean-Yves
This paper tackles the problem of missing data imputation for noisy and non-Gaussian data. A classical imputation method, the Expectation Maximization (EM) algorithm for Gaussian mixture models, has shown interesting properties when compared to other popular approaches such as those based on k-nearest neighbors or on multiple imputations by chained equations. However, Gaussian mixture models are known not to be robust to heterogeneous data, which can lead to poor estimation performance when the data are contaminated by outliers or come from non-Gaussian distributions. To overcome this issue, a new expectation maximization algorithm is investigated for mixtures of elliptical distributions with the nice property of handling potential missing data. The complete-data likelihood associated with mixtures of elliptical distributions is well adapted to the EM framework thanks to its conditional distribution, which is shown to be a Student distribution. Experimental results on synthetic data demonstrate that the proposed algorithm is robust to outliers and can be used with non-Gaussian data. Furthermore, experiments conducted on real-world datasets show that this algorithm is very competitive when compared to other classical imputation methods.
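The missing-data E-step rests on conditional distributions: for both Gaussian and Student components, the conditional mean of the missing block is the same linear function of the observed block, so a minimal imputation step can be sketched as below. The function name is an assumption, and the conditional covariance correction used in a full EM is omitted here.

```python
import numpy as np

def conditional_impute(x, mu, sigma):
    """Replace the NaN entries of one sample x by their conditional mean
    given the observed entries, under a component with mean mu and
    scatter sigma: mu_m + S_mo S_oo^{-1} (x_o - mu_o)."""
    m = np.isnan(x)                       # mask of missing coordinates
    o = ~m
    x_hat = x.copy()
    S_oo = sigma[np.ix_(o, o)]
    S_mo = sigma[np.ix_(m, o)]
    x_hat[m] = mu[m] + S_mo @ np.linalg.solve(S_oo, x[o] - mu[o])
    return x_hat
```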
Robust classification with flexible discriminant analysis in heterogeneous data
Houdouin, Pierre, Pascal, Frédéric, Jonckheere, Matthieu, Wang, Andrew
Linear and Quadratic Discriminant Analysis are well-known classical methods but can heavily suffer from non-Gaussian distributions and/or contaminated datasets, mainly because of the underlying Gaussian assumption that is not robust. To fill this gap, this paper presents a new robust discriminant analysis where each data point is drawn from its own arbitrary Elliptically Symmetrical (ES) distribution with its own arbitrary scale parameter. Such a model allows for possibly very heterogeneous, independent but non-identically distributed samples. The new method, called Generalized QDA (GQDA), relies on the estimation of a threshold parameter, whose optimal value is fixed for each sub-family of distributions; the case $c = 1$ corresponds to the Gaussian case. Finally, [10] improved the previous work by adding robust estimators, coming up with the Robust GQDA (RGQDA) method. All these methods assume that all clusters belong to the same distribution family. In practice, such a hypothesis may ...
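The plug-in structure of such discriminant rules can be sketched as follows: the decision only needs per-class means, scatter matrices and priors, and robustness comes from how those are estimated. This generic QDA scoring rule is an illustration into which any robust estimates can be supplied, not the paper's exact estimator; the function name is an assumption.

```python
import numpy as np

def qda_predict(x, means, covs, priors):
    """Generic plug-in QDA rule: score_k = log pi_k - 0.5 log|Sigma_k|
    - 0.5 (x - mu_k)^T Sigma_k^{-1} (x - mu_k); predict the argmax.
    Robust estimates of (mu_k, Sigma_k) can be plugged in directly."""
    scores = []
    for mu, cov, pi in zip(means, covs, priors):
        d = x - mu
        _, logdet = np.linalg.slogdet(cov)
        scores.append(np.log(pi) - 0.5 * logdet
                      - 0.5 * d @ np.linalg.solve(cov, d))
    return int(np.argmax(scores))
```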
PCA-based Multi Task Learning: a Random Matrix Approach
Tiomoko, Malik, Couillet, Romain, Pascal, Frédéric
The article proposes and theoretically analyses a \emph{computationally efficient} multi-task learning (MTL) extension of popular principal component analysis (PCA)-based supervised learning schemes \cite{barshan2011supervised,bair2006prediction}. The analysis reveals that (i) by default, learning may dramatically fail by suffering from \emph{negative transfer}, but that (ii) simple counter-measures on data labels avert negative transfer and necessarily result in improved performance. Supporting experiments on synthetic and real data benchmarks show that the proposed method achieves comparable performance with state-of-the-art MTL methods but at a \emph{significantly reduced computational cost}.
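One of the single-task building blocks cited above, supervised PCA in the spirit of Barshan et al. (2011), retains the directions maximizing dependence between projected features and labels. A minimal sketch under a linear kernel on one-hot labels follows; the function name and label convention are assumptions of this sketch, and the MTL extension and its label corrections are not shown.

```python
import numpy as np

def supervised_pca(X, Y, d):
    """Supervised PCA with a linear label kernel: keep the top-d
    eigenvectors of X^T H (Y Y^T) H X, where H is the centering matrix
    and Y holds one-hot labels; return the projected features."""
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n
    M = X.T @ H @ (Y @ Y.T) @ H @ X
    _, eigvecs = np.linalg.eigh(M)
    U = eigvecs[:, -d:]                   # top-d supervised directions
    return X @ U
```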
Riemannian classification of EEG signals with missing values
Hippert-Ferrer, Alexandre, Mian, Ammar, Bouchard, Florent, Pascal, Frédéric
This paper proposes two strategies to handle missing data for the classification of electroencephalograms using covariance matrices. The first approach estimates the covariance from data imputed with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm. Both approaches are combined with the minimum distance to Riemannian mean classifier and applied to a classification task of event-related potentials, one of the most widely used brain-computer interface paradigms. As the results show, the proposed strategies perform better than classification based on observed data alone and maintain a high accuracy even when the missing data ratio increases.
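The minimum distance to Riemannian mean (MDM) classifier used by both strategies reduces to computing affine-invariant distances from a trial covariance to per-class Karcher means. A minimal sketch, with function names being assumptions of this sketch:

```python
import numpy as np
from scipy.linalg import sqrtm, logm

def airm_dist(A, B):
    """Affine-invariant Riemannian distance between SPD matrices:
    ||logm(A^{-1/2} B A^{-1/2})||_F."""
    iA = np.linalg.inv(np.real(sqrtm(A)))
    return np.linalg.norm(np.real(logm(iA @ B @ iA)))

def mdm_predict(C, class_means):
    """Assign the trial covariance C to the class whose Karcher mean is
    closest in affine-invariant distance."""
    return int(np.argmin([airm_dist(C, G) for G in class_means]))
```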
A flexible EM-like clustering algorithm for noisy data
Roizman, Violeta, Jonckheere, Matthieu, Pascal, Frédéric
We design a new robust clustering algorithm that can deal efficiently with noise and outliers in diverse datasets. As an EM-like algorithm, it is based on the estimation of cluster centers and covariances, but also of a scale parameter per data point. This allows the algorithm to accommodate heavier- or lighter-tailed distributions (in comparison to classical Gaussian distributions) and outliers without significantly losing efficiency in classical scenarios. The convergence and accuracy of the algorithm are first analyzed on synthetic data. Then, we show that the proposed algorithm outperforms other classical unsupervised methods from the literature, such as k-means, the EM algorithm and HDBSCAN, when applied to real datasets such as MNIST, NORB and 20newsgroups.
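The key twist relative to a Gaussian EM is the per-point scale: each point is rescaled by its own Mahalanobis-based factor, which downweights outliers in the center and covariance updates. Below is a minimal sketch of one such update, assuming the responsibilities come from a standard E-step; the function name and the trace normalization convention are assumptions, not the paper's exact algorithm.

```python
import numpy as np

def flexible_em_step(X, resp, means, covs):
    """One M-step-like update with per-point scales: tau_ik is the
    normalized squared Mahalanobis distance of x_i to cluster k, and
    points with large tau are downweighted. resp is (n, K), means is
    (K, p), covs is (K, p, p)."""
    n, p = X.shape
    K = resp.shape[1]
    for k in range(K):
        d = X - means[k]
        tau = np.einsum('ni,ij,nj->n', d, np.linalg.inv(covs[k]), d) / p
        tau = np.maximum(tau, 1e-12)      # guard against division by zero
        w = resp[:, k] / tau              # responsibility over scale
        means[k] = w @ X / w.sum()
        d = X - means[k]
        covs[k] = (w[:, None] * d).T @ d / resp[:, k].sum()
        covs[k] *= p / np.trace(covs[k])  # fix the scale ambiguity
    return means, covs
```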