AITopics

2504.01481

Country:

North America > United States (0.46)
Europe > France (0.29)

Genre: Research Report > New Finding (0.86)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

arXiv.org Machine LearningFeb-12-2025

Wrapped Gaussian on the manifold of Symmetric Positive Definite Matrices

de Surrel, Thibault, Lotte, Fabien, Chevallier, Sylvain, Yger, Florian

Circular and non-flat data distributions are prevalent across diverse domains of data science, yet their specific geometric structures often remain underutilized in machine learning frameworks. A principled approach to accounting for the underlying geometry of such data is pivotal, particularly when extending statistical models, like the pervasive Gaussian distribution. In this work, we tackle those issue by focusing on the manifold of symmetric positive definite matrices, a key focus in information geometry. We introduced a non-isotropic wrapped Gaussian by leveraging the exponential map, we derive theoretical properties of this distribution and propose a maximum likelihood framework for parameter estimation. Furthermore, we reinterpret established classifiers on SPD through a probabilistic lens and introduce new classifiers based on the wrapped Gaussian model. Experiments on synthetic and real-world datasets demonstrate the robustness and flexibility of this geometry-aware distribution, underscoring its potential to advance manifold-based data analysis. This work lays the groundwork for extending classical machine learning and statistical methods to more complex and structured data.

artificial intelligence, bayesian inference, machine learning, (20 more...)

2502.01512

Country:

North America > United States (0.68)
Europe > France > Normandy (0.14)

Genre: Research Report (0.63)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
(2 more...)

arXiv.org Artificial IntelligenceOct-18-2024

Automatic Classification of Sleep Stages from EEG Signals Using Riemannian Metrics and Transformer Networks

Seraphim, Mathieu, Lechervy, Alexis, Yger, Florian, Brun, Luc, Etard, Olivier

Purpose: In sleep medicine, assessing the evolution of a subject's sleep often involves the costly manual scoring of electroencephalographic (EEG) signals. In recent years, a number of Deep Learning approaches have been proposed to automate this process, mainly by extracting features from said signals. However, despite some promising developments in related problems, such as Brain-Computer Interfaces, analyses of the covariances between brain regions remain underutilized in sleep stage scoring.Methods: Expanding upon our previous work, we investigate the capabilities of SPDTransNet, a Transformer-derived network designed to classify sleep stages from EEG data through timeseries of covariance matrices. Furthermore, we present a novel way of integrating learned signal-wise features into said matrices without sacrificing their Symmetric Definite Positive (SPD) nature.Results: Through comparison with other State-of-the-Art models within a methodology optimized for class-wise performance, we achieve a level of performance at or beyond various State-of-the-Art models, both in single-dataset and - particularly - multi-dataset experiments.Conclusion: In this article, we prove the capabilities of our SPDTransNet model, particularly its adaptability to multi-dataset tasks, within the context of EEG sleep stage scoring - though it could easily be adapted to any classification task involving timeseries of covariance matrices.

artificial intelligence, machine learning, matrix, (17 more...)

2410.19819

Country: Europe > France (0.68)

Genre: Research Report > Promising Solution (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceDec-14-2023

Structure-Preserving Transformers for Sequences of SPD Matrices

Seraphim, Mathieu, Lechervy, Alexis, Yger, Florian, Brun, Luc, Etard, Olivier

In recent years, Transformer-based auto-attention mechanisms have been successfully applied to the analysis of a variety of context-reliant data types, from texts to images and beyond, including data from non-Euclidean geometries. In this paper, we present such a mechanism, designed to classify sequences of Symmetric Positive Definite matrices while preserving their Riemannian geometry throughout the analysis. We apply our method to automatic sleep staging on timeseries of EEG-derived covariance matrices from a standard dataset, obtaining high levels of stage-wise performance.

artificial intelligence, machine learning, matrix, (19 more...)

2309.07579

Country: Europe (0.29)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

arXiv.org Artificial IntelligenceDec-12-2023

Meta-survey on outlier and anomaly detection

Olteanu, Madalina, Rossi, Fabrice, Yger, Florian

The impact of outliers and anomalies on model estimation and data processing is of paramount importance, as evidenced by the extensive body of research spanning various fields over several decades: thousands of research papers have been published on the subject. As a consequence, numerous reviews, surveys, and textbooks have sought to summarize the existing literature, encompassing a wide range of methods from both the statistical and data mining communities. While these endeavors to organize and summarize the research are invaluable, they face inherent challenges due to the pervasive nature of outliers and anomalies in all data-intensive applications, irrespective of the specific application field or scientific discipline. As a result, the resulting collection of papers remains voluminous and somewhat heterogeneous. To address the need for knowledge organization in this domain, this paper implements the first systematic meta-survey of general surveys and reviews on outlier and anomaly detection. Employing a classical systematic survey approach, the study collects nearly 500 papers using two specialized scientific search engines. From this comprehensive collection, a subset of 56 papers that claim to be general surveys on outlier detection is selected using a snowball search technique to enhance field coverage. A meticulous quality assessment phase further refines the selection to a subset of 25 high-quality general surveys. Using this curated collection, the paper investigates the evolution of the outlier detection field over a 20-year period, revealing emerging themes and methods. Furthermore, an analysis of the surveys sheds light on the survey writing practices adopted by scholars from different communities who have contributed to this field. Finally, the paper delves into several topics where consensus has emerged from the literature. These include taxonomies of outlier types, challenges posed by high-dimensional data, the importance of anomaly scores, the impact of learning conditions, difficulties in benchmarking, and the significance of neural networks. Non-consensual aspects are also discussed, particularly the distinction between local and global outliers and the challenges in organizing detection methods into meaningful taxonomies.

data mining, detection, machine learning, (19 more...)

doi: 10.1016/j.neucom.2023.126634

2312.07101

Country:

Europe (1.00)
North America > United States > Massachusetts (0.28)
Asia > Middle East > Israel > Mediterranean Sea (0.24)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

arXiv.org Artificial IntelligenceDec-27-2022

Challenges in anomaly and change point detection

Olteanu, Madalina, Rossi, Fabrice, Yger, Florian

This paper presents an introduction to the state-of-the-art in anomaly and change-point detection. On the one hand, the main concepts needed to understand the vast scientific literature on those subjects are introduced. On the other, a selection of important surveys and books, as well as two selected active research topics in the field, are presented.

data mining, detection, machine learning, (18 more...)

2212.1352

Country: Europe (0.47)

Genre: Overview (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.39)

arXiv.org Artificial IntelligenceJan-17-2022

Multi-winner Approval Voting Goes Epistemic

Allouche, Tahar, Lang, Jérôme, Yger, Florian

Epistemic voting interprets votes as noisy signals about a ground truth. We consider contexts where the truth consists of a set of objective winners, knowing a lower and upper bound on its cardinality. A prototypical problem for this setting is the aggregation of multi-label annotations with prior knowledge on the size of the ground truth. We posit noise models, for which we define rules that output an optimal set of winners. We report on experiments on multi-label annotations (which we collected).

artificial intelligence, health & medicine, machine learning, (19 more...)

2201.06655

Country: Europe (0.14)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry:

Health & Medicine (0.93)
Leisure & Entertainment > Sports > Soccer (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

arXiv.org Artificial IntelligenceDec-7-2021

Truth-tracking via Approval Voting: Size Matters

Allouche, Tahar, Lang, Jérôme, Yger, Florian

Epistemic social choice aims at unveiling a hidden ground truth given votes, which are interpreted as noisy signals about it. We consider here a simple setting where votes consist of approval ballots: each voter approves a set of alternatives which they believe can possibly be the ground truth. Based on the intuitive idea that more reliable votes contain fewer alternatives, we define several noise models that are approval voting variants of the Mallows model. The likelihood-maximizing alternative is then characterized as the winner of a weighted approval rule, where the weight of a ballot decreases with its cardinality. We have conducted an experiment on three image annotation datasets; they conclude that rules based on our noise model outperform standard approval voting; the best performance is obtained by a variant of the Condorcet noise model.

artificial intelligence, ballot, machine learning, (15 more...)

2112.04387

Country: North America > Canada (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.91)

arXiv.org Machine LearningJul-16-2021

Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences

Yamane, Ikko, Honda, Junya, Yger, Florian, Sugiyama, Masashi

Ordinary supervised learning is useful when we have paired training data of input $X$ and output $Y$. However, such paired data can be difficult to collect in practice. In this paper, we consider the task of predicting $Y$ from $X$ when we have no paired data of them, but we have two separate, independent datasets of $X$ and $Y$ each observed with some mediating variable $U$, that is, we have two datasets $S_X = \{(X_i, U_i)\}$ and $S_Y = \{(U'_j, Y'_j)\}$. A naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$ using $S_Y$, but we show that this is not statistically consistent. Moreover, predicting $U$ can be more difficult than predicting $Y$ in practice, e.g., when $U$ has higher dimensionality. To circumvent the difficulty, we propose a new method that avoids predicting $U$ but directly learns $Y = f(X)$ by training $f(X)$ with $S_{X}$ to predict $h(U)$ which is trained with $S_{Y}$ to approximate $Y$. We prove statistical consistency and error bounds of our method and experimentally confirm its practical usefulness.

artificial intelligence, joint-rr, neural network, (14 more...)

2107.08135

Country:

Europe (0.67)
Asia > Japan > Honshū (0.14)
North America > United States > Wisconsin (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Machine LearningJul-5-2021

Template-Based Graph Clustering

Riva, Mateus, Yger, Florian, Gori, Pietro, Cesar, Roberto M. Jr., Bloch, Isabelle

We propose a novel graph clustering method guided by additional information on the underlying structure of the clusters (or communities). The problem is formulated as the matching of a graph to a template with smaller dimension, hence matching $n$ vertices of the observed graph (to be clustered) to the $k$ vertices of a template graph, using its edges as support information, and relaxed on the set of orthonormal matrices in order to find a $k$ dimensional embedding. With relevant priors that encode the density of the clusters and their relationships, our method outperforms classical methods, especially for challenging cases.

artificial intelligence, graph, machine learning, (18 more...)

2107.01994

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)