Mathe, Johan
Beyond Euclid: An Illustrated Guide to Modern Machine Learning with Geometric, Topological, and Algebraic Structures
Sanborn, Sophia, Mathe, Johan, Papillon, Mathilde, Buracas, Domas, Lillemark, Hansen J, Shewmake, Christian, Bertics, Abby, Pennec, Xavier, Miolane, Nina
The enduring legacy of Euclidean geometry underpins classical machine learning, which, for decades, has been primarily developed for data lying in Euclidean space. Yet, modern machine learning increasingly encounters richly structured data that is inherently non-Euclidean. This data can exhibit intricate geometric, topological and algebraic structure: from the geometry of the curvature of space-time, to topologically complex interactions between neurons in the brain, to the algebraic transformations describing symmetries of physical systems. Extracting knowledge from such non-Euclidean data necessitates a broader mathematical perspective. Echoing the 19th-century revolutions that gave rise to non-Euclidean geometry, an emerging line of research is redefining modern machine learning with non-Euclidean structures. Its goal: generalizing classical methods to unconventional data types with geometry, topology, and algebra. In this review, we provide an accessible gateway to this fast-growing field and propose a graphical taxonomy that integrates recent advances into an intuitive unified framework. We subsequently extract insights into current challenges and highlight exciting opportunities for future development in this field.
The Selective G-Bispectrum and its Inversion: Applications to G-Invariant Networks
Mataigne, Simon, Mathe, Johan, Sanborn, Sophia, Hillar, Christopher, Miolane, Nina
An important problem in signal processing and deep learning is to achieve \textit{invariance} to nuisance factors not relevant for the task. Since many of these factors are describable as the action of a group $G$ (e.g. rotations, translations, scalings), we want methods to be $G$-invariant. The $G$-Bispectrum extracts every characteristic of a given signal up to group action: for example, the shape of an object in an image, but not its orientation. Consequently, the $G$-Bispectrum has been incorporated into deep neural network architectures as a computational primitive for $G$-invariance, akin to a pooling mechanism but with greater selectivity and robustness. However, the computational cost of the $G$-Bispectrum ($\mathcal{O}(|G|^2)$, with $|G|$ the size of the group) has limited its widespread adoption. Here, we show that the $G$-Bispectrum computation contains redundancies that can be reduced into a \textit{selective $G$-Bispectrum} with $\mathcal{O}(|G|)$ complexity. We prove desirable mathematical properties of the selective $G$-Bispectrum and demonstrate how its integration in neural networks enhances accuracy and robustness compared to traditional approaches, while enjoying considerable speedups compared to the full $G$-Bispectrum.
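To make the invariance property concrete, the following is a minimal sketch for the simplest case, the cyclic group of shifts acting on a 1-D signal. It assumes the classical bispectrum formula B(k1, k2) = F(k1) F(k2) conj(F(k1 + k2)), where F is the discrete Fourier transform. The function names (`dft`, `bispectrum`, `bispectrum_row`, `cyclic_shift`) are hypothetical and not taken from the paper's code. Keeping only the row k1 = 1 illustrates how a set of O(|G|) coefficients can replace the full O(|G|^2) table; the exact selection used by the authors may differ.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def bispectrum(signal):
    """Full bispectrum for cyclic shifts: an n x n table of coefficients."""
    f = dft(signal)
    n = len(f)
    return [
        [f[k1] * f[k2] * f[(k1 + k2) % n].conjugate() for k2 in range(n)]
        for k1 in range(n)
    ]


def bispectrum_row(signal, k1=1):
    """Illustrative reduced variant: one row of the table, n coefficients."""
    f = dft(signal)
    n = len(f)
    return [f[k1] * f[k2] * f[(k1 + k2) % n].conjugate() for k2 in range(n)]


def cyclic_shift(signal, s):
    """Shift a list to the right by s positions, wrapping around."""
    s %= len(signal)
    return signal[-s:] + signal[:-s]


signal = [1.0, 3.0, -2.0, 0.5, 4.0]
shifted = cyclic_shift(signal, 2)

# A shift multiplies F(k) by a phase; the three phases in each product cancel,
# so the coefficients agree for the original and the shifted signal.
max_gap = max(
    abs(a - b) for a, b in zip(bispectrum_row(signal), bispectrum_row(shifted))
)
print(max_gap < 1e-9)
```

The Fourier coefficients themselves change under the shift, while the bispectrum coefficients do not, which is why the bispectrum can serve as an invariant layer that still retains phase information discarded by a plain power spectrum.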
ICML 2023 Topological Deep Learning Challenge: Design and Results
Papillon, Mathilde, Hajij, Mustafa, Jenne, Helen, Mathe, Johan, Myers, Audun, Papamarkou, Theodore, Birdal, Tolga, Dey, Tamal, Doster, Tim, Emerson, Tegan, Gopalakrishnan, Gurusankar, Govil, Devendra, Guzmán-Sáenz, Aldo, Kvinge, Henry, Livesay, Neal, Mukherjee, Soham, Samaga, Shreyas N., Ramamurthy, Karthikeyan Natesan, Karri, Maneel Reddy, Rosen, Paul, Sanborn, Sophia, Walters, Robin, Agerberg, Jens, Barikbin, Sadrodin, Battiloro, Claudio, Bazhenov, Gleb, Bernardez, Guillermo, Brent, Aiden, Escalera, Sergio, Fiorellino, Simone, Gavrilev, Dmitrii, Hassanin, Mohammed, Häusner, Paul, Gardaa, Odin Hoff, Khamis, Abdelwahed, Lecha, Manuel, Magai, German, Malygina, Tatiana, Ballester, Rubén, Nadimpalli, Kalyan, Nikitin, Alexander, Rabinowitz, Abraham, Salatiello, Alessandro, Scardapane, Simone, Scofano, Luca, Singh, Suraj, Sjölund, Jens, Snopov, Pavel, Spinelli, Indro, Telyatnikov, Lev, Testa, Lucia, Yang, Maosheng, Yue, Yixiao, Zaghen, Olga, Zia, Ali, Miolane, Nina
This paper presents the computational challenge on topological deep learning that was hosted within the ICML 2023 Workshop on Topology and Geometry in Machine Learning. The competition asked participants to provide open-source implementations of topological neural networks from the literature by contributing to the Python packages TopoNetX (data processing) and TopoModelX (deep learning). The challenge attracted twenty-eight qualifying submissions in its two-month duration. This paper describes the design of the challenge and summarizes its main findings.
PVNet: A LRCN Architecture for Spatio-Temporal Photovoltaic Power Forecasting from Numerical Weather Prediction
Mathe, Johan, Miolane, Nina, Sebastien, Nicolas, Lequeux, Jeremie
Photovoltaic (PV) power generation has emerged as one of the leading renewable energy sources. Yet its production is characterized by high uncertainty, as it depends on weather conditions such as solar irradiance and temperature. Predicting PV production, even at the 24-hour forecast horizon, remains a challenge and leads energy providers to keep plants, often carbon-emitting ones, idle. In this paper, we introduce a Long-Term Recurrent Convolutional Network (LRCN) that uses Numerical Weather Predictions (NWP) to predict PV production at the 24-hour and 48-hour forecast horizons. This network architecture fully leverages both temporal and spatial weather data, sampled over the whole geographical area of interest. We train our model on an NWP dataset from the National Oceanic and Atmospheric Administration (NOAA) to predict spatially aggregated PV production in Germany. We compare its performance to the persistence model and to state-of-the-art methods.
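For readers unfamiliar with the persistence baseline mentioned above, here is a minimal sketch. It assumes the common definition in which the forecast for time t is simply the value observed `horizon` steps earlier; the abstract does not state which variant the authors used. The function names and the toy series are hypothetical.

```python
def persistence_forecast(series, horizon):
    """Return (predictions, targets) where each prediction for time t is
    the observation at time t - horizon. Assumes horizon >= 1."""
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must satisfy 1 <= horizon < len(series)")
    predictions = series[:-horizon]  # values observed `horizon` steps earlier
    targets = series[horizon:]       # values we are trying to predict
    return predictions, targets


def mean_absolute_error(predictions, targets):
    """Average absolute difference between predictions and targets."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Toy hourly production values (made up for illustration only).
production = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
preds, targets = persistence_forecast(production, horizon=2)
print(mean_absolute_error(preds, targets))
```

With hourly data, `horizon=24` would reuse the value from the same hour on the previous day, which is a natural reference point for a 24-hour forecast because it already captures the daily solar cycle.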
geomstats: a Python Package for Riemannian Geometry in Machine Learning
Miolane, Nina, Mathe, Johan, Donnat, Claire, Jorda, Mikael, Pennec, Xavier
We introduce geomstats, a Python package that performs computations on manifolds such as hyperspheres, hyperbolic spaces, spaces of symmetric positive definite matrices and Lie groups of transformations. We provide efficient and extensively unit-tested implementations of these manifolds, together with useful Riemannian metrics and associated Exponential and Logarithm maps. The corresponding geodesic distances provide a range of intuitive choices of loss functions for machine learning. We also give the corresponding Riemannian gradients. The operations implemented in geomstats are available with different computing backends such as numpy, tensorflow and keras. We have enabled GPU implementation and integrated geomstats' manifold computations into keras' deep learning framework. This paper also presents a review of manifolds in machine learning and an overview of the geomstats package with examples demonstrating its use for efficient and user-friendly Riemannian geometry.
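As an illustration of the Exponential map, Logarithm map and geodesic distance named above, the following sketch writes them out for the unit sphere using the standard closed-form expressions. It is plain Python and deliberately does not use the geomstats API; the function names are hypothetical and chosen only for this example.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def sphere_dist(p, q):
    """Geodesic (great-circle) distance between unit vectors p and q."""
    cosine = max(-1.0, min(1.0, dot(p, q)))  # clamp against rounding error
    return math.acos(cosine)


def sphere_log(p, q):
    """Logarithm map: tangent vector at p pointing toward q, with norm equal
    to the geodesic distance. Not defined when q is antipodal to p."""
    theta = sphere_dist(p, q)
    if theta < 1e-12:
        return [0.0] * len(p)
    c = dot(p, q)
    w = [qi - c * pi for pi, qi in zip(p, q)]  # project q onto tangent space at p
    norm_w = math.sqrt(dot(w, w))
    return [theta * wi / norm_w for wi in w]


def sphere_exp(p, v):
    """Exponential map: follow the geodesic from p in tangent direction v."""
    norm_v = math.sqrt(dot(v, v))
    if norm_v < 1e-12:
        return list(p)
    return [
        math.cos(norm_v) * pi + math.sin(norm_v) * vi / norm_v
        for pi, vi in zip(p, v)
    ]


north_pole = [0.0, 0.0, 1.0]
equator_point = [1.0, 0.0, 0.0]

tangent = sphere_log(north_pole, equator_point)
recovered = sphere_exp(north_pole, tangent)
print(sphere_dist(north_pole, equator_point))  # a quarter of a great circle
print(recovered)
```

Applying the Exponential map to the output of the Logarithm map returns the original point (up to rounding), and the squared geodesic distance can be used as a loss when model outputs are constrained to lie on the sphere.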