Lelarge, Marc
MiniF2F in Rocq: Automatic Translation Between Proof Assistants -- A Case Study
Viennot, Jules, Baudart, Guillaume, Gallego Arias, Emilio Jesús, Lelarge, Marc
In this work, we conduct an experiment using state-of-the-art LLMs to translate MiniF2F into Rocq. The translation task focuses on generating a Rocq theorem from three sources: a natural language description, the Lean formalization, and the Isabelle formalization. We conduct our experiment in three stages of increasing complexity, from basic one-shot prompting to multi-turn conversations that incorporate feedback from unsuccessful attempts. At each stage, we perform multiple rounds of translation using increasingly advanced models: GPT-4o mini, Claude 3.5 Sonnet, o1 mini, and o1. We successfully translated 478 out of 488 theorems. The dataset is open source: https://github.com/LLM4Rocq/miniF2F-rocq.
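A minimal sketch of the multi-turn translation loop described above, assuming hypothetical helpers `query_llm` and `rocq_typechecks` (placeholders, not part of the released code):

```python
def query_llm(history):
    """Placeholder: send the conversation to the chosen model (e.g. o1)."""
    raise NotImplementedError

def rocq_typechecks(code):
    """Placeholder: run the Rocq compiler, return (ok, error_message)."""
    raise NotImplementedError

def translate_theorem(nl, lean, isabelle, max_turns=5):
    """Translate one MiniF2F statement into Rocq with error feedback."""
    prompt = (f"Translate into a Rocq theorem.\n"
              f"Natural language: {nl}\nLean: {lean}\nIsabelle: {isabelle}")
    history = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):
        candidate = query_llm(history)
        ok, err = rocq_typechecks(candidate)
        if ok:
            return candidate                      # accepted translation
        # Feed the failed attempt and the compiler error back to the model.
        history.append({"role": "assistant", "content": candidate})
        history.append({"role": "user",
                        "content": f"Rocq error:\n{err}\nPlease fix."})
    return None                                   # unsolved after max_turns
```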
Random Sparse Lifts: Construction, Analysis and Convergence of Finite Sparse Networks
Robin, David A. R., Scaman, Kevin, Lelarge, Marc
We present a framework to define a large class of neural networks for which, by construction, training by gradient flow provably reaches arbitrarily low loss when the number of parameters grows. Distinct from the fixed-space global optimality of non-convex optimization, this new form of convergence, and the techniques introduced to prove such convergence, pave the way for a usable deep learning convergence theory in the near future, without overparameterization assumptions relating the number of parameters and training samples. We define these architectures from a simple computation graph and a mechanism to lift it, thus increasing the number of parameters, generalizing the idea of increasing the widths of multi-layer perceptrons. We show that architectures similar to most common deep learning models are present in this class, obtained by sparsifying the weight tensors of usual architectures at initialization. Leveraging tools of algebraic topology and random graph theory, we use the computation graph's geometry to propagate properties guaranteeing convergence to any precision for these large sparse models.
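A minimal sketch of the sparsification-at-initialization idea (illustrative only, not the paper's exact construction): a fixed random mask is drawn once, and lifting increases the width while the density stays constant.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_lift(weight_shape, keep_prob, rng):
    """Initialize a weight tensor together with a fixed sparsity mask.

    Only masked-in entries would be trained; growing the width while
    keeping keep_prob fixed is one way to 'lift' the computation graph."""
    w = rng.normal(0.0, 1.0, size=weight_shape)
    mask = rng.random(weight_shape) < keep_prob   # drawn once, then frozen
    return w * mask, mask

# Example: a lifted two-layer perceptron, width grows from 64 to 256.
for width in (64, 256):
    w1, m1 = sparse_lift((width, 10), keep_prob=0.2, rng=rng)
    w2, m2 = sparse_lift((1, width), keep_prob=0.2, rng=rng)
    print(width, "trainable nonzeros:", int(m1.sum() + m2.sum()))
```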
Neural Incremental Data Assimilation
Blanke, Matthieu, Fablet, Ronan, Lelarge, Marc
Data assimilation is a central problem in many geophysical applications, such as weather forecasting. It aims to estimate the state of a potentially large system, such as the atmosphere, from sparse observations, supplemented by prior physical knowledge. The size of the systems involved and the complexity of the underlying physical equations make it a challenging task from a computational point of view. Neural networks represent a promising method of emulating the physics at low cost, and therefore have the potential to considerably improve and accelerate data assimilation. In this work, we introduce a deep learning approach where the physical system is modeled as a sequence of coarse-to-fine Gaussian prior distributions parametrized by a neural network. This allows us to define an assimilation operator, which is trained in an end-to-end fashion to minimize the reconstruction error on a dataset with different observation processes. We illustrate our approach on chaotic dynamical physical systems with sparse observations, and compare it to traditional variational data assimilation methods.
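As a point of reference for the Gaussian-prior assimilation step, here is the standard linear-Gaussian update (a textbook Kalman update, not the paper's neural parametrization): given a prior x ~ N(mu, P) and a sparse observation y = H x + eps with eps ~ N(0, R), the posterior mean is mu + K (y - H mu).

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 20, 5

mu = np.zeros(d)                    # prior mean (the network's output, here fixed)
P = np.eye(d)                       # prior covariance (one coarse-to-fine scale)
H = np.eye(d)[rng.choice(d, m, replace=False)]   # observe m random components
R = 0.01 * np.eye(m)                # observation noise covariance

x_true = rng.normal(size=d)
y = H @ x_true + rng.multivariate_normal(np.zeros(m), R)

K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # Kalman gain
x_post = mu + K @ (y - H @ mu)                   # assimilated state estimate
print("error on observed coords:", np.linalg.norm(H @ (x_post - x_true)))
```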
Automatic Rao-Blackwellization for Sequential Monte Carlo with Belief Propagation
Azizian, Waïss, Baudart, Guillaume, Lelarge, Marc
Exact Bayesian inference on state-space models (SSMs) is in general intractable, and unfortunately, basic Sequential Monte Carlo (SMC) methods do not yield correct approximations for complex models. In this paper, we propose a mixed inference algorithm that computes closed-form solutions using belief propagation as much as possible, and falls back to sampling-based SMC methods when exact computations fail. This algorithm thus implements automatic Rao-Blackwellization and is even exact for Gaussian tree models.
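A minimal sketch of the exact branch on a 1-D linear-Gaussian SSM (illustrative parameters, not the paper's implementation): here the belief stays Gaussian, so every update has a closed form; the mixed algorithm would fall back to SMC sampling for latents where no such conjugate update exists.

```python
import numpy as np

# Model: x' = a x + N(0, q),  y = x + N(0, r); the Gaussian belief (mu, var)
# is maintained exactly, which is the Rao-Blackwellized (closed-form) path.
a, q, r = 0.9, 0.1, 0.2
mu, var = 0.0, 1.0

rng = np.random.default_rng(0)
x = rng.normal()
for _ in range(10):
    x = a * x + rng.normal(scale=np.sqrt(q))     # simulate the true state
    y = x + rng.normal(scale=np.sqrt(r))         # noisy observation
    mu, var = a * mu, a * a * var + q            # exact predict step
    k = var / (var + r)                          # Kalman gain
    mu, var = mu + k * (y - mu), (1 - k) * var   # exact condition step
print(f"exact posterior mean {mu:.3f}, true state {x:.3f}")
```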
Interpretable Meta-Learning of Physical Systems
Blanke, Matthieu, Lelarge, Marc
Machine learning methods can be a valuable aid in the scientific process, but they need to face challenging settings where data come from inhomogeneous experimental conditions. Recent meta-learning methods have made significant progress in multi-task learning, but they rely on black-box neural networks, resulting in high computational costs and limited interpretability. Leveraging the structure of the learning problem, we argue that multi-environment generalization can be achieved using a simpler learning model, with an affine structure with respect to the learning task. Crucially, we prove that this architecture can identify the physical parameters of the system, enabling interpretable learning. We demonstrate the competitive generalization performance and the low computational cost of our method by comparing it to state-of-the-art algorithms on physical systems, ranging from toy models to complex, non-analytical systems. The interpretability of our method is illustrated with original applications to physical-parameter-induced adaptation and to adaptive control.

Learning physical systems is an essential application of artificial intelligence that can unlock significant technological and societal progress. Physical systems are inherently complex, making them difficult to learn (Karniadakis et al., 2021). A particularly challenging and common scenario is multi-environment learning, where observations of a physical system are collected under inhomogeneous experimental conditions (Caruana, 1997). In such cases, the scarcity of training data necessitates the development of robust learning algorithms that can efficiently handle environmental changes and make use of all available data.
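A minimal sketch of a model affine in the task parameters (illustrative, not the paper's exact architecture): shared features phi(x) are combined with per-environment weights, so adaptation reduces to linear least squares and the recovered weights can be read off as physical parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    """Shared feature map (here fixed and known, for illustration)."""
    return np.stack([x, x**2, np.sin(x)], axis=-1)

# Two environments differing only in their (physical) parameter vectors.
true_w = {"env_a": np.array([1.0, -0.5, 2.0]),
          "env_b": np.array([0.3,  0.8, -1.0])}

for env, w in true_w.items():
    x = rng.uniform(-2, 2, size=100)
    y = phi(x) @ w + 0.01 * rng.normal(size=100)   # noisy observations
    w_hat, *_ = np.linalg.lstsq(phi(x), y, rcond=None)  # closed-form adaptation
    print(env, "recovered parameters:", np.round(w_hat, 2))
```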
FLEX: an Adaptive Exploration Algorithm for Nonlinear Systems
Blanke, Matthieu, Lelarge, Marc
Model-based reinforcement learning is a powerful tool, but collecting data to fit an accurate model of the system can be costly. Exploring an unknown environment in a sample-efficient manner is hence of great importance. However, the complexity of dynamics and the computational limitations of real systems make this task challenging. In this work, we introduce FLEX, an exploration algorithm for nonlinear dynamics based on optimal experimental design. Our policy maximizes the information of the next step and results in an adaptive exploration algorithm, compatible with generic parametric learning models and requiring minimal resources. We test our method on a number of nonlinear environments covering different settings, including time-varying dynamics. Keeping in mind that exploration is intended to serve an exploitation objective, we also test our algorithm on downstream model-based classical control tasks and compare it to other state-of-the-art model-based and model-free approaches. The performance achieved by FLEX is competitive and its computational cost is low.
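A minimal sketch of one-step-greedy experimental design for a model linear in its parameters (illustrative; FLEX itself targets nonlinear dynamics): pick the input whose feature vector maximizes the D-optimal gain log det(M + phi phi^T) - log det(M) = log(1 + phi^T M^{-1} phi).

```python
import numpy as np

rng = np.random.default_rng(0)

def features(x, u):
    """Features for a next-state model x' ~ theta^T features(x, u)."""
    return np.array([x, u, x * u, u**2])

M = np.eye(4) * 1e-3                      # accumulated information matrix
x = 0.5                                   # current state
candidates = np.linspace(-1.0, 1.0, 41)   # discretized admissible inputs

Minv = np.linalg.inv(M)
gains = [np.log1p(features(x, u) @ Minv @ features(x, u)) for u in candidates]
u_star = candidates[int(np.argmax(gains))]            # greedy input choice
M += np.outer(features(x, u_star), features(x, u_star))  # information update
print("greedy input:", u_star)
```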
Convergence beyond the over-parameterized regime using Rayleigh quotients
Robin, David A. R., Scaman, Kevin, Lelarge, Marc
In this paper, we present a new strategy to prove the convergence of deep learning architectures to a zero training (or even testing) loss by gradient flow. Our analysis is centered on the notion of Rayleigh quotients in order to prove Kurdyka-Łojasiewicz inequalities for a broader set of neural network architectures and loss functions. We show that Rayleigh quotients provide a unified view for several convergence analysis techniques in the literature. Our strategy produces a proof of convergence for various examples of parametric learning. In particular, our analysis does not require the number of parameters to tend to infinity, nor the number of samples to be finite, thus extending to test loss minimization and beyond the over-parameterized regime.
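To illustrate why such inequalities yield convergence, here is the standard one-line argument (a textbook derivation, not the paper's full analysis): if the Rayleigh-quotient-like quantity $\|\nabla L(\theta)\|^2 / L(\theta)$ is bounded below by some $\lambda > 0$ along the trajectory, then under gradient flow

```latex
\[
\dot\theta_t = -\nabla L(\theta_t)
\quad\Longrightarrow\quad
\frac{d}{dt} L(\theta_t) = -\|\nabla L(\theta_t)\|^2 \le -\lambda\, L(\theta_t),
\]
```

so by Grönwall's lemma $L(\theta_t) \le L(\theta_0)\, e^{-\lambda t} \to 0$.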
Online greedy identification of linear dynamical systems
Blanke, Matthieu, Lelarge, Marc
This work addresses the problem of exploration in an unknown environment. For linear dynamical systems, we use an experimental design framework and introduce an online greedy policy where the control maximizes the information of the next step. In a setting with a limited number of experimental trials, our algorithm has low complexity and experimentally shows competitive performance compared to more elaborate gradient-based methods.
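One natural formulation of such a greedy policy (notation assumed for illustration, not verbatim from the paper): for $x_{t+1} = A x_t + B u_t + w_t$ with regressor $z_t = (x_t, u_t)$ and information matrix $M_t = \sum_{s<t} z_s z_s^\top$, the matrix determinant lemma reduces the one-step D-optimal choice to a scalar criterion:

```latex
\[
u_t \;\in\; \arg\max_{\|u\| \le \gamma}
\; \log\det\!\big(M_t + z_t z_t^\top\big)
\;=\; \arg\max_{\|u\| \le \gamma} \; \log\!\big(1 + z_t^\top M_t^{-1} z_t\big),
\qquad z_t = (x_t, u_t).
\]
```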
Correlation detection in trees for partial graph alignment
Ganassali, Luca, Massoulié, Laurent, Lelarge, Marc
We consider alignment of sparse graphs, which consists in finding a mapping between the nodes of two graphs that preserves most of the edges. Our approach is to compare local structures in the two graphs, matching two nodes if their neighborhoods are 'close enough': for correlated Erdős-Rényi random graphs, this problem can be rephrased locally in terms of testing whether a pair of branching trees is drawn from either a product distribution or a correlated distribution. We design an optimal test for this problem, which gives rise to a message-passing algorithm for graph alignment that provably returns, in polynomial time, a positive fraction of correctly matched vertices and a vanishing fraction of mismatches. With an average degree $\lambda = O(1)$ in the graphs and a correlation parameter $s \in [0,1]$, this result holds for $\lambda s$ large enough and $1-s$ small enough, completing the recent state-of-the-art diagram. Tighter conditions for determining whether partial graph alignment (or correlation detection in trees) is feasible in polynomial time are given in terms of Kullback-Leibler divergences.
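The local testing problem can be stated as follows (notation assumed for illustration): given a pair of rooted trees $(T, T')$ observed up to some depth, decide between

```latex
\[
\mathcal{H}_0 : (T, T') \sim \mathbb{P} \otimes \mathbb{P}
\qquad \text{vs.} \qquad
\mathcal{H}_1 : (T, T') \sim \mathbb{P}_{\mathrm{corr}},
\]
```

where both marginals are Galton-Watson trees with Poisson offspring and $\mathbb{P}_{\mathrm{corr}}$ denotes the correlated pair distribution; a one-sided test with vanishing type-I error and non-vanishing power is what yields a positive fraction of correct matches with a vanishing fraction of mismatches.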
Impossibility of Partial Recovery in the Graph Alignment Problem
Ganassali, Luca, Massoulié, Laurent, Lelarge, Marc
Random graph alignment refers to recovering the underlying vertex correspondence between two random graphs with correlated edges. This can be viewed as an average-case and noisy version of the well-known NP-hard graph isomorphism problem. For the correlated Erdős-Rényi model, we prove an impossibility result for partial recovery in the sparse regime, with constant average degree and correlation, as well as a general bound on the maximal reachable overlap. Our bound is tight in the noiseless case (the graph isomorphism problem) and we conjecture that it remains tight with noise. Our proof technique relies on a careful application of the probabilistic method to build automorphisms between tree components of a subcritical Erdős-Rényi graph.
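For concreteness, the overlap in question is the standard fraction of correctly aligned vertices (notation assumed for illustration): for a true correspondence $\pi^*$ and an estimator $\hat\pi$ on $n$ vertices,

```latex
\[
\mathrm{ov}(\hat\pi, \pi^*) \;=\; \frac{1}{n} \sum_{i=1}^{n}
\mathbf{1}\{\hat\pi(i) = \pi^*(i)\},
\]
```

and partial recovery asks for an overlap bounded away from zero as $n \to \infty$.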