Collaborating Authors

 Donier, Jonathan


Enabling Uncertainty Estimation in Iterative Neural Networks

arXiv.org Artificial Intelligence

Turning pass-through network architectures into iterative ones, which use their own output as input, is a well-known approach for boosting performance. In this paper, we argue that such architectures offer an additional benefit: The convergence rate of their successive outputs is highly correlated with the accuracy of the value to which they converge. Thus, we can use the convergence rate as a useful proxy for uncertainty. This results in an approach to uncertainty estimation that provides state-of-the-art estimates at a much lower computational cost than techniques like Ensembles, and without requiring any modifications to the original iterative model. We demonstrate its practical value by embedding it in two application domains: road detection in aerial images and the estimation of aerodynamic properties of 2D and 3D shapes.
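
As a rough sketch of the convergence-rate idea (not the authors' implementation), the uncertainty score can be taken as the size of the change between successive outputs of the iterative model; the callable `iterative_model`, its refinement interface, and `num_iters` below are all hypothetical.

```python
import numpy as np

def convergence_uncertainty(iterative_model, x, num_iters=8):
    """Use the convergence rate of an iterative model's successive
    outputs as a proxy for the uncertainty of its final prediction.

    `iterative_model(x, y_prev)` is a hypothetical callable that refines
    its previous output; `y_prev=None` requests the initial prediction.
    """
    y = iterative_model(x, None)           # initial pass-through prediction
    deltas = []
    for _ in range(num_iters):
        y_next = iterative_model(x, y)     # feed the output back as input
        deltas.append(np.linalg.norm(y_next - y))
        y = y_next
    # A step size that shrinks slowly (poor convergence) signals low
    # confidence in the converged value.
    return y, float(deltas[-1])
```

Since the score is computed from outputs the iterative model produces anyway, no retraining or architectural change is needed, in line with the claim above.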


DEBOSH: Deep Bayesian Shape Optimization

arXiv.org Artificial Intelligence

Shape optimization is at the heart of many industrial applications, such as aerodynamics, heat transfer, and structural analysis. It has recently been shown that Graph Neural Networks (GNNs) can predict the performance of a shape quickly and accurately and be used to optimize more effectively than traditional techniques that rely on response surfaces obtained by Kriging. However, GNNs suffer from the fact that they do not evaluate their own accuracy, which is something Bayesian Optimization methods require. Therefore, estimating confidence in generated predictions is necessary to go beyond straight deterministic optimization, which is less effective. In this paper, we demonstrate that we can use an Ensembles-based technique to overcome this limitation and outperform the state of the art. Our experiments on diverse aerodynamics and structural analysis tasks prove that adding uncertainty to shape optimization significantly improves the quality of the resulting shapes and reduces the time required for the optimization.
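
As a rough illustration of the ensemble idea (a sketch under assumed interfaces, not the DEBOSH code), one can train several surrogate models independently, take the spread of their predictions as the uncertainty, and feed both into a standard acquisition function such as Expected Improvement; the callables in `models` and every name below are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def ensemble_predict(models, shape):
    """Mean and spread of the performance predicted by an ensemble of
    independently trained surrogates (e.g. GNNs); `models` is a list of
    hypothetical callables mapping a shape to a scalar performance."""
    preds = np.array([m(shape) for m in models])
    return preds.mean(), preds.std()

def expected_improvement(mu, sigma, best_so_far, eps=1e-9):
    """Standard Expected Improvement acquisition for a minimization
    problem, using the ensemble spread as the uncertainty term."""
    improvement = best_so_far - mu
    z = improvement / (sigma + eps)
    return improvement * norm.cdf(z) + sigma * norm.pdf(z)
```

In a Bayesian Optimization loop, candidate shapes maximizing such an acquisition value would then be the ones passed to the expensive simulator.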


HybridSDF: Combining Free Form Shapes and Geometric Primitives for effective Shape Manipulation

arXiv.org Artificial Intelligence

CAD modeling typically involves the use of simple geometric primitives, whereas recent advances in deep-learning based 3D surface modeling have opened new shape design avenues. Unfortunately, these advances have not yet been accepted by the CAD community because they cannot be integrated into engineering workflows. To remedy this, we propose a novel approach that effectively combines geometric primitives with free-form surfaces, represented as implicit surfaces, for accurate modeling that preserves interpretability, enforces consistency, and enables easy manipulation.
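
As a toy sketch of the general principle of mixing analytic primitives with free-form implicit surfaces (illustrative only, not the HybridSDF architecture), both kinds of shape can be expressed as signed distance fields and combined with simple Boolean operations; the `free_form` field below is a stand-in for what would, in practice, be the output of a learned network.

```python
import numpy as np

def sphere_sdf(points, center, radius):
    """Analytic signed distance to a sphere: a simple geometric primitive."""
    return np.linalg.norm(points - center, axis=-1) - radius

def union_sdf(*fields):
    """Boolean union of shapes given as signed distance fields
    (pointwise minimum of the distances)."""
    return np.minimum.reduce(fields)

# Query points at which both representations are evaluated.
points = np.random.randn(1024, 3)
primitive = sphere_sdf(points, center=np.zeros(3), radius=0.5)
free_form = np.linalg.norm(points - 0.2, axis=-1) - 0.3  # stand-in for a learned SDF
combined = union_sdf(primitive, free_form)               # negative values lie inside the shape
```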


Scaling up deep neural networks: a capacity allocation perspective

arXiv.org Machine Learning

Capacity analysis has been introduced in [2] as a way to analyze which dependencies a linear model is focussing its modelling capacity on, when trained on a given task. The concept was then extended in [3] to neural networks with nonlinear activations, where capacity propagation through layers was studied. When the layers are residual (or differential), and in one limiting case with extremely irregular activations (which was called the pseudo-random limit), it has been shown that capacity propagation through layers follows a discrete Markov equation. This discrete equation can then be approximated by a continuous Kolmogorov forward equation in the deep limit, provided some specific scaling relation holds between the network depth and the scale of its residual connections - more precisely, the residual weights must scale as the inverse square root of the number of layers.

Following [1], it was then hypothesized that the success of residual networks lies in their ability to propagate capacity through a large number of layers in a non-degenerate manner. It is interesting to note that the inverse square root scaling mentioned above is the only scaling relation that leads to a non-degenerate propagation PDE in that case: larger weights would lead to shattering, while smaller ones would lead to no spatial propagation at all.

In this paper, we take this idea one step further and formulate the conjecture that enforcing the right scaling relations - i.e. the ones that lead to a non-degenerate continuous limit for capacity propagation - is key to avoiding the shattering problem: we call this the neural network scaling conjecture. In the example above, this would mean that the inverse square root scaling must be enforced if one wants to use residual networks at their full power.

In the second part of this paper, we use the PDE capacity propagation framework to study a number of commonly used network architectures, and determine the scaling relations that are required for a non-degenerate capacity propagation to happen in each case.
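
The scaling relation described above can be restated schematically as follows; the notation is reconstructed from the abstract's prose and is not the paper's own.

```latex
% For a residual network with L layers and residual scale \alpha_L
% (illustrative notation),
\[
  x_{l+1} = x_l + \alpha_L \, f_l(x_l), \qquad l = 0, \dots, L-1,
\]
% capacity propagation admits a non-degenerate continuous (Kolmogorov
% forward) limit only under the inverse-square-root scaling
\[
  \alpha_L \propto \frac{1}{\sqrt{L}},
\]
% larger residual weights leading to shattering and smaller ones to no
% spatial propagation at all.
```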


Capacity allocation through neural network layers

arXiv.org Machine Learning

Capacity analysis has been recently introduced as a way to analyze how linear models distribute their modelling capacity across the input space. In this paper, we extend the notion of capacity allocation to the case of neural networks with non-linear layers. We show that under some hypotheses the problem is equivalent to linear capacity allocation, within some extended input space that factors in the non-linearities. We introduce the notion of layer decoupling, which quantifies the degree to which a non-linear activation decouples its outputs, and show that it plays a central role in capacity allocation through layers. In the highly non-linear limit where decoupling is total, we show that the propagation of capacity throughout the layers follows a simple Markovian rule, which turns into a diffusion PDE in the limit of deep networks with residual layers. This allows us to recover some known results about deep neural networks, such as the size of the effective receptive field, or why ResNets avoid the shattering problem.
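
Schematically, and with illustrative symbols rather than the paper's notation, the deep residual limit mentioned above can be written as a diffusion equation for the capacity density.

```latex
% If c(t, x) denotes the capacity density at continuous depth t
% (symbols are assumptions, not the paper's notation), total layer
% decoupling yields a diffusion-type propagation
\[
  \frac{\partial c}{\partial t}
    = \frac{\sigma^{2}}{2} \, \frac{\partial^{2} c}{\partial x^{2}},
\]
% whose solutions spread over a spatial scale of order \sqrt{t}, which is
% consistent with the effective receptive field growing like the square
% root of the network depth.
```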


Capacity allocation analysis of neural networks: A tool for principled architecture design

arXiv.org Machine Learning

Since the popularization of deep neural networks in the early 2010s, tailoring neural network architectures to specific tasks has been one of the main sources of activity for both academics and practitioners. Accordingly, a palette of empirical methods has been developed for automating the choice of neural network hyperparameters (a process sometimes called Neural Architecture Search), including - but not limited to - random search [2, 1], genetic algorithms [16, 13], Bayesian methods [24, 12] or reinforcement learning [29]. However, when the computational requirements for training a single model are high, such approaches might be too expensive or result in iteration cycles that are too long to be practically useful - though some work in that direction has been carried out recently [5, 14]. In other cases, when the loss function is only used as a proxy for the task at hand [25, 26, 10] or is not interpretable [8], a further perceptual evaluation is typically necessary to evaluate the quality of a model's outputs, and such systematic approaches at least partially break down. In both cases, an efficient and quantitative method to analyze and compare neural network architectures would be highly desirable - be it only to come up with a limited set of plausible candidates to pass on to the more expensive (or manual) methods. In this paper, we introduce the notion of capacity allocation analysis, which is a systematic, quantitative and computationally efficient way to analyze neural network architectures by quantifying which dependencies between inputs and outputs a parameter or a set of parameters actually models. We develop a quantitative framework for assessing and comparing different architectures for a given task, providing insights that are complementary to the value of the loss function itself.