 spline approximation


Nonparametric Automatic Differentiation Variational Inference with Spline Approximation

arXiv.org Machine Learning

Variational Inference (VI) is widely used in data representation (Kingma and Welling, 2013; Zhang et al., 2018) and graphical models (Wainwright et al., 2008), among other areas. VI approximates intractable distributions by minimizing the divergence between the true posterior and the members of a chosen distribution family, aiming to identify the optimal distribution within this family. Unlike sampling methods such as Markov chain Monte Carlo (MCMC), VI is recognized for its computational efficiency and explicit distribution form (Blei et al., 2017). Contemporary VI-based methods such as the variational autoencoder (VAE) (Kingma and Welling, 2013) have garnered interest for learning representations of complex, high-dimensional data across fields like bioinformatics (Kopf et al., 2021), geoscience (Chen et al., 2022), and finance (Bergeron et al., 2022). Automatic Differentiation Variational Inference (ADVI) (Kucukelbir et al., 2017) is a popular approach to deriving variational inference algorithms for complex probabilistic models.
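
The divergence minimization described in this abstract is usually written in the following standard form. The symbols are assumptions introduced here for illustration and do not come from the listing: $x$ denotes the observed data, $z$ the latent variables, and $q$ a member of the chosen family $\mathcal{Q}$.

$$
\log p(x) = \underbrace{\mathbb{E}_{q(z)}\big[\log p(x, z)\big] - \mathbb{E}_{q(z)}\big[\log q(z)\big]}_{\mathrm{ELBO}(q)} + \mathrm{KL}\big(q(z)\,\|\,p(z \mid x)\big)
$$

Because $\log p(x)$ does not depend on $q$, maximizing $\mathrm{ELBO}(q)$ over $q \in \mathcal{Q}$ is equivalent to minimizing the KL divergence from $q$ to the true posterior.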


A max-affine spline approximation of neural networks using the Legendre transform of a convex-concave representation

arXiv.org Artificial Intelligence

This work presents a novel algorithm for transforming a neural network into a spline representation. Unlike previous work, which required convex and piecewise-affine network operators to create a max-affine spline alternate form, this work relaxes this constraint. The only constraint is that the function be bounded and possess a well-defined second derivative, although this was shown experimentally not to be strictly necessary. The transformation can also be performed over the whole network rather than on each layer independently. As in previous work, this bridges the gap between neural networks and approximation theory, but it also enables the visualisation of network feature maps. Mathematical proof and experimental investigation of the technique are performed, with approximation error and feature maps being extracted from a range of architectures, including convolutional neural networks.
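
For readers unfamiliar with the term, a max-affine spline is the pointwise maximum of a set of affine functions. The sketch below is not the paper's algorithm; it only illustrates the max-affine form in the simplest convex case, using tangent lines of softplus as the affine pieces. The target function, the anchor spacing, and all helper names are assumptions chosen for illustration.

```python
import math

def tangent_pieces(f, df, anchors):
    """Return (slope, intercept) of the tangent line to f at each anchor."""
    return [(df(a), f(a) - df(a) * a) for a in anchors]

def max_affine(pieces, x):
    """Evaluate the max-affine spline max_i (slope_i * x + intercept_i)."""
    return max(s * x + b for s, b in pieces)

def softplus(x):
    # Illustrative convex target: log(1 + exp(x)).
    return math.log1p(math.exp(x))

def sigmoid(x):
    # Derivative of softplus.
    return 1.0 / (1.0 + math.exp(-x))

# 17 anchors spaced 0.5 apart on [-4, 4].
anchors = [-4.0 + 0.5 * i for i in range(17)]
pieces = tangent_pieces(softplus, sigmoid, anchors)

# Tangents of a convex function lie below it, so the error is non-negative
# (up to floating-point rounding).
grid = [-4.0 + 0.01 * i for i in range(801)]
max_error = max(softplus(x) - max_affine(pieces, x) for x in grid)
print(f"max approximation error on [-4, 4]: {max_error:.5f}")
```

Adding more anchors tightens the approximation. Handling functions that are not convex is exactly the case the abstract says the paper addresses, and this sketch does not attempt it.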


$\mathcal{C}^k$-continuous Spline Approximation with TensorFlow Gradient Descent Optimizers

arXiv.org Artificial Intelligence

In this work we present an "out-of-the-box" application of Machine Learning (ML) optimizers for an industrial optimization problem. We introduce a piecewise polynomial model (spline) for fitting $\mathcal{C}^k$-continuous functions, which can be deployed in a cam approximation setting. We then use the gradient descent optimization context provided by the machine learning framework TensorFlow to optimize the model parameters with respect to approximation quality and $\mathcal{C}^k$-continuity and evaluate available optimizers. Our experiments show that the problem solution is feasible using TensorFlow gradient tapes and that AMSGrad and SGD show the best results among available TensorFlow optimizers. Furthermore, we introduce a novel regularization approach to improve SGD convergence. Although experiments show that remaining discontinuities after optimization are small, we can eliminate these errors using a presented algorithm that affects only the relevant derivatives in the local spline segment.
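
The idea of trading approximation quality against continuity by gradient descent can be sketched in a few lines. This is a minimal illustration under strong simplifying assumptions, not the paper's model: it uses plain Python with hand-derived gradients instead of TensorFlow gradient tapes, only two linear segments, only a $\mathcal{C}^0$ penalty, and an arbitrary target $x^3$. The function name `fit_two_segments` and the parameters `lam`, `lr`, and `steps` are hypothetical.

```python
def fit_two_segments(xs, ys, lam, lr=0.01, steps=20000):
    """Gradient descent on two linear segments joined at x = 1.

    Segment 0: a0 + b0 * x        for x < 1
    Segment 1: a1 + b1 * (x - 1)  for x >= 1
    Loss: mean squared error + lam * (jump at x = 1) ** 2
    """
    a0 = b0 = a1 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g_a0 = g_b0 = g_a1 = g_b1 = 0.0
        # Gradient of the mean squared approximation error.
        for x, y in zip(xs, ys):
            if x < 1.0:
                r = a0 + b0 * x - y
                g_a0 += 2.0 * r / n
                g_b0 += 2.0 * r * x / n
            else:
                t = x - 1.0
                r = a1 + b1 * t - y
                g_a1 += 2.0 * r / n
                g_b1 += 2.0 * r * t / n
        # Gradient of the continuity penalty lam * jump ** 2.
        jump = (a0 + b0) - a1
        g_a0 += 2.0 * lam * jump
        g_b0 += 2.0 * lam * jump
        g_a1 -= 2.0 * lam * jump
        a0 -= lr * g_a0
        b0 -= lr * g_b0
        a1 -= lr * g_a1
        b1 -= lr * g_b1
    return (a0, b0, a1, b1), abs((a0 + b0) - a1)

# 40 sample points on (0, 2), none exactly at the joint x = 1.
xs = [(i + 0.5) * 0.05 for i in range(40)]
ys = [x ** 3 for x in xs]

_, jump_free = fit_two_segments(xs, ys, lam=0.0)
_, jump_penalized = fit_two_segments(xs, ys, lam=10.0)
print(f"jump without penalty: {jump_free:.4f}")
print(f"jump with penalty:    {jump_penalized:.4f}")
```

With a soft penalty the jump at the joint shrinks but does not vanish, which matches the abstract's remark that small discontinuities remain after optimization and need a separate correction step.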


Propagating Uncertainty through the tanh Function with Application to Reservoir Computing

arXiv.org Machine Learning

Many neural networks use the tanh activation function; however, the problem of computing the output distribution of such networks when the input is a probability distribution has not yet been addressed. One important example is the initialization of the echo state network in reservoir computing, where random initialization of the reservoir requires time to wash out the initial conditions, thereby wasting precious data and computational resources. Motivated by this problem, we propose a novel solution that uses a moment-based approach to propagate uncertainty through an Echo State Network and so reduce the washout time. In this work, we contribute two new methods to propagate uncertainty through the tanh activation function and propose the Probabilistic Echo State Network (PESN), a method that is shown to have better average performance than deterministic Echo State Networks given the random initialization of reservoir states. Additionally, we test single and multi-step uncertainty propagation of our method on two regression tasks and show that we are able to recover means and variances similar to those computed by Monte Carlo simulations.
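
To make the comparison against Monte Carlo concrete, the sketch below propagates a scalar Gaussian input through tanh in two ways. The moment approximation used here is a generic textbook Taylor expansion (second order for the mean, first order for the variance); it is not either of the two methods the paper contributes, whose details the abstract does not give. The input parameters and function names are assumptions chosen for illustration.

```python
import math
import random

def taylor_moments(mu, sigma):
    """Approximate mean and variance of tanh(x) for x ~ N(mu, sigma^2)."""
    t = math.tanh(mu)
    d1 = 1.0 - t * t          # first derivative of tanh at mu
    d2 = -2.0 * t * d1        # second derivative of tanh at mu
    mean = t + 0.5 * d2 * sigma ** 2
    var = (d1 * sigma) ** 2
    return mean, var

def monte_carlo_moments(mu, sigma, n=100_000, seed=0):
    """Estimate the same moments by sampling."""
    rng = random.Random(seed)
    samples = [math.tanh(rng.gauss(mu, sigma)) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, var

approx_mean, approx_var = taylor_moments(0.5, 0.2)
mc_mean, mc_var = monte_carlo_moments(0.5, 0.2)
print(f"Taylor:      mean={approx_mean:.4f}  var={approx_var:.5f}")
print(f"Monte Carlo: mean={mc_mean:.4f}  var={mc_var:.5f}")
```

The Taylor approximation is only reliable when the input variance is small relative to the curvature of tanh; for wide input distributions the two estimates drift apart.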


Splines, Rational Functions and Neural Networks

Neural Information Processing Systems

Connections between spline approximation, approximation with rational functions, and feedforward neural networks are studied. The potential improvement in the degree of approximation in going from single to two hidden layer networks is examined. Some results of Birman and Solomjak regarding the degree of approximation achievable when knot positions are chosen on the basis of the probability distribution of examples rather than the function values are extended.

