
Collaborating Authors

 Schober, Michael


Fast and Robust Shortest Paths on Manifolds Learned from Data

arXiv.org Machine Learning

A longstanding goal in machine learning is to build models that are invariant to irrelevant transformations of the data, as this can remove factors that are otherwise arbitrarily determined. For instance, in nonlinear latent variable models, the latent variables are generally unidentifiable as the latent space is by design not invariant to reparametrizations. Enforcing a Riemannian metric in the latent space that is invariant to reparametrizations alleviates this identifiability issue, which significantly boosts model performance and interpretability [Arvanitidis et al., 2018, Tosi et al., 2014]. Irrelevant transformations of the data can alternatively be factored out by only modeling local behavior of the data; geometrically this can be viewed as having a locally adaptive inner product.
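
In models such as those of Tosi et al. [2014] and Arvanitidis et al. [2018], this locally adaptive inner product is typically realized as the pull-back metric of the generative mapping. Below is a minimal numerical sketch of that construction, assuming a toy decoder as a stand-in for a learned generator and a finite-difference Jacobian purely for illustration.

import numpy as np

def decoder(z):
    # Hypothetical smooth generator g: R^2 (latent) -> R^3 (data).
    # In practice this would be a learned network or GP mean.
    return np.array([z[0], z[1], np.sin(z[0]) * np.cos(z[1])])

def jacobian(g, z, eps=1e-6):
    # Finite-difference Jacobian of the generator at z (shape d x q).
    g0 = g(z)
    J = np.zeros((g0.size, z.size))
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        J[:, i] = (g(z + dz) - g0) / eps
    return J

def pullback_metric(g, z):
    # Riemannian metric G(z) = J(z)^T J(z) induced in the latent space.
    J = jacobian(g, z)
    return J.T @ J

def curve_length(g, curve):
    # Approximate length of a discretized latent curve under the pull-back
    # metric, i.e. the sum of sqrt(dz^T G(z) dz) over segments.
    length = 0.0
    for a, b in zip(curve[:-1], curve[1:]):
        mid = 0.5 * (a + b)
        dz = b - a
        length += np.sqrt(dz @ pullback_metric(g, mid) @ dz)
    return length

# Example: length of the straight latent line from (0, 0) to (1, 1).
ts = np.linspace(0.0, 1.0, 50)[:, None]
line = ts * np.array([1.0, 1.0])
print(curve_length(decoder, line))

Shortest paths on the learned manifold are then latent curves minimizing this length, which is what a fast and robust shortest-path solver must compute.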


A probabilistic model for the numerical solution of initial value problems

arXiv.org Machine Learning

Like many numerical methods, solvers for initial value problems (IVPs) on ordinary differential equations estimate an analytically intractable quantity, using the results of tractable computations as inputs. This structure is closely connected to the notion of inference on latent variables in statistics. We describe a class of algorithms that formulate the solution to an IVP as inference on a latent path that is a draw from a Gaussian process probability measure (or equivalently, the solution of a linear stochastic differential equation). We then show that certain members of this class are connected precisely to generalized linear methods for ODEs, a number of Runge-Kutta methods, and Nordsieck methods. This probabilistic formulation of classic methods is valuable in two ways: analytically, it highlights implicit prior assumptions favoring certain approximate solutions to the IVP over others, and gives a precise meaning to the old observation that these methods act like filters. Practically, it endows the classic solvers with 'docking points' for notions of uncertainty and prior information about the initial value, the value of the ODE itself, and the solution of the problem.
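
A minimal sketch of this filtering view: the solution is given an integrated-Wiener-process (linear SDE) prior, and evaluations of the vector field at the predicted mean are treated as derivative observations in a Kalman filter. This is a generic illustration of the ODE-filter idea, not the exact parametrization of the paper; the test problem and noise scales are invented for the example.

import numpy as np

def ode_filter(f, x0, t0, t1, h, sigma2=1.0):
    # State z = [x, x']; prior: once-integrated Wiener process (x'' is white noise).
    A = np.array([[1.0, h], [0.0, 1.0]])            # transition over one step
    Q = sigma2 * np.array([[h**3 / 3, h**2 / 2],
                           [h**2 / 2, h]])          # process noise covariance
    H = np.array([[0.0, 1.0]])                      # we "observe" the derivative
    R = 1e-12                                       # tiny observation noise

    m = np.array([x0, f(t0, x0)])                   # initial mean
    P = np.zeros((2, 2))                            # initial covariance
    ts, means, vars_ = [t0], [m[0]], [P[0, 0]]

    t = t0
    while t < t1 - 1e-12:
        # Predict under the linear SDE prior.
        m_pred = A @ m
        P_pred = A @ P @ A.T + Q
        # Evaluate the vector field at the predicted mean, use it as an observation.
        y = f(t + h, m_pred[0])
        S = (H @ P_pred @ H.T)[0, 0] + R
        K = (P_pred @ H.T)[:, 0] / S
        m = m_pred + K * (y - m_pred[1])
        P = P_pred - np.outer(K, K) * S
        t += h
        ts.append(t); means.append(m[0]); vars_.append(P[0, 0])
    return np.array(ts), np.array(means), np.array(vars_)

# Example: x' = -x, x(0) = 1. The filter mean tracks exp(-t), and the
# posterior variance supplies a (prior-dependent) error estimate.
ts, means, vars_ = ode_filter(lambda t, x: -x, x0=1.0, t0=0.0, t1=2.0, h=0.1)
print(means[-1], np.exp(-2.0), vars_[-1])

The 'docking points' mentioned above correspond to the quantities this sketch fixes arbitrarily: the initial covariance, the observation noise R, and the process scale sigma2 can all carry genuine prior information and uncertainty.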


Probabilistic ODE Solvers with Runge-Kutta Means

Neural Information Processing Systems

Runge-Kutta methods are the classic family of solvers for ordinary differential equations (ODEs), and the basis for the state of the art. Like most numerical methods, they return point estimates. We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution. In contrast to prior work, we construct this family such that posterior means match the outputs of the Runge-Kutta family exactly, thus inheriting their proven good properties. Remaining degrees of freedom not identified by the match to Runge-Kutta are chosen such that the posterior probability measure fits the observed structure of the ODE. Our results shed light on the structure of Runge-Kutta solvers from a new direction, provide a richer, probabilistic output, have low computational cost, and raise new research questions.
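
To make the "posterior means match Runge-Kutta" statement concrete, the following equations sketch the matching condition for a single explicit step; this is an illustrative summary of the idea stated in the abstract, not a full derivation.

An $s$-stage explicit Runge-Kutta step with weights $b_i$ and stage evaluations $k_i = f(t_0 + c_i h, \hat{x}_i)$ returns
\[
  \hat{x}(t_0 + h) = x_0 + h \sum_{i=1}^{s} b_i k_i .
\]
A Gauss-Markov (Gaussian process) prior on $x$, conditioned on $x(t_0) = x_0$ and on the derivative observations $\dot{x}(t_0 + c_i h) = k_i$, has a posterior mean that is an affine combination of the same quantities,
\[
  \mathbb{E}\bigl[x(t_0 + h) \mid x_0, k_1, \dots, k_s\bigr] = x_0 + \sum_{i=1}^{s} w_i(h)\, k_i ,
\]
so a prior whose weights satisfy $w_i(h) = h\, b_i$ reproduces the Runge-Kutta estimate exactly as its posterior mean, while the posterior covariance provides the additional probabilistic output.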

