Nüsken, Nikolas
Conditioning Diffusions Using Malliavin Calculus
Pidstrigach, Jakiw, Baker, Elizabeth, Domingo-Enrich, Carles, Deligiannidis, George, Nüsken, Nikolas
In stochastic optimal control and conditional generative modelling, a central computational task is to modify a reference diffusion process to maximise a given terminal-time reward. Most existing methods require this reward to be differentiable, using gradients to steer the diffusion towards favourable outcomes. However, in many practical settings, like diffusion bridges, the reward is singular, taking an infinite value if the target is hit and zero otherwise. We introduce a novel framework, based on Malliavin calculus and path-space integration by parts, that enables the development of methods robust to such singular rewards. This allows our approach to handle a broad range of applications, including classification, diffusion bridges, and conditioning without the need for artificial observational noise. We demonstrate that our approach offers stable and reliable training, outperforming existing techniques.
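To make the singular-reward setting concrete, the sketch below simulates the textbook example of a conditioned diffusion: a Brownian motion forced to hit a fixed target at the terminal time via the closed-form Doob h-transform drift (target - x)/(T - t). This is only a minimal illustration of the kind of bridge the paper aims to handle, not the Malliavin-calculus method itself; all parameters are illustrative.

# Minimal sketch (not the paper's algorithm): Euler-Maruyama simulation of a Brownian
# bridge, i.e. a Brownian motion conditioned on the singular event {X_T = target},
# using the exact Doob h-transform drift.
import numpy as np

def simulate_brownian_bridge(x0=0.0, target=1.0, T=1.0, n_steps=200, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x, path = x0, [x0]
    for k in range(n_steps - 1):            # stop one step early to avoid dividing by zero
        t = k * dt
        drift = (target - x) / (T - t)      # exact conditioning drift for Brownian motion
        x = x + drift * dt + np.sqrt(dt) * rng.standard_normal()
        path.append(x)
    path.append(target)                     # the bridge hits the target exactly at time T
    return np.array(path)

print(simulate_brownian_bridge()[-4:])      # final values converge to the target 1.0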
Transport meets Variational Inference: Controlled Monte Carlo Diffusions
Vargas, Francisco, Padhy, Shreyas, Blessing, Denis, Nüsken, Nikolas
Connecting optimal transport and variational inference, we present a principled and systematic framework for sampling and generative modelling centred around divergences on path space. Our work culminates in the development of the \emph{Controlled Monte Carlo Diffusion} sampler (CMCD) for Bayesian computation, a score-based annealing technique that crucially adapts both forward and backward dynamics in a diffusion model. Along the way, we clarify the relationship between the EM-algorithm and iterative proportional fitting (IPF) for Schr{\"o}dinger bridges, deriving as well a regularised objective that bypasses the iterative bottleneck of standard IPF-updates. Finally, we show that CMCD has a strong foundation in the Jarzynski and Crooks identities from statistical physics, and that it convincingly outperforms competing approaches across a wide array of experiments.
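For orientation, the sketch below implements plain annealed Langevin dynamics with Jarzynski/AIS-style importance weights, interpolating from a standard normal to a shifted Gaussian target; the densities, schedule and step size are illustrative assumptions. CMCD itself additionally learns controlled forward and backward drifts, which this uncontrolled baseline omits.

# Sketch (assumptions, not the CMCD algorithm): annealed Langevin dynamics with
# Jarzynski/AIS importance weights, interpolating from N(0,1) to an unnormalised target.
import numpy as np

def log_prior(x):   return -0.5 * x**2               # N(0, 1), up to an additive constant
def log_target(x):  return -0.5 * (x - 2.0)**2       # N(2, 1), up to an additive constant

rng = np.random.default_rng(0)
n_particles, n_steps, step = 2000, 200, 0.05
betas = np.linspace(0.0, 1.0, n_steps + 1)            # annealing schedule
x = rng.standard_normal(n_particles)
log_w = np.zeros(n_particles)
for k in range(n_steps):
    # Jarzynski/AIS weight increment: work done by switching the interpolated log-density
    log_w += (betas[k + 1] - betas[k]) * (log_target(x) - log_prior(x))
    # unadjusted Langevin step targeting the new interpolated distribution
    grad = -(1 - betas[k + 1]) * x - betas[k + 1] * (x - 2.0)
    x = x + step * grad + np.sqrt(2 * step) * rng.standard_normal(n_particles)

w = np.exp(log_w - log_w.max())
print("weighted mean:", np.sum(w * x) / np.sum(w))     # should be close to the target mean 2.0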
From continuous-time formulations to discretization schemes: tensor trains and robust regression for BSDEs and parabolic PDEs
Richter, Lorenz, Sallandt, Leon, Nüsken, Nikolas
The numerical approximation of partial differential equations (PDEs) poses formidable challenges in high dimensions since classical grid-based methods suffer from the so-called curse of dimensionality. Recent attempts rely on a combination of Monte Carlo methods and variational formulations, using neural networks for function approximation. Extending previous work (Richter et al., 2021), we argue that tensor trains provide an appealing framework for parabolic PDEs: The combination of reformulations in terms of backward stochastic differential equations and regression-type methods holds the promise of leveraging latent low-rank structures, enabling both compression and efficient computation. Emphasizing a continuous-time viewpoint, we develop iterative schemes, which differ in terms of computational efficiency and robustness. We demonstrate both theoretically and numerically that our methods can achieve a favorable trade-off between accuracy and computational efficiency. While previous methods have been either accurate or fast, we have identified a novel numerical strategy that can often combine both of these aspects.
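To make the low-rank idea tangible, here is a bare-bones TT-SVD in numpy that compresses a small third-order tensor into tensor-train cores and reconstructs it exactly. The tensor below is an artificial rank-2 example; the coupling with backward SDEs and the paper's iterative regression schemes are not reproduced.

# Sketch: sequential-SVD construction of a tensor train (TT-SVD) and its reconstruction.
import numpy as np

def tt_svd(tensor, max_rank):
    """Return TT cores of shape (r_prev, n_k, r_next) obtained by truncated SVDs."""
    dims, cores, r_prev = tensor.shape, [], 1
    mat = tensor
    for k in range(len(dims) - 1):
        mat = mat.reshape(r_prev * dims[k], -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(r_prev, dims[k], r))
        mat = np.diag(s[:r]) @ vt[:r]
        r_prev = r
    cores.append(mat.reshape(r_prev, dims[-1], 1))
    return cores

def tt_to_full(cores):
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.squeeze(axis=(0, -1))

rng = np.random.default_rng(0)
u1, u2, u3 = rng.standard_normal((3, 2, 10))                 # factors of a 10 x 10 x 10 rank-2 tensor
full = np.einsum('ri,rj,rk->ijk', u1, u2, u3)
cores = tt_svd(full, max_rank=2)
n_params = sum(c.size for c in cores)
print(np.allclose(tt_to_full(cores), full), n_params, full.size)   # True, 80 vs 1000 entries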
Bayesian Learning via Neural Schr\"odinger-F\"ollmer Flows
Vargas, Francisco, Ovsianas, Andrius, Fernandes, David, Girolami, Mark, Lawrence, Neil D., Nüsken, Nikolas
In this work we explore a new framework for approximate Bayesian inference in large datasets based on stochastic control. We advocate stochastic control as a finite-time, low-variance alternative to popular steady-state methods such as stochastic gradient Langevin dynamics (SGLD). Furthermore, we discuss and adapt the existing theoretical guarantees of this framework and establish connections to existing VI routines in SDE-based models.
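For reference, the sketch below is a minimal implementation of the steady-state baseline mentioned above, stochastic gradient Langevin dynamics, on a toy conjugate Gaussian model; the model, step size and iteration count are illustrative assumptions, and the paper's Schrödinger-Föllmer sampler is not shown.

# Sketch of the SGLD baseline (not the Schrödinger-Föllmer sampler): minibatch
# Langevin updates on the posterior mean of a 1D Gaussian model.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=1000)   # observations with known unit variance
prior_var, n, batch = 10.0, len(data), 50

def minibatch_grad_log_post(theta):
    idx = rng.choice(n, size=batch, replace=False)
    grad_lik = (n / batch) * np.sum(data[idx] - theta)   # rescaled minibatch likelihood gradient
    grad_prior = -theta / prior_var
    return grad_lik + grad_prior

theta, step, samples = 0.0, 1e-4, []
for _ in range(5000):
    theta += 0.5 * step * minibatch_grad_log_post(theta) + np.sqrt(step) * rng.standard_normal()
    samples.append(theta)

print("posterior mean estimate:", np.mean(samples[1000:]))   # close to the sample mean ~2.0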
Interpolating between BSDEs and PINNs -- deep learning for elliptic and parabolic boundary value problems
Nüsken, Nikolas, Richter, Lorenz
Solving high-dimensional partial differential equations is a recurrent challenge in economics, science and engineering. In recent years, a great number of computational approaches have been developed, most of them relying on a combination of Monte Carlo sampling and deep learning based approximation. For elliptic and parabolic problems, existing methods can broadly be classified into those resting on reformulations in terms of $\textit{backward stochastic differential equations}$ (BSDEs) and those aiming to minimize a regression-type $L^2$-error ($\textit{physics-informed neural networks}$, PINNs). In this paper, we review the literature and suggest a methodology based on the novel $\textit{diffusion loss}$ that interpolates between BSDEs and PINNs. Our contribution opens the door towards a unified understanding of numerical approaches for high-dimensional PDEs, as well as for implementations that combine the strengths of BSDEs and PINNs. We also provide generalizations to eigenvalue problems and perform extensive numerical studies, including calculations of the ground state for nonlinear Schr\"odinger operators and committor functions relevant in molecular dynamics.
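To illustrate the PINN side of this comparison, the snippet below evaluates a PINN-style loss, the mean squared PDE residual at random collocation points, for the heat-type equation u_t + (1/2) u_xx = 0; the candidate solution, collocation domain and finite-difference derivatives are illustrative choices, and the paper's diffusion loss is not reproduced.

# Sketch: a PINN-style residual loss. The exact solution u(t, x) = x^2 + (T - t)
# of u_t + 0.5 * u_xx = 0 should yield a (numerically) vanishing loss.
import numpy as np

T = 1.0
u = lambda t, x: x**2 + (T - t)                       # candidate solution

def pinn_loss(u, n_points=1000, eps=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, T, n_points)
    x = rng.uniform(-2.0, 2.0, n_points)
    u_t  = (u(t + eps, x) - u(t - eps, x)) / (2 * eps)           # finite-difference time derivative
    u_xx = (u(t, x + eps) - 2 * u(t, x) + u(t, x - eps)) / eps**2
    residual = u_t + 0.5 * u_xx
    return np.mean(residual**2)

print(pinn_loss(u))    # close to zero; a wrong candidate would give a large value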
Stein Variational Gradient Descent: many-particle and long-time asymptotics
Nüsken, Nikolas, Renger, D. R. Michiel
Stein variational gradient descent (SVGD) refers to a class of methods for Bayesian inference based on interacting particle systems. In this paper, we consider the originally proposed deterministic dynamics as well as a stochastic variant, each of which represents one of the two main paradigms in Bayesian computational statistics: variational inference and Markov chain Monte Carlo. As it turns out, these are tightly linked through a correspondence between gradient flow structures and large-deviation principles rooted in statistical physics. To expose this relationship, we develop the cotangent space construction for the Stein geometry, prove its basic properties, and determine the large-deviation functional governing the many-particle limit for the empirical measure. Moreover, we identify the Stein-Fisher information (or kernelised Stein discrepancy) as its leading order contribution in the long-time and many-particle regime in the sense of $\Gamma$-convergence, shedding some light on the finite-particle properties of SVGD. Finally, we establish a comparison principle between the Stein-Fisher information and RKHS-norms that might be of independent interest.
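For concreteness, the sketch below runs the standard SVGD particle update with an RBF kernel on a toy two-dimensional Gaussian target; the kernel bandwidth, step size and target are illustrative choices. The paper studies the many-particle and long-time behaviour of exactly this kind of update rather than its implementation.

# Sketch: the textbook SVGD update (RBF kernel) on a 2D Gaussian target.
import numpy as np

def rbf_kernel(x, h):
    diff = x[:, None, :] - x[None, :, :]                  # (n, n, d)
    sq = np.sum(diff**2, axis=-1)
    k = np.exp(-sq / (2 * h**2))
    grad_k = -diff / h**2 * k[..., None]                  # gradient w.r.t. the first argument
    return k, grad_k

def svgd_step(x, grad_log_p, step=0.1, h=0.5):
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, h)
    phi = (k @ grad_log_p(x) + grad_k.sum(axis=0)) / n    # kernelised Stein update direction
    return x + step * phi

mu = np.array([1.0, -2.0])                                # target N(mu, I), so grad log p(x) = mu - x
grad_log_p = lambda x: mu - x

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 2))
for _ in range(500):
    x = svgd_step(x, grad_log_p)
print("particle mean:", x.mean(axis=0))                   # close to mu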
Solving high-dimensional parabolic PDEs using the tensor train format
Richter, Lorenz, Sallandt, Leon, Nüsken, Nikolas
High-dimensional partial differential equations (PDEs) are ubiquitous in economics, science and engineering. However, their numerical treatment poses formidable challenges since traditional grid-based methods tend to be frustrated by the curse of dimensionality. In this paper, we argue that tensor trains provide an appealing approximation framework for parabolic PDEs: the combination of reformulations in terms of backward stochastic differential equations and regression-type methods in the tensor format holds the promise of leveraging latent low-rank structures, enabling both compression and efficient computation.
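The regression-type building block referred to here can be illustrated with a single least-squares Monte Carlo step: a conditional expectation E[g(X_T) | X_t] is approximated by regressing simulated payoffs onto basis functions of X_t. The polynomial basis below is only a stand-in for the tensor-train ansatz; the dynamics and payoff are illustrative choices.

# Sketch: least-squares regression of simulated payoffs onto basis functions of the
# current state, approximating E[g(X_T) | X_t] for Brownian dynamics and g(y) = y^2.
import numpy as np

rng = np.random.default_rng(0)
n_paths, t, T = 50_000, 0.5, 1.0
x_t = rng.standard_normal(n_paths)                          # states at time t
x_T = x_t + np.sqrt(T - t) * rng.standard_normal(n_paths)   # Brownian transition to time T
payoff = x_T**2                                             # g(X_T)

features = np.vander(x_t, N=4, increasing=True)             # basis 1, x, x^2, x^3
coeffs, *_ = np.linalg.lstsq(features, payoff, rcond=None)

x_query = np.array([0.0, 1.0, 2.0])
pred = np.vander(x_query, N=4, increasing=True) @ coeffs
print(pred)                                                 # exact answer is x^2 + (T - t)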
VarGrad: A Low-Variance Gradient Estimator for Variational Inference
Richter, Lorenz, Boustati, Ayman, Nüsken, Nikolas, Ruiz, Francisco J. R., Akyildiz, Ömer Deniz
We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the $\textit{log-variance loss}$. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call $\textit{VarGrad}$ due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones. We empirically demonstrate that VarGrad offers a favourable variance versus computation trade-off compared to other state-of-the-art estimators on a discrete VAE.
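The estimator can be written down in a few lines: draw samples from the variational approximation, form the log-ratio between it and the joint density, centre it, and pair it with the score of the variational family. The sketch below does this for a toy conjugate Gaussian model with a single mean parameter (the model and family are illustrative assumptions) and compares the result with the exact gradient of the KL divergence.

# Sketch: a VarGrad-style estimate, i.e. the score-function gradient of the negative ELBO
# with a leave-one-out control variate, on a toy conjugate Gaussian model.
import numpy as np

x_obs = 2.0                                                        # prior N(0,1), likelihood N(z,1)
log_joint = lambda z: -0.5 * z**2 - 0.5 * (x_obs - z)**2           # log p(z) + log p(x|z), up to constants
log_q     = lambda z, m: -0.5 * (z - m)**2                         # q_m = N(m, 1), up to constants
score_q   = lambda z, m: z - m                                     # d/dm log q_m(z)

def vargrad_estimate(m, n_samples=1000, seed=0):
    rng = np.random.default_rng(seed)
    z = m + rng.standard_normal(n_samples)
    f = log_q(z, m) - log_joint(z)                                 # log-ratio log(q / p)
    return np.sum((f - f.mean()) * score_q(z, m)) / (n_samples - 1)

m = 0.0
print("VarGrad estimate:", vargrad_estimate(m))
print("exact gradient  :", 2 * (m - 1.0))      # posterior is N(1, 0.5), so d/dm KL = (m - 1)/0.5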
Solving high-dimensional Hamilton-Jacobi-Bellman PDEs using neural networks: perspectives from the theory of controlled diffusions and measures on path space
Nüsken, Nikolas, Richter, Lorenz
Hamilton-Jacobi-Bellman partial differential equations (HJB-PDEs) are of central importance in applied mathematics. Rooted in reformulations of classical mechanics [45] in the nineteenth century, they nowadays form the backbone of (stochastic) optimal control theory [81, 115], having a profound impact on neighbouring fields such as optimal transportation [109, 110], mean field games [20], backward stochastic differential equations (BSDEs) [19] and large deviations [39]. Applications in science and engineering abound; examples include stochastic filtering and data assimilation [79, 95], the simulation of rare events in molecular dynamics [51, 54, 119], and nonconvex optimisation [24]. Many of these applications involve HJB-PDEs in high-dimensional or even infinite-dimensional state spaces, posing a formidable challenge for their numerical treatment and in particular rendering grid-based schemes infeasible. In recent years, approaches to approximating the solutions of high-dimensional elliptic and parabolic PDEs have been developed combining well-known Feynman-Kac formulae with machine learning methodologies, seeking scalability and robustness in high-dimensional and complex scenarios [50, 111]. Crucially, the use of artificial neural networks offers the promise of accurate and efficient function approximation which in conjunction with Monte Carlo methods can beat the curse of dimensionality, as investigated in [5, 25, 49, 60].
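A concrete instance of the Feynman-Kac route mentioned at the end: for Brownian dynamics with quadratic control costs and terminal cost g, the Cole-Hopf (log) transform gives V(0, x) = -log E[exp(-g(x + W_T))], so the value function of the HJB equation can be estimated by plain Monte Carlo. The quadratic g below admits a closed form to check against; this is an elementary illustration rather than any of the cited algorithms.

# Sketch: Monte Carlo evaluation of V(0, x) = -log E[exp(-g(x + W_T))], the log-transformed
# Feynman-Kac representation of an HJB value function with quadratic running cost.
import numpy as np

def value_mc(x, g, T=1.0, n_samples=500_000, seed=0):
    rng = np.random.default_rng(seed)
    w_T = np.sqrt(T) * rng.standard_normal(n_samples)
    return -np.log(np.mean(np.exp(-g(x + w_T))))

g = lambda y: 0.5 * y**2
x, T = 1.0, 1.0
estimate = value_mc(x, g, T)
exact = 0.5 * np.log(1 + T) + x**2 / (2 * (1 + T))   # closed form for this quadratic terminal cost
print(estimate, exact)                               # the two values should agree closely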