
 Reisinger, Christoph


Efficient Learning for Entropy-Regularized Markov Decision Processes via Multilevel Monte Carlo

arXiv.org Machine Learning

Designing efficient learning algorithms with complexity guarantees for Markov decision processes (MDPs) with large or continuous state and action spaces remains a fundamental challenge. We address this challenge for entropy-regularized MDPs with Polish state and action spaces, assuming access to a generative model of the environment. We propose a novel family of multilevel Monte Carlo (MLMC) algorithms that integrate fixed-point iteration with MLMC techniques and a generic stochastic approximation of the Bellman operator. We quantify the precise impact of the chosen approximate Bellman operator on the accuracy of the resulting MLMC estimator. Leveraging this error analysis, we show that using a biased plain MC estimate for the Bellman operator results in quasi-polynomial sample complexity, whereas an unbiased randomized multilevel approximation of the Bellman operator achieves polynomial sample complexity in expectation. Notably, these complexity bounds are independent of the dimensions or cardinalities of the state and action spaces, distinguishing our approach from existing algorithms whose complexities scale with the sizes of these spaces. We validate these theoretical performance guarantees through numerical experiments.
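The abstract above builds on the standard MLMC telescoping identity. As a rough illustration of that structure only (not the paper's specific algorithm), the sketch below shows a generic MLMC estimator in Python; the callback `sample_level_pair`, the sample budgets, and the seeding are hypothetical placeholders, and in the paper's setting the level-l quantity would be a fixed-point iterate obtained from a stochastic approximation of the entropy-regularized Bellman operator.

```python
import numpy as np

def mlmc_estimate(sample_level_pair, num_levels, samples_per_level, seed=0):
    """Generic MLMC telescoping estimator of E[P_L]:
        E[P_L] = E[P_0] + sum_{l=1}^{L} E[P_l - P_{l-1}].
    `sample_level_pair(l, rng)` returns one coupled draw (P_l, P_{l-1}),
    with P_{-1} taken to be 0 at the coarsest level l = 0.
    """
    rng = np.random.default_rng(seed)
    estimate = 0.0
    for level in range(num_levels + 1):
        n_l = samples_per_level[level]
        # Average of coupled level-l corrections P_l - P_{l-1}
        corrections = [np.subtract(*sample_level_pair(level, rng)) for _ in range(n_l)]
        estimate += np.mean(corrections, axis=0)
    return estimate
```

The per-level sample budgets are chosen to balance bias and variance across levels; the complexity bounds quoted above come from choosing this allocation, and the approximate Bellman operator, carefully.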


$K$-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic Control

arXiv.org Machine Learning

In reinforcement learning (RL), off-policy evaluation (OPE) deals with the problem of estimating the value of a target policy with observations generated from a different behavior policy. OPE methods are typically applied to sequential decision-making problems where observational data is available but direct experimentation with the environment is impossible or costly. More broadly, OPE methods are a widely researched subject in RL (see, e.g., [60, 23, 58] for recent overviews); however, relatively little attention has been paid to stochastic environments where the stochasticity depends on the chosen actions and where the state and action spaces are continuous. For example, common benchmark problems are either deterministic or have finite state and/or action spaces (see, e.g., [60, 23]). Notwithstanding this, stochastic control problems are precisely concerned with the setting where a decision process affects random transitions. Stochastic control is a field closely related to reinforcement learning, and its methods have been applied to a wide range of high-stakes decision-making problems in diverse fields such as operations research [24, 41], economics [31, 29], electrical engineering [44, 17], autonomous driving [62] and finance [15, 55]. In the stochastic control literature, optimal policies are often represented as deterministic feedback policies (i.e., as deterministic functions of the current state) and, in the episodic case, are non-stationary due to the impact of a finite time horizon. Stochastic control environments pose a challenging setting for OPE methods. For example, classical methods like importance sampling (IS) [50] struggle with deterministic target policies in continuous action spaces due to the severe policy mismatch between the target and the behavior policy (see, e.g.
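To make the resampling idea concrete, here is a minimal, hypothetical sketch of K-nearest-neighbor resampling for OPE with a deterministic target policy: at each step, the next transition is drawn from the K observed transitions whose (state, action) pair is closest to the current state paired with the target policy's action. All names and the Euclidean metric are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def knn_resample_episode(dataset, target_policy, horizon, k=5, rng=None):
    """One K-NN resampled episode for off-policy evaluation (illustrative sketch).

    dataset: list of observed transitions (state, action, reward, next_state),
             with states and actions as 1-D numpy arrays.
    target_policy: deterministic feedback policy, state -> action.
    """
    rng = rng or np.random.default_rng()
    sa = np.array([np.concatenate([s, a]) for s, a, _, _ in dataset])
    state = dataset[rng.integers(len(dataset))][0]    # start from an observed state
    total_reward = 0.0
    for _ in range(horizon):
        query = np.concatenate([state, target_policy(state)])
        dists = np.linalg.norm(sa - query, axis=1)
        idx = rng.choice(np.argsort(dists)[:k])        # pick one of the K nearest transitions
        _, _, reward, next_state = dataset[idx]
        total_reward += reward
        state = next_state
    return total_reward
```

Averaging the returns of many such resampled episodes gives the value estimate; in the episodic, non-stationary setting one would additionally match on the time index.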


Convergence of policy gradient methods for finite-horizon stochastic linear-quadratic control problems

arXiv.org Artificial Intelligence

We study the global linear convergence of policy gradient (PG) methods for finite-horizon continuous-time exploratory linear-quadratic control (LQC) problems. The setting includes stochastic LQC problems with indefinite costs and allows additional entropy regularisers in the objective. We consider a continuous-time Gaussian policy whose mean is linear in the state variable and whose covariance is state-independent. In contrast to discrete-time problems, the cost is noncoercive in the policy and not all descent directions lead to bounded iterates. We propose geometry-aware gradient descents for the mean and covariance of the policy using the Fisher geometry and the Bures-Wasserstein geometry, respectively. The policy iterates are shown to satisfy an a priori bound, and converge globally to the optimal policy at a linear rate. We further propose a novel PG method with discrete-time policies. The algorithm leverages the continuous-time analysis, and achieves a robust linear convergence across different action frequencies. A numerical experiment confirms the convergence and robustness of the proposed algorithm.
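As a loose illustration of what "geometry-aware" updates can look like for a Gaussian policy (a generic sketch under simplifying assumptions, not the paper's exact scheme): the mean is updated with a Fisher-preconditioned (natural-gradient) step, and the covariance with a Bures-Wasserstein-style retraction that preserves positive semi-definiteness by construction.

```python
import numpy as np

def geometry_aware_pg_step(mean, cov, grad_mean, grad_cov, lr):
    """Hypothetical geometry-aware update for a Gaussian policy N(mean, cov).

    - Mean: Fisher-preconditioned step; for a Gaussian with covariance `cov`,
      the Fisher information of the mean is cov^{-1}, so the natural gradient
      is cov @ grad_mean.
    - Covariance: Bures-Wasserstein-style retraction (I - lr*G) cov (I - lr*G)^T,
      a congruence transform that keeps the iterate symmetric positive
      semi-definite.
    `grad_mean` and `grad_cov` are ordinary (Euclidean) gradients of the cost.
    """
    new_mean = mean - lr * cov @ grad_mean
    step = np.eye(cov.shape[0]) - lr * grad_cov
    new_cov = step @ cov @ step.T
    return new_mean, new_cov
```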


Linear convergence of a policy gradient method for some finite horizon continuous time control problems

arXiv.org Artificial Intelligence

Despite the popularity of policy gradient methods in the reinforcement learning community, a provably convergent policy gradient method for continuous space-time control problems with nonlinear state dynamics has been elusive. This paper proposes proximal gradient algorithms for feedback controls of finite-time-horizon stochastic control problems. The state dynamics are nonlinear diffusions with control-affine drift, and the cost functions are nonconvex in the state and nonsmooth in the control. The system noise can degenerate, which allows for deterministic control problems as special cases. We prove under suitable conditions that the algorithm converges linearly to a stationary point of the control problem and is stable with respect to policy updates by approximate gradient steps. The convergence result justifies the recent reinforcement learning heuristics that adding entropy regularization or a fictitious discount factor to the optimization objective accelerates the convergence of policy gradient methods.
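The proximal gradient template referred to above splits the cost into a smooth part, handled by a gradient step, and a nonsmooth part, handled by its proximal operator. The sketch below is one generic update on a discretized feedback control; the L1 penalty (whose prox is soft-thresholding) is an assumed stand-in for the paper's nonsmooth control cost, not its exact scheme.

```python
import numpy as np

def proximal_gradient_step(control, grad_smooth, step_size, reg_weight):
    """One generic proximal gradient update for a feedback control stored on a
    time-state grid (array `control`).

    `grad_smooth` is the gradient of the smooth part of the cost (same shape as
    `control`); the nonsmooth part is illustrated by an L1 penalty, whose prox
    is soft-thresholding.
    """
    z = control - step_size * grad_smooth
    # Soft-thresholding: prox of step_size * reg_weight * ||.||_1
    return np.sign(z) * np.maximum(np.abs(z) - step_size * reg_weight, 0.0)
```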


Arbitrage-free neural-SDE market models

arXiv.org Machine Learning

Modelling the joint dynamics of liquid vanilla options is crucial for arbitrage-free pricing of illiquid derivatives and managing the risks of option trade books. This paper develops a nonparametric model for the European options book that respects underlying financial constraints while being practically implementable. We derive a state space for prices which are free from static (or model-independent) arbitrage and study the inference problem where a model is learnt from discrete time series data of stock and option prices. We use neural networks as function approximators for the drift and diffusion of the modelled SDE system, and impose constraints on the neural nets such that no-arbitrage conditions are preserved. In particular, we give methods to calibrate neural SDE models which are guaranteed to satisfy a set of linear inequalities. We validate our approach with numerical experiments using data generated from a Heston stochastic local volatility model.
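For readers unfamiliar with the setup, the following is a minimal, hypothetical Euler-Maruyama simulator for an SDE whose drift and diffusion are given by neural networks; it illustrates only the "neural SDE" ingredient and does not reproduce the paper's no-arbitrage constraints or calibration procedure. The network callables and state layout are assumptions.

```python
import numpy as np

def simulate_neural_sde(drift_net, diffusion_net, x0, dt, n_steps, rng=None):
    """Euler-Maruyama simulation of dX_t = mu(X_t) dt + sigma(X_t) dW_t,
    where `drift_net` maps the state to the drift vector and `diffusion_net`
    maps the state to the diffusion matrix (here the state would collect the
    stock price and the option-price factors).
    """
    rng = rng or np.random.default_rng()
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        dw = rng.normal(scale=np.sqrt(dt), size=x.shape)  # Brownian increment
        x = x + drift_net(x) * dt + diffusion_net(x) @ dw
        path.append(x.copy())
    return np.array(path)
```

In the paper's construction, the drift and diffusion networks would additionally be constrained so that simulated prices remain in the statically arbitrage-free state space described by the linear inequalities mentioned above.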


Understanding Deep Architectures with Reasoning Layer

arXiv.org Machine Learning

Recently, there has been a surge of interest in combining deep learning models with reasoning in order to handle more sophisticated learning tasks. In many cases, a reasoning task can be solved by an iterative algorithm. Such an algorithm is often unrolled and used as a specialized layer in the deep architecture, which can be trained end-to-end with other neural components. Although such hybrid deep architectures have led to many empirical successes, the theoretical foundation of these architectures, especially the interplay between algorithm layers and other neural layers, remains largely unexplored. In this paper, we take an initial step towards an understanding of such hybrid deep architectures by showing that properties of the algorithm layers, such as convergence, stability, and sensitivity, are intimately related to the approximation and generalization abilities of the end-to-end model. Furthermore, our analysis matches our experimental observations closely under various conditions, suggesting that our theory can provide useful guidelines for designing deep architectures with reasoning layers.
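To illustrate what an unrolled "reasoning layer" is, here is a small hypothetical PyTorch module that unrolls a fixed number of gradient-descent steps on an inner least-squares problem; because every step is an ordinary differentiable operation, the layer can be trained end-to-end with surrounding neural layers. The inner objective and all names are illustrative assumptions, not the paper's construction.

```python
import torch
import torch.nn as nn

class UnrolledGDLayer(nn.Module):
    """Reasoning layer that unrolls `n_steps` gradient-descent iterations on the
    inner problem min_z 0.5 * ||A z - x||^2, with a learnable matrix A."""

    def __init__(self, dim, n_steps=10, step_size=0.1):
        super().__init__()
        self.A = nn.Parameter(torch.randn(dim, dim) / dim ** 0.5)
        self.n_steps = n_steps
        self.step_size = step_size

    def forward(self, x):
        # x: (batch, dim); z is the iterate of the inner solver.
        z = torch.zeros_like(x)
        for _ in range(self.n_steps):
            residual = z @ self.A.T - x       # A z - x, row-wise
            grad = residual @ self.A          # A^T (A z - x): gradient of the inner objective
            z = z - self.step_size * grad
        return z
```

Properties of the unrolled solver, such as its convergence rate and stability, then directly shape the behavior of the end-to-end model, which is the kind of interplay the paper analyzes.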