AITopics | Baumann, Dominik

Safe exploration in reproducing kernel Hilbert spaces

Tokmak, Abdullah, Krishnan, Kiran G., Schön, Thomas B., Baumann, Dominik

arXiv.org Artificial Intelligence, Mar-13-2025

Popular safe Bayesian optimization (BO) algorithms learn control policies for safety-critical systems in unknown environments. However, most algorithms make a smoothness assumption, which is encoded by a known bounded norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it remains unclear how to reliably obtain the RKHS norm of an unknown function. In this work, we propose a safe BO algorithm capable of estimating the RKHS norm from data. We provide statistical guarantees on the RKHS norm estimation, integrate the estimated RKHS norm into existing confidence intervals and show that we retain theoretical guarantees, and prove safety of the resulting safe BO algorithm. We apply our algorithm to safely optimize reinforcement learning policies on physics simulators and on a real inverted pendulum, demonstrating improved performance, safety, and scalability compared to the state of the art.
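To make the notion of a bounded RKHS norm concrete, here is a minimal sketch. It is not the estimator proposed in the paper. It only evaluates the standard identity ||f||^2 = sum_i sum_j a_i a_j k(x_i, x_j) for a function of the form f = sum_i a_i k(x_i, .), and checks the bound |f(x)| <= ||f|| * sqrt(k(x, x)) that follows from the reproducing property. The Gaussian kernel, the function names, and the example coefficients are all assumptions chosen for illustration.

```python
import math


def rbf(a, b, lengthscale=1.0):
    """Gaussian (RBF) kernel on scalars; the kernel choice is an assumption."""
    return math.exp(-((a - b) ** 2) / (2.0 * lengthscale ** 2))


def rkhs_norm(centers, alphas, lengthscale=1.0):
    """RKHS norm of f = sum_i alphas[i] * k(centers[i], .)."""
    squared = sum(
        ai * aj * rbf(xi, xj, lengthscale)
        for xi, ai in zip(centers, alphas)
        for xj, aj in zip(centers, alphas)
    )
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors


def evaluate(x, centers, alphas, lengthscale=1.0):
    """Evaluate f at x."""
    return sum(a * rbf(c, x, lengthscale) for c, a in zip(centers, alphas))


# Hypothetical example function with three kernel centers.
centers = [0.0, 1.0, 2.5]
alphas = [1.0, -0.5, 2.0]
norm = rkhs_norm(centers, alphas)

# For this kernel k(x, x) = 1, so |f(x)| <= norm must hold at every x.
worst = max(abs(evaluate(x / 10.0, centers, alphas)) for x in range(-20, 50))
print(norm, worst)
```

A smaller norm forces a flatter function, which is why a norm bound can serve as a smoothness assumption when building confidence intervals.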

machine learning, reinforcement learning, RKHS norm, (18 more...)

arXiv: 2503.10352

Country: Europe (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Robots (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Transfer Learning in Latent Contextual Bandits with Covariate Shift Through Causal Transportability

Deng, Mingwei, Kyrki, Ville, Baumann, Dominik

arXiv.org Artificial Intelligence, Feb-27-2025

Transferring knowledge from one environment to another is an essential ability of intelligent systems. Nevertheless, when two environments differ, naively transferring all knowledge may degrade performance, a phenomenon known as negative transfer. In this paper, we address this issue within the framework of multi-armed bandits from the perspective of causal inference. Specifically, we consider transfer learning in latent contextual bandits, where the actual context is hidden, but a potentially high-dimensional proxy is observable. We further consider a covariate shift in the context across environments. We show that naively transferring all knowledge for classical bandit algorithms in this setting leads to negative transfer. We then leverage transportability theory from causal inference to develop algorithms that explicitly transfer effective knowledge for estimating the causal effects of interest in the target environment. In addition, we use variational autoencoders to approximate causal effects in the presence of a high-dimensional proxy. We test our algorithms on synthetic and semi-synthetic datasets, empirically demonstrating consistently improved learning efficiency across different proxies compared to baseline algorithms, showing the effectiveness of our causal framework in transferring knowledge.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv: 2502.20153

Country: Europe (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Simulation-Aided Policy Tuning for Black-Box Robot Learning

He, Shiming, von Rohr, Alexander, Baumann, Dominik, Xiang, Ji, Trimpe, Sebastian

arXiv.org Artificial Intelligence, Nov-21-2024

How can robots learn and adapt to new tasks and situations with little data? Systematic exploration and simulation are crucial tools for efficient robot learning. We present a novel black-box policy search algorithm focused on data-efficient policy improvements. The algorithm learns directly on the robot and treats simulation as an additional information source to speed up the learning process. At the core of the algorithm, a probabilistic model learns the dependence between the policy parameters and the robot learning objective not only by performing experiments on the robot, but also by leveraging data from a simulator. This substantially reduces interaction time with the robot. Using this model, we can guarantee improvements with high probability for each policy update, thereby facilitating fast, goal-oriented learning. We evaluate our algorithm on simulated fine-tuning tasks and demonstrate the data efficiency of the proposed dual-information source optimization algorithm. In a real robot learning experiment, we show fast and successful task learning on a robot manipulator with the aid of an imperfect simulator.

artificial intelligence, machine learning, robot, (15 more...)

arXiv: 2411.14246

Country: Europe (0.67)

Genre: Research Report (0.82)

Industry: Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Safe reinforcement learning in uncertain contexts

Baumann, Dominik, Schön, Thomas B.

arXiv.org Artificial Intelligence, Jan-11-2024

When deploying machine learning algorithms in the real world, guaranteeing safety is an essential asset. Existing safe learning approaches typically consider continuous variables, i.e., regression tasks. However, in practice, robotic systems are also subject to discrete, external environmental changes, e.g., having to carry objects of certain weights or operating on frozen, wet, or dry surfaces. Such influences can be modeled as discrete context variables. In the existing literature, such contexts are, if considered, mostly assumed to be known. In this work, we drop this assumption and show how we can perform safe learning when we cannot directly measure the context variables. To achieve this, we derive frequentist guarantees for multi-class classification, allowing us to estimate the current context from measurements. Further, we propose an approach for identifying contexts through experiments. We discuss under which conditions we can retain theoretical guarantees and demonstrate the applicability of our algorithm on a Furuta pendulum with camera measurements of different weights that serve as contexts.

classification, machine learning, reinforcement learning, (17 more...)

arXiv: 2401.05876

Country:

Europe > Sweden (0.47)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Personal (0.46)
Research Report (0.40)

Industry:

Automobiles & Trucks (0.67)
Government (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.82)

Non-ergodicity in reinforcement learning: robustness via ergodicity transformations

Baumann, Dominik, Noorani, Erfaun, Price, James, Peters, Ole, Connaughton, Colm, Schön, Thomas B.

arXiv.org Artificial Intelligence, Oct-17-2023

Envisioned application areas for reinforcement learning (RL) include autonomous driving, precision agriculture, and finance, which all require RL agents to make decisions in the real world. A significant challenge hindering the adoption of RL methods in these domains is the non-robustness of conventional algorithms. In this paper, we argue that a fundamental issue contributing to this lack of robustness lies in the focus on the expected value of the return as the sole "correct" optimization objective. The expected value is the average over the statistical ensemble of infinitely many trajectories. For non-ergodic returns, this average differs from the average over a single but infinitely long trajectory. Consequently, optimizing the expected value can lead to policies that yield exceptionally high returns with probability zero but almost surely result in catastrophic outcomes. This problem can be circumvented by transforming the time series of collected returns into one with ergodic increments. This transformation enables learning robust policies by optimizing the long-term return for individual agents rather than the average across infinitely many trajectories. We propose an algorithm for learning ergodicity transformations from data and demonstrate its effectiveness in an instructive, non-ergodic environment and on standard RL benchmarks.
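The gap between the ensemble average and the time average can be seen in a small numerical sketch. The multiplicative coin-toss game used here (wealth multiplied by 1.5 or 0.6 with equal probability) is an assumed textbook-style illustration, not necessarily the environment or algorithm from the paper. For such multiplicative dynamics, taking the logarithm gives additive i.i.d. increments, which is one example of a transformation with ergodic increments.

```python
import math
import random

UP, DOWN = 1.5, 0.6  # assumed multiplicative factors, each with probability 1/2

# Ensemble view: expected per-round growth factor, averaged over many agents.
ensemble_growth = 0.5 * UP + 0.5 * DOWN

# Time view: per-round growth factor experienced along one long trajectory,
# i.e. the geometric mean of the two factors.
time_growth = math.exp(0.5 * math.log(UP) + 0.5 * math.log(DOWN))


def log_growth_rate(steps, rng):
    """Average log-wealth increment along a single simulated trajectory."""
    log_wealth = 0.0
    for _ in range(steps):
        log_wealth += math.log(UP if rng.random() < 0.5 else DOWN)
    return log_wealth / steps


rate = log_growth_rate(100_000, random.Random(0))
print(ensemble_growth, time_growth, rate)
```

The expected value grows by a factor of 1.05 per round, while the geometric mean sqrt(0.9) is below 1, so a single agent's wealth shrinks in the long run even though the expectation increases.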

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv: 2310.11335

Country:

North America > United States (0.46)
Europe (0.28)

Genre: Research Report (0.50)

Industry:

Food & Agriculture > Agriculture (0.54)
Leisure & Entertainment > Games (0.46)
Transportation > Ground > Road (0.34)
Information Technology > Robotics & Automation (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A computationally lightweight safe learning algorithm

Baumann, Dominik, Kowalczyk, Krzysztof, Tiels, Koen, Wachel, Paweł

arXiv.org Artificial Intelligence, Sep-7-2023

Safety is an essential asset when learning control policies for physical systems, as violating safety constraints during training can lead to expensive hardware damage. In response to this need, the field of safe learning has emerged with algorithms that can provide probabilistic safety guarantees without knowledge of the underlying system dynamics. Those algorithms often rely on Gaussian process inference. Unfortunately, Gaussian process inference scales cubically with the number of data points, limiting applicability to high-dimensional and embedded systems. In this paper, we propose a safe learning algorithm that provides probabilistic safety guarantees but leverages the Nadaraya-Watson estimator instead of Gaussian processes. For the Nadaraya-Watson estimator, we can reach logarithmic scaling with the number of data points. We provide theoretical guarantees for the estimates, embed them into a safe learning algorithm, and show numerical experiments on a simulated seven-degree-of-freedom robot manipulator.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv: 2309.03672

Country: Europe (0.94)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

GoSafeOpt: Scalable Safe Exploration for Global Optimization of Dynamical Systems

Sukhija, Bhavya, Turchetta, Matteo, Lindner, David, Krause, Andreas, Trimpe, Sebastian, Baumann, Dominik

arXiv.org Artificial Intelligence, Jun-12-2023

Learning optimal control policies directly on physical systems is challenging since even a single failure can lead to costly hardware damage. Most existing model-free learning methods that guarantee safety, i.e., no failures, during exploration are limited to local optima. A notable exception is the GoSafe algorithm, which, unfortunately, cannot handle high-dimensional systems and hence cannot be applied to most real-world dynamical systems. This work proposes GoSafeOpt as the first algorithm that can safely discover globally optimal policies for high-dimensional systems while giving safety and optimality guarantees. We demonstrate the superiority of GoSafeOpt over competing model-free safe learning methods on a robot arm that would be prohibitive for GoSafe.

artificial intelligence, gosafeopt, machine learning, (17 more...)

doi: 10.1016/j.artint.2023.103922

arXiv: 2201.09562

Country:

North America > United States (0.46)
Europe > Switzerland (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Government > Regional Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

On the trade-off between event-based and periodic state estimation under bandwidth constraints

Baumann, Dominik, Schön, Thomas B.

arXiv.org Artificial Intelligence, Apr-2-2023

Event-based methods carefully select when to transmit information to enable high-performance control and estimation over resource-constrained communication networks. However, they come at a cost. For instance, event-based communication induces a higher computational load and increases the complexity of the scheduling problem. Thus, in some cases, allocating available slots to agents periodically in circular order may be superior. In this article, we discuss, for a specific example, when the additional complexity of event-based methods is beneficial. We evaluate our analysis in a synthetic example and on 20 simulated cart-pole systems.

artificial intelligence, communication, communication slot, (16 more...)

arXiv: 2304.00559

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications > Networks (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)

Learning by Doing: Controlling a Dynamical System using Causality, Control, and Reinforcement Learning

Weichwald, Sebastian, Mogensen, Søren Wengel, Lee, Tabitha Edith, Baumann, Dominik, Kroemer, Oliver, Guyon, Isabelle, Trimpe, Sebastian, Peters, Jonas, Pfister, Niklas

arXiv.org Machine Learning, Feb-12-2022

Questions in causality, control, and reinforcement learning go beyond the classical machine learning task of prediction under i.i.d. observations. Instead, these fields consider the problem of learning how to actively perturb a system to achieve a certain effect on a response variable. Arguably, they have complementary views on the problem: In control, one usually aims to first identify the system by excitation strategies to then apply model-based design techniques to control the system. In (non-model-based) reinforcement learning, one directly optimizes a reward. In causality, one focus is on identifiability of causal structure. We believe that combining the different views might create synergies and this competition is meant as a first step toward such synergies. The participants had access to observational and (offline) interventional data generated by dynamical systems. Track CHEM considers an open-loop problem in which a single impulse at the beginning of the dynamics can be set, while Track ROBO considers a closed-loop problem in which control variables can be set at each time step. The goal in both tracks is to infer controls that drive the system to a desired state. Code is open-sourced (https://github.com/LearningByDoingCompetition/learningbydoing-comp) to reproduce the winning solutions of the competition and to facilitate trying out new methods on the competition tasks.

artificial intelligence, machine learning, reinforcement learning, (3 more...)

arXiv: 2202.06052

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.80)

A Kernel Two-sample Test for Dynamical Systems

Solowjow, Friedrich, Baumann, Dominik, Fiedler, Christian, Jocham, Andreas, Seel, Thomas, Trimpe, Sebastian

arXiv.org Machine Learning, Feb-25-2021

Evaluating whether data streams were generated by the same distribution is at the heart of many machine learning problems, e.g., to detect changes. This is particularly relevant for data generated by dynamical systems since they are essential for many real-world processes in biomedical, economic, or engineering systems. While kernel two-sample tests are powerful for comparing independent and identically distributed random variables, no established method exists for comparing dynamical systems. The key problem is the critical independence assumption, which is inherently violated in dynamical systems. We propose a novel two-sample test for dynamical systems by addressing three core challenges: we (i) introduce a novel notion of mixing that captures autocorrelations in a relevant metric, (ii) propose an efficient way to estimate the speed of mixing purely from data, and (iii) integrate these into established kernel two-sample tests. The result is a data-driven method for comparison of dynamical systems that is easy to use in practice and comes with sound theoretical guarantees. In an example application to anomaly detection from human walking data, we show that the test readily applies without the need for feature engineering, heuristics, or human expert knowledge.
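As background for the kernel two-sample tests mentioned above, here is a minimal sketch of the biased estimate of the squared maximum mean discrepancy (MMD) between two scalar samples. It assumes i.i.d. data and therefore does not include the mixing-based correction for dynamical systems that the paper proposes. The Gaussian kernel, the lengthscale, and the sample values are assumptions.

```python
import math


def rbf(a, b, lengthscale=1.0):
    """Gaussian (RBF) kernel on scalars; the kernel choice is an assumption."""
    return math.exp(-((a - b) ** 2) / (2.0 * lengthscale ** 2))


def mmd2_biased(xs, ys, lengthscale=1.0):
    """Biased estimate of squared MMD: mean k(x,x') + mean k(y,y') - 2 mean k(x,y)."""

    def mean_k(us, vs):
        total = sum(rbf(u, v, lengthscale) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)


# Hypothetical samples: two nearby clusters and one distant cluster.
sample_a = [0.0, 0.1, 0.2]
sample_b = [0.05, 0.15, 0.25]
sample_c = [5.0, 5.1, 5.2]

close = mmd2_biased(sample_a, sample_b)
far = mmd2_biased(sample_a, sample_c)
print(close, far)
```

A full test would compare the statistic against a threshold, for example from a permutation procedure. With dependent samples from a dynamical system that threshold is no longer valid, which is the gap the abstract addresses.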

dynamical systems, scientific computing, survey article, (19 more...)

arXiv: 2004.11098

Country: Europe > Germany > Baden-Württemberg (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Scientific Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining > Anomaly Detection (0.68)
