Treven, Lenart
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning
As, Yarden, Sukhija, Bhavya, Treven, Lenart, Sferrazza, Carmelo, Coros, Stelian, Krause, Andreas
Reinforcement learning (RL) is ubiquitous in the development of modern AI systems. However, state-of-the-art RL agents require extensive, and potentially unsafe, interactions with their environments to learn effectively. These limitations confine RL agents to simulated environments, hindering their ability to learn directly in real-world settings. Despite notable progress, applying RL without any use of simulators remains largely limited, primarily because RL methods require massive amounts of data for learning while also being inherently unsafe during exploration. In many real-world settings, environments are complex and rarely align exactly with the assumptions made in simulators. Learning directly in the real world allows RL systems to close the sim-to-real gap and continuously adapt to evolving environments and distribution shifts. However, to unlock these advantages, RL algorithms must be sample-efficient and ensure safety throughout the learning process to avoid costly failures or risks in high-stakes applications. For instance, agents learning driving policies in autonomous vehicles must prevent collisions with other cars or pedestrians, even when adapting to new driving environments. This challenge is known as safe exploration, where the agent's exploration is restricted by safety-critical, often unknown, constraints that must be satisfied throughout the learning process. Several works study safe exploration and have demonstrated state-of-the-art performance in terms of both safety and sample efficiency for learning in the real world (Sui et al., 2015; Wischnewski et al., 2019; Berkenkamp et al., 2021; Cooper & Netoff, 2022; Sukhija et al., 2023; Widmer et al., 2023). These methods maintain a "safe set" of policies during learning, selecting policies from this set to safely explore and gradually expand it. Under common regularity assumptions about the constraints, these approaches guarantee safety throughout learning. However, explicitly maintaining and expanding a safe set limits these methods to low-dimensional policies, such as PID controllers, making them difficult to scale to the more complex tasks considered in deep RL. To this end, we propose a scalable model-based RL algorithm -- ActSafe -- for efficient and safe exploration. Crucially, ActSafe learns an uncertainty-aware dynamics model, which it uses to implicitly define and expand the safe set of policies.
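A minimal sketch of the implicit safe-set idea, shrunk to a toy setting: ActSafe learns a probabilistic dynamics model, whereas the snippet below skips the dynamics entirely and models an unknown constraint q(theta) >= 0 over a one-dimensional policy parameter with a small Gaussian process. A policy counts as safe if its pessimistic estimate satisfies the constraint, and exploration picks the most uncertain policy inside the current safe set. All functions, constants, and the constraint itself are illustrative assumptions, not the paper's algorithm.

```python
# Toy safe-exploration loop (illustrative only): GP surrogate over a 1-D policy
# parameter, pessimistic safe set, and exploration by maximal uncertainty inside it.
import numpy as np

def rbf(a, b, ell=0.3):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def gp_posterior(X, y, Xs, noise=1e-3):
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks, Kss = rbf(X, Xs), rbf(Xs, Xs)
    sol = np.linalg.solve(K, Ks)
    mean = sol.T @ y
    std = np.sqrt(np.clip(np.diag(Kss - Ks.T @ sol), 1e-12, None))
    return mean, std

q = lambda th: np.sin(3 * th) + 0.5         # unknown constraint; safe iff q(th) >= 0
thetas = np.linspace(-1.0, 1.0, 200)        # candidate policy parameters
X, y = np.array([0.0]), np.array([q(0.0)])  # known-safe seed policy
beta = 2.0                                  # confidence-interval width
for _ in range(15):
    mean, std = gp_posterior(X, y, thetas)
    safe = mean - beta * std >= 0.0         # pessimistic (provably safe) set
    if not safe.any():
        break
    # explore: evaluate the most uncertain policy that is still in the safe set
    idx = np.flatnonzero(safe)[np.argmax(std[safe])]
    X, y = np.append(X, thetas[idx]), np.append(y, q(thetas[idx]))
print(f"safe set covers {safe.mean():.0%} of the candidate policies")
```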
Active Few-Shot Fine-Tuning
Hübotter, Jonas, Sukhija, Bhavya, Treven, Lenart, As, Yarden, Krause, Andreas
We study the question: How can we select the right data for fine-tuning to a specific task? We call this data selection problem active fine-tuning and show that it is an instance of transductive active learning, a novel generalization of classical active learning. We propose ITL, short for information-based transductive learning, an approach which samples adaptively to maximize information gained about the specified task. We are the first to show, under general regularity assumptions, that such decision rules converge uniformly to the smallest possible uncertainty obtainable from the accessible data. We apply ITL to the few-shot fine-tuning of large neural networks and show that fine-tuning with ITL learns the task with significantly fewer examples than the state-of-the-art.
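As a rough illustration of transductive data selection, the sketch below greedily picks, from a pool of candidate fine-tuning examples, the one whose acquisition most reduces a Gaussian-process surrogate's posterior uncertainty at a handful of target (task) points. This is a variance-based member of the decision-rule family studied here, not the exact ITL acquisition; the embeddings, kernel, and pool sizes are made-up placeholders.

```python
# Greedy transductive selection of fine-tuning examples (toy, variance-based).
import numpy as np

def rbf(A, B, ell=2.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell ** 2)

def target_variance(X_sel, X_tgt, noise=1e-2):
    """Total GP posterior variance at the target points after observing X_sel."""
    K = rbf(X_sel, X_sel) + noise * np.eye(len(X_sel))
    Kst = rbf(X_sel, X_tgt)
    post = rbf(X_tgt, X_tgt) - Kst.T @ np.linalg.solve(K, Kst)
    return float(np.trace(post))

rng = np.random.default_rng(0)
pool = rng.normal(size=(300, 2))     # candidate examples (toy 2-D "embeddings")
targets = rng.normal(size=(5, 2))    # points describing the downstream task
selected, avail = [], list(range(len(pool)))
for _ in range(10):                  # build a 10-example fine-tuning set
    scores = [target_variance(np.array(selected + [pool[i]]), targets) for i in avail]
    best = avail.pop(int(np.argmin(scores)))
    selected.append(pool[best])
print("remaining target uncertainty:", target_variance(np.array(selected), targets))
```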
When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
Treven, Lenart, Sukhija, Bhavya, As, Yarden, Dörfler, Florian, Krause, Andreas
Reinforcement learning (RL) excels in optimizing policies for discrete-time Markov decision processes (MDPs). However, various systems are inherently continuous in time, making discrete-time MDPs an inexact modeling choice. In many applications, such as greenhouse control or medical treatments, each interaction (measurement or switching of action) involves manual intervention and is thus inherently costly, so we generally prefer a time-adaptive approach with fewer interactions with the system. In this work, we formalize an RL framework, Time-adaptive Control & Sensing (TaCoS), that tackles this challenge by optimizing over policies that, besides the control input, also predict the duration of its application. Our formulation results in an extended MDP that any standard RL algorithm can solve. We demonstrate that state-of-the-art RL algorithms trained with TaCoS require drastically fewer interactions than their discrete-time counterparts while retaining the same or improved performance and exhibiting robustness to the discretization frequency. Finally, we propose OTaCoS, an efficient model-based algorithm for our setting. We show that OTaCoS enjoys sublinear regret for systems with sufficiently smooth dynamics and empirically yields further gains in sample efficiency.
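A hedged sketch of the extended-MDP construction: the action is a pair (control u, holding duration tau), each agent step integrates the system for tau seconds with u held constant, and a fixed per-interaction cost makes fewer, longer interactions preferable. The toy double-integrator dynamics, costs, and bounds are illustrative choices, not the paper's benchmarks.

```python
# Toy time-adaptive environment: the action is (control, holding duration).
import numpy as np

class TimeAdaptiveIntegrator:
    def __init__(self, dt=0.01, interaction_cost=0.1):
        self.x = np.array([1.0, 0.0])     # toy 2-D state (position, velocity)
        self.dt, self.interaction_cost = dt, interaction_cost

    def dynamics(self, x, u):
        # toy double integrator: x' = (v, u)
        return np.array([x[1], u])

    def step(self, action):
        u, tau = float(action[0]), float(np.clip(action[1], self.dt, 1.0))
        cost = self.interaction_cost          # paid once per interaction
        t = 0.0
        while t < tau:                        # hold u constant for tau seconds
            self.x = self.x + self.dt * self.dynamics(self.x, u)
            cost += self.dt * (self.x @ self.x + 0.01 * u * u)  # running cost
            t += self.dt
        return self.x.copy(), -cost           # reward = negative accumulated cost

env = TimeAdaptiveIntegrator()
state, reward = env.step(np.array([-0.5, 0.3]))   # apply u = -0.5 for 0.3 s
print(state, reward)
```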
NeoRL: Efficient Exploration for Nonepisodic RL
Sukhija, Bhavya, Treven, Lenart, Dörfler, Florian, Coros, Stelian, Krause, Andreas
We study the problem of nonepisodic reinforcement learning (RL) for nonlinear dynamical systems, where the system dynamics are unknown and the RL agent has to learn from a single trajectory, i.e., without resets. We propose Nonepisodic Optimistic RL (NeoRL), an approach based on the principle of optimism in the face of uncertainty. NeoRL uses well-calibrated probabilistic models and plans optimistically w.r.t. the epistemic uncertainty about the unknown dynamics. Under continuity and bounded energy assumptions on the system, we provide a first-of-its-kind regret bound of $\mathcal{O}(\beta_T \sqrt{T \Gamma_T})$ for general nonlinear systems with Gaussian process dynamics. We compare NeoRL to other baselines on several deep RL environments and empirically demonstrate that NeoRL achieves the optimal average cost while incurring the least regret.
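A hedged reading of the bound, using the standard notation of the Gaussian-process optimism literature (the precise definitions of these quantities are in the paper; the following is only their usual interpretation):

```latex
% T        -- number of interaction steps along the single trajectory,
% \beta_T  -- width of the calibrated confidence intervals of the dynamics model,
% \Gamma_T -- maximum information gain of the GP prior after T steps.
R_T \;\le\; \mathcal{O}\!\left(\beta_T \sqrt{T\,\Gamma_T}\right),
\qquad\text{sublinear whenever } \beta_T \sqrt{\Gamma_T} = o\!\left(\sqrt{T}\right),
\text{ e.g., for RBF kernels, where } \Gamma_T = \mathcal{O}\!\left((\log T)^{d+1}\right).
```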
Transductive Active Learning: Theory and Applications
Hübotter, Jonas, Sukhija, Bhavya, Treven, Lenart, As, Yarden, Krause, Andreas
We generalize active learning to address real-world settings with concrete prediction targets where sampling is restricted to an accessible region of the domain, while prediction targets may lie outside this region. We analyze a family of decision rules that sample adaptively to minimize uncertainty about prediction targets. We are the first to show, under general regularity assumptions, that such decision rules converge uniformly to the smallest possible uncertainty obtainable from the accessible data. We demonstrate their strong sample efficiency in two key applications: Active few-shot fine-tuning of large neural networks and safe Bayesian optimization, where they improve significantly upon the state-of-the-art.
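The one-dimensional toy below makes the transductive setting concrete under assumed kernel and noise choices: sampling is restricted to the interval [-1, 0], while the prediction target sits at x = 0.5. A greedy variance-minimizing rule drives the target uncertainty down, but only to the floor determined by what the accessible region can reveal, echoing the convergence statement above.

```python
# 1-D GP illustration: sample only in [-1, 0], predict at x = 0.5.
import numpy as np

def rbf(a, b, ell=0.5):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def target_std(X_obs, x_tgt, noise=1e-2):
    X_obs = np.atleast_1d(np.asarray(X_obs, dtype=float))
    x_tgt = np.atleast_1d(np.asarray(x_tgt, dtype=float))
    K = rbf(X_obs, X_obs) + noise * np.eye(len(X_obs))
    k = rbf(X_obs, x_tgt)
    var = rbf(x_tgt, x_tgt) - k.T @ np.linalg.solve(K, k)
    return float(np.sqrt(max(var[0, 0], 0.0)))

accessible = np.linspace(-1.0, 0.0, 101)  # region where sampling is allowed
target = 0.5                              # prediction target outside that region
chosen = []
for t in range(5):
    # greedily pick the accessible point that minimizes the target uncertainty
    scores = [target_std(chosen + [x], target) for x in accessible]
    chosen.append(float(accessible[int(np.argmin(scores))]))
    print(f"round {t}: target std = {min(scores):.3f}")
# the std plateaus above zero: the limit of what the accessible data can reveal
```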
Bridging the Sim-to-Real Gap with Bayesian Inference
Rothfuss, Jonas, Sukhija, Bhavya, Treven, Lenart, Dörfler, Florian, Coros, Stelian, Krause, Andreas
We present SIM-FSVGD for learning robot dynamics from data. As opposed to traditional methods, SIM-FSVGD leverages low-fidelity physical priors, e.g., in the form of simulators, to regularize the training of neural network models. SIM-FSVGD learns accurate dynamics already in the low-data regime and continues to scale and excel as more data becomes available. We empirically show that learning with implicit physical priors results in accurate mean model estimation as well as precise uncertainty quantification. We demonstrate the effectiveness of SIM-FSVGD in bridging the sim-to-real gap on a high-performance RC racecar system. Using model-based RL, we demonstrate a highly dynamic parking maneuver with drifting, using less than half the data compared to the state of the art.
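To give a flavor of the simulator-as-prior effect without reproducing SIM-FSVGD itself (which places a functional prior on neural-network dynamics models), the toy below uses a Gaussian process whose prior mean is a low-fidelity simulator: with few observations predictions follow the simulator, and as real data accumulates the posterior corrects the sim-to-real mismatch. The simulator, true system, and kernel are invented for illustration.

```python
# GP with a simulator prior mean: sim-first with little data, data-first with more.
import numpy as np

sim = lambda x: np.sin(x)                  # low-fidelity simulator
real = lambda x: np.sin(x) + 0.3 * x       # true system with a sim-to-real gap

def rbf(a, b, ell=1.0):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def posterior_mean(X, y, Xs, noise=1e-2):
    # GP regression on the residual (real - sim), then add the simulator back
    K = rbf(X, X) + noise * np.eye(len(X))
    resid = np.linalg.solve(K, y - sim(X))
    return sim(Xs) + rbf(Xs, X) @ resid

Xs = np.linspace(0, 4, 9)
for n in (2, 20):
    X = np.linspace(0, 4, n)
    pred = posterior_mean(X, real(X), Xs)
    print(f"n={n:2d}  mean abs error vs real: {np.mean(np.abs(pred - real(Xs))):.3f}")
```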
Efficient Exploration in Continuous-time Model-based Reinforcement Learning
Treven, Lenart, Hübotter, Jonas, Sukhija, Bhavya, Dörfler, Florian, Krause, Andreas
Reinforcement learning algorithms typically consider discrete-time dynamics, even though the underlying systems are often continuous in time. In this paper, we introduce a model-based reinforcement learning algorithm that represents continuous-time dynamics using nonlinear ordinary differential equations (ODEs). We capture epistemic uncertainty using well-calibrated probabilistic models and use the optimistic principle for exploration. Our regret bounds surface the importance of the measurement selection strategy (MSS), since in continuous time we must not only decide how to explore, but also when to observe the underlying system. Our analysis demonstrates that the regret is sublinear when modeling ODEs with Gaussian processes (GPs) for common choices of MSS, such as equidistant sampling. Additionally, we propose an adaptive, data-dependent, practical MSS that, when combined with GP dynamics, also achieves sublinear regret with significantly fewer samples. We showcase the benefits of continuous-time modeling over its discrete-time counterpart, as well as of our proposed adaptive MSS over standard baselines, on several applications.
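A deliberately crude caricature of an adaptive, data-dependent MSS: while rolling out the learned ODE, a measurement of the real system is triggered only when a proxy for the model's predictive uncertainty exceeds a threshold, instead of observing at fixed equidistant times. The uncertainty proxy, dynamics, and threshold below are assumptions for illustration; the paper uses calibrated GP epistemic uncertainty.

```python
# Adaptive measurement selection: observe only when model uncertainty is high.
import numpy as np

def predictive_std(x, time_since_obs):
    # toy proxy: uncertainty grows with the time since the last observation
    # and with the magnitude of the current state
    return 0.05 * time_since_obs * (1.0 + np.linalg.norm(x))

dt, threshold = 0.01, 0.04
x, t_since = np.array([1.0, 0.0]), 0.0
measure_times = []
for k in range(500):                      # 5 seconds of simulated rollout
    x = x + dt * np.array([x[1], -x[0]])  # learned ODE mean (harmonic oscillator)
    t_since += dt
    if predictive_std(x, t_since) > threshold:
        measure_times.append(k * dt)      # query the real system, reset uncertainty
        t_since = 0.0
print(f"{len(measure_times)} measurements vs 500 for equidistant sampling at dt")
```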
Optimistic Active Exploration of Dynamical Systems
Sukhija, Bhavya, Treven, Lenart, Sancaktar, Cansu, Blaes, Sebastian, Coros, Stelian, Krause, Andreas
Reinforcement learning algorithms commonly seek to optimize policies for solving one particular task. How should we explore an unknown dynamical system such that the estimated model globally approximates the dynamics and allows us to solve multiple downstream tasks in a zero-shot manner? In this paper, we address this challenge by developing an algorithm -- OPAX -- for active exploration. OPAX uses well-calibrated probabilistic models to quantify the epistemic uncertainty about the unknown dynamics. It optimistically -- w.r.t. plausible dynamics -- maximizes the information gain between the unknown dynamics and state observations. We show how the resulting optimization problem can be reduced to an optimal control problem that can be solved at each episode using standard approaches. We analyze our algorithm for general models, and, in the case of Gaussian process dynamics, we give a first-of-its-kind sample complexity bound and show that the epistemic uncertainty converges to zero. In our experiments, we compare OPAX with other heuristic active exploration approaches on several environments. Our experiments show that OPAX is not only theoretically sound but also performs well for zero-shot planning on novel downstream tasks.
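A sketch of the exploration objective only (not the planner): with a Gaussian dynamics model, the information gained from observing a transition is, up to constants, the sum of log epistemic standard deviations of the predicted next state, and OPAX plans action sequences that maximize this quantity over the episode, optimistically over plausible dynamics. The toy linear ensemble below stands in for a calibrated probabilistic model and is an assumption of this sketch.

```python
# Information-gain exploration objective over a trajectory (toy ensemble model).
import numpy as np

rng = np.random.default_rng(1)

def ensemble_predict(models, state, action):
    """Mean and epistemic std of the next state under a toy linear ensemble."""
    inp = np.concatenate([state, action])
    preds = np.stack([W @ inp for W in models])   # one prediction per member
    return preds.mean(0), preds.std(0) + 1e-6

def info_gain_objective(models, states, actions):
    """Sum over the trajectory of sum_j log sigma_j(x_t, u_t)."""
    total = 0.0
    for s, a in zip(states, actions):
        _, std = ensemble_predict(models, s, a)
        total += np.sum(np.log(std))
    return total

state_dim, action_dim, horizon = 3, 1, 10
models = [rng.normal(scale=0.5, size=(state_dim, state_dim + action_dim))
          for _ in range(5)]
states = rng.normal(size=(horizon, state_dim))
actions = rng.normal(size=(horizon, action_dim))
print("exploration objective:", info_gain_objective(models, states, actions))
# a planner would maximize this objective over action sequences at each episode
```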
Iterative Correction of Sensor Degradation and a Bayesian Multi-Sensor Data Fusion Method
Kolar, Luka, Šikonja, Rok, Treven, Lenart
We present a novel method for inferring the ground-truth signal from multiple degraded signals, affected by different amounts of sensor "exposure". The algorithm learns a multiplicative degradation effect by performing iterative corrections of two signals solely from the ratio between them. The degradation function d should be continuous, monotone, and satisfy d(0) = 1. We use a smoothed monotonic regression method, into which we easily incorporate the aforementioned criteria during fitting. We include a theoretical analysis and prove convergence to the ground-truth signal for the noiseless measurement model. Lastly, we present an approach to fuse the noisy corrected signals using Gaussian processes. We use sparse Gaussian processes, which can be utilized for a large number of measurements, together with a specialized kernel that enables the estimation of the noise values of all sensors. The data fusion framework naturally handles data gaps and provides a simple and powerful method for observing signal trends on multiple timescales (long-term and short-term signal properties). The viability of the correction method is evaluated on a synthetic dataset with a known ground-truth signal.
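A small numeric check of the observation that makes ratio-based correction possible, under an assumed exponential degradation and made-up exposure schedules: with the multiplicative model m_i(t) = s(t) * d(e_i(t)), the ratio of two sensors' measurements contains no trace of the ground-truth signal s(t), so the degradation d can be learned from the ratio alone (the paper does this with smoothed monotonic regression; the snippet only verifies the cancellation).

```python
# Verify that the ratio of two degraded measurements is free of the signal.
import numpy as np

t = np.linspace(0, 10, 200)
s = 2.0 + np.sin(t)                       # unknown ground-truth signal
d = lambda e: np.exp(-0.1 * e)            # monotone degradation with d(0) = 1
e1, e2 = 0.5 * t, 1.5 * t                 # sensors accumulate exposure at different rates

m1, m2 = s * d(e1), s * d(e2)             # degraded (noiseless) measurements
ratio = m1 / m2
assert np.allclose(ratio, d(e1) / d(e2))  # s(t) cancels: ratio depends only on d
print("max deviation of ratio from d(e1)/d(e2):",
      np.max(np.abs(ratio - d(e1) / d(e2))))
```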