Leoben
- North America > United States (0.14)
- North America > Canada (0.04)
- Europe > France > Hauts-de-France > Pas-de-Calais (0.04)
- Europe > Austria > Styria > Leoben (0.04)
Transfer in Reinforcement Learning via Regret Bounds for Learning Agents
Tuynman, Adrienne, Ortner, Ronald
We present an approach for quantifying the usefulness of transfer in reinforcement learning via regret bounds for a multi-agent setting. For $\aleph$ agents operating in the same Markov decision process, though possibly with different reward functions, we consider the regret each agent suffers with respect to an optimal policy maximizing her average reward. We show that when the agents share their observations, the total regret of all agents is smaller by a factor of $\sqrt{\aleph}$ than in the case where each agent has to rely on the information collected by herself. This result demonstrates how considering the regret in multi-agent settings can provide theoretical bounds on the benefit of sharing observations in transfer learning.
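As a back-of-the-envelope illustration of where a $\sqrt{\aleph}$ factor can come from (a heuristic sketch, not the paper's proof): suppose confidence widths, and hence per-step regret, shrink as $C/\sqrt{n}$ in the number $n$ of available observations. Then

```latex
% Independent agents: at step t each agent has only its own t observations.
\sum_{t=1}^{T} \frac{C}{\sqrt{t}} \approx 2C\sqrt{T}
\quad\Rightarrow\quad
\text{total regret} \approx 2C\,\aleph\sqrt{T}.
% Shared observations: at step t each agent can use \aleph t samples.
\sum_{t=1}^{T} \frac{C}{\sqrt{\aleph t}} \approx 2C\sqrt{T/\aleph}
\quad\Rightarrow\quad
\text{total regret} \approx 2C\sqrt{\aleph T}
  = \frac{2C\,\aleph\sqrt{T}}{\sqrt{\aleph}}.
```

The pooled total is smaller by exactly the $\sqrt{\aleph}$ factor stated in the abstract; the paper establishes this rigorously via regret bounds rather than this heuristic counting.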
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > France > Hauts-de-France > Nord > Lille (0.04)
- Europe > Austria > Styria > Leoben (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)
Beyond Master and Apprentice: Grounding Foundation Models for Symbiotic Interactive Learning in a Shared Latent Space
Nwankwo, Linus, Ellensohn, Björn, Rauch, Christian, Rueckert, Elmar
Today's autonomous agents can understand free-form natural language instructions and execute long-horizon tasks in a manner akin to human-level reasoning. These capabilities are mostly driven by large-scale pre-trained foundation models (FMs). However, the approaches with which these models are grounded for human-robot interaction (HRI) perpetuate a master-apprentice model, where the apprentice (embodied agent) passively receives and executes the master's (human's) commands without reciprocal learning. This reactive interaction approach does not capture the co-adaptive dynamics inherent in everyday multi-turn human-human interactions. To address this, we propose a Symbiotic Interactive Learning (SIL) approach that enables both the master and the apprentice to co-adapt through mutual, bidirectional interactions. We formalised SIL as a co-adaptation process within a shared latent task space, where the agent and human maintain joint belief states that evolve based on interaction history. This enables the agent to move beyond reactive execution to proactive clarification, adaptive suggestions, and shared plan refinement. To realise these novel behaviours, we leveraged pre-trained FMs for spatial perception and reasoning, alongside a lightweight latent encoder that grounds the models' outputs into task-specific representations. Furthermore, to ensure stability as the tasks evolve, we augment SIL with a memory architecture that prevents the forgetting of learned task-space representations. We validate SIL on both simulated and real-world embodied tasks, including instruction following, information retrieval, query-oriented reasoning, and interactive dialogues. Demos and resources are publicly available at:~\href{https://linusnep.github.io/SIL/}{https://linusnep.github.io/SIL/}.
Improved Best-of-Both-Worlds Regret for Bandits with Delayed Feedback
Schlisselberg, Ofir, Lancewicki, Tal, Auer, Peter, Mansour, Yishay
We study the multi-armed bandit problem with adversarially chosen delays in the Best-of-Both-Worlds (BoBW) framework, which aims to achieve near-optimal performance in both stochastic and adversarial environments. While prior work has made progress toward this goal, existing algorithms suffer from significant gaps to the known lower bounds, especially in the stochastic setting. Our main contribution is a new algorithm that, up to logarithmic factors, matches the known lower bounds in each setting individually. In the adversarial case, our algorithm achieves regret of $\widetilde{O}(\sqrt{KT} + \sqrt{D})$, which is optimal up to logarithmic terms, where $T$ is the number of rounds, $K$ is the number of arms, and $D$ is the cumulative delay. In the stochastic case, we provide a regret bound which scales as $\sum_{i:\Delta_i>0}\left(\log T/\Delta_i\right) + \frac{1}{K}\sum_i \Delta_i \sigma_{\max}$, where $\Delta_i$ is the sub-optimality gap of arm $i$ and $\sigma_{\max}$ is the maximum number of missing observations. To the best of our knowledge, this is the first BoBW algorithm to simultaneously match the lower bounds in both the stochastic and adversarial regimes in delayed environments. Moreover, even beyond the BoBW setting, our stochastic regret bound is the first to match the known lower bound under adversarial delays, improving the second term over the best known result by a factor of $K$.
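To see concretely how delayed feedback enters a bandit loop, the toy simulation below runs plain UCB1 on a Bernoulli bandit where the reward of the pull at round $t$ only becomes observable at round $t + d_t$. This is an illustrative sketch of the delayed-feedback setting, not the authors' algorithm, and the arm means and delays are made up.

```python
import math
import random

def delayed_ucb(means, delays, horizon, seed=0):
    """UCB1 on a stochastic Bernoulli bandit with per-round feedback
    delays: the reward of the pull at round t arrives at round
    t + delays[t]. A toy sketch of the setting, not the paper's method."""
    rng = random.Random(seed)
    n_arms = len(means)
    pulls = [0] * n_arms        # pulls issued (feedback may be pending)
    counts = [0] * n_arms       # observations received per arm
    sums = [0.0] * n_arms       # observed reward sums per arm
    pending = []                # (arrival_round, arm, reward)
    regret = 0.0
    best = max(means)
    for t in range(horizon):
        # absorb feedback whose delay has elapsed
        due = [p for p in pending if p[0] <= t]
        pending = [p for p in pending if p[0] > t]
        for _, arm, reward in due:
            counts[arm] += 1
            sums[arm] += reward
        # round-robin while some arm has no feedback yet, then UCB index
        if min(counts) == 0:
            arm = pulls.index(min(pulls))
        else:
            n = sum(counts)
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(n) / counts[a]))
        pulls[arm] += 1
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pending.append((t + delays[t], arm, reward))
        regret += best - means[arm]
    return regret
```

Longer delays keep the statistics stale, which is exactly the effect the cumulative-delay term $\sqrt{D}$ and the missing-observation term $\sigma_{\max}$ in the bounds above account for.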
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
- Europe > Austria > Styria > Leoben (0.04)
- Asia > China (0.04)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Data Science > Data Mining > Big Data (0.88)
- North America > United States (0.14)
- North America > Canada (0.04)
- Europe > France > Hauts-de-France > Pas-de-Calais (0.04)
- Europe > Austria > Styria > Leoben (0.04)
Good Deep Features to Track: Self-Supervised Feature Extraction and Tracking in Visual Odometry
Gottam, Sai Puneeth Reddy, Zhang, Haoming, Keras, Eivydas
Vision-based localization has made significant progress, yet its performance often drops in large-scale, outdoor, and long-term settings due to factors such as lighting changes, dynamic scenes, and low-texture areas. These challenges degrade feature extraction and tracking, which are critical for accurate motion estimation. While learning-based methods such as SuperPoint and SuperGlue show improved feature coverage and robustness, they still face generalization issues with out-of-distribution data. We address this by enhancing deep feature extraction and tracking through self-supervised learning with task-specific feedback. Our method promotes stable and informative features, improving generalization and reliability in challenging environments.
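One common self-supervised signal for judging whether a feature is "good to track" is forward-backward consistency: track a point into the next frame and back, and keep it only if it returns near its start. The sketch below illustrates that generic criterion, not the authors' specific feedback mechanism; the tracker functions and threshold are made-up stand-ins.

```python
def forward_backward_filter(points, track_fw, track_bw, thresh=1.0):
    """Keep only features whose forward-then-backward track returns
    close to the starting location. Forward-backward consistency is a
    standard self-supervised reliability signal for feature tracking;
    this is an illustrative sketch, not the paper's method."""
    kept = []
    for p in points:
        q = track_fw(p)                       # position in the next frame
        back = track_bw(q)                    # tracked back to frame one
        err = ((p[0] - back[0]) ** 2 + (p[1] - back[1]) ** 2) ** 0.5
        if err <= thresh:
            kept.append(p)
    return kept
```

A feature in a well-textured region round-trips with near-zero error, while one in a drifting, low-texture region fails the check and is discarded before it can corrupt motion estimation.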
- Asia > South Korea > Gyeongsangbuk-do > Pohang (0.07)
- Europe > Austria > Styria > Leoben (0.05)
- Europe > Italy > Lombardy > Milan (0.04)
- (2 more...)
Real-Time 3D Vision-Language Embedding Mapping
Rauch, Christian, Ellensohn, Björn, Nwankwo, Linus, Dave, Vedant, Rueckert, Elmar
A. Vision-Language Models in Robotics
In contrast to classic closed-set methods trained on specific labels, novel Vision-Language Models (VLMs) enable the open-set association of images with their text descriptions [6], [12], or other modalities [13], via a common embedding space, using individual transformers for image, text, or other modalities. VLMs have been used in robotics for open-set tracking of objects in the current camera FoV [14], for interactive pose estimation of relevant parts of tools [15], and for navigation via hand-drawn instructions [16]. By focusing on a single task and the current FoV, these approaches cannot generalise to other tasks or operate on a global level, such as localising tools outside the current FoV. In contrast, we integrate the open-set VLM embeddings in a task-agnostic 3D representation in order to enable a variety of interactive robotic use-cases on the same vision-language representation.

B. Implicit Neural Representations
Due to the availability of vast amounts of 2D images and text, Vision Transformers (ViT) are predominantly trained on 2D image data [17].
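The open-set association described above reduces, at query time, to scoring one image embedding against arbitrary text embeddings in the shared space, typically by cosine similarity. The sketch below shows only that scoring step with toy three-dimensional vectors; real VLM embeddings are produced by the per-modality transformers cited in the text.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def best_label(image_emb, text_embs):
    """Open-set association: score an image embedding against an
    arbitrary, user-supplied set of text-prompt embeddings in the
    shared space and return the best-matching label."""
    return max(text_embs, key=lambda label: cosine(image_emb, text_embs[label]))
```

Because the label set is just a dictionary of prompts, new categories can be queried at any time without retraining, which is what distinguishes this from closed-set classification.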
Multi-robot LiDAR SLAM: a practical case study in underground tunnel environments
Di Lauro, Federica, Sorrenti, Domenico G., Sotelo, Miguel Angel
Multi-robot SLAM aims at localizing multiple interacting robots and jointly building a map. In the work described in this article, we analyze the pipeline of a decentralized LiDAR SLAM system to study the current limitations of the state of the art, and we identify a significant source of failures: loop detection produces too many false positives. We therefore develop and propose a new heuristic to overcome this limitation. The environment taken as reference in this work is the highly challenging case of underground tunnels. We also highlight potential new research areas that remain under-explored.
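A typical way to reject false-positive loop closures is a geometric-consistency gate: accept a candidate only if the relative transform it proposes roughly agrees with the odometry chain between the two poses. The 2-D sketch below shows that generic gate only; it is not the heuristic proposed in the article, and the drift tolerance is an assumed parameter.

```python
def consistent_loop(pose_i, pose_j, loop_rel, max_err=2.0):
    """Accept a loop-closure candidate between poses i and j only if
    the relative translation it proposes agrees, up to a drift
    tolerance, with the odometry estimate. A generic verification
    gate against false positives, not the article's heuristic.
    Poses and the candidate transform are (x, y) translations here."""
    dx = pose_j[0] - pose_i[0]
    dy = pose_j[1] - pose_i[1]
    err = ((loop_rel[0] - dx) ** 2 + (loop_rel[1] - dy) ** 2) ** 0.5
    return err <= max_err
```

In self-similar environments such as tunnels, where appearance-based loop detectors fire between visually identical but distant sections, a gate of this kind discards candidates whose implied motion contradicts the odometry.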
- North America > United States (0.14)
- Oceania > Australia > Queensland > Brisbane (0.04)
- Europe > Spain > Galicia > Madrid (0.04)
- Europe > Austria > Styria > Leoben (0.04)
ViTaPEs: Visuotactile Position Encodings for Cross-Modal Alignment in Multimodal Transformers
Lygerakis, Fotios, Özdenizci, Ozan, Rückert, Elmar
Tactile sensing provides essential local information that is complementary to visual perception, such as texture, compliance, and force. Despite recent advances in visuotactile representation learning, challenges remain in fusing these modalities and generalizing across tasks and environments without heavy reliance on pre-trained vision-language models. Moreover, existing methods do not study positional encodings, thereby overlooking the multi-scale spatial reasoning needed to capture fine-grained visuotactile correlations. We introduce ViTaPEs, a transformer-based framework that robustly integrates visual and tactile input data to learn task-agnostic representations for visuotactile perception. Our approach exploits a novel multi-scale positional encoding scheme to capture intra-modal structures, while simultaneously modeling cross-modal cues. Unlike prior work, we provide provable guarantees for visuotactile fusion, showing that our encodings are injective, rigid-motion-equivariant, and information-preserving, and we validate these properties empirically. Experiments on multiple large-scale real-world datasets show that ViTaPEs not only surpasses state-of-the-art baselines across various recognition tasks but also demonstrates zero-shot generalization to unseen, out-of-domain scenarios. We further demonstrate the transfer-learning strength of ViTaPEs in a robotic grasping task, where it outperforms state-of-the-art baselines in predicting grasp success. Project page: https://sites.google.com/view/vitapes
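To make "multi-scale positional encoding" concrete, the sketch below concatenates sinusoidal features of a 2-D position at several wavelengths, so both fine and coarse spatial structure appear in the representation. This illustrates the general idea only; the specific scales are made up, and ViTaPEs' actual encodings (with their injectivity and equivariance guarantees) are defined in the paper.

```python
import math

def multiscale_pos_enc(x, y, scales=(1.0, 4.0, 16.0)):
    """Concatenate sinusoidal encodings of a 2-D position at several
    scales: short wavelengths resolve fine detail, long wavelengths
    encode coarse layout. An illustrative sketch, not ViTaPEs itself."""
    feats = []
    for s in scales:
        for v in (x, y):
            feats.append(math.sin(v / s))
            feats.append(math.cos(v / s))
    return feats
```

Each scale contributes a sine/cosine pair per coordinate, so three scales yield a 12-dimensional feature, and nearby positions receive nearby features at every scale.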
- Europe > Austria > Styria > Leoben (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Europe > Austria > Styria > Graz (0.04)
Adversarially Robust Spiking Neural Networks with Sparse Connectivity
Schmolli, Mathias, Baronig, Maximilian, Legenstein, Robert, Özdenizci, Ozan
Deployment of deep neural networks in resource-constrained embedded systems requires innovative algorithmic solutions to facilitate their energy and memory efficiency. To further ensure the reliability of these systems against malicious actors, recent works have extensively studied adversarial robustness of existing architectures. Our work focuses on the intersection of adversarial robustness, memory- and energy-efficiency in neural networks. We introduce a neural network conversion algorithm designed to produce sparse and adversarially robust spiking neural networks (SNNs) by leveraging the sparse connectivity and weights from a robustly pretrained artificial neural network (ANN). Our approach combines the energy-efficient architecture of SNNs with a novel conversion algorithm, leading to state-of-the-art performance with enhanced energy and memory efficiency through sparse connectivity and activations. Our models are shown to achieve up to 100x reduction in the number of weights to be stored in memory, with an estimated 8.6x increase in energy efficiency compared to dense SNNs, while maintaining high performance and robustness against adversarial threats.
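Two ingredients of such a conversion can be sketched in isolation: magnitude-based pruning (keeping only the largest weights, as a stand-in for transferring sparse connectivity from the pretrained ANN) and the rate-coding correspondence between an integrate-and-fire neuron and a ReLU unit. Both pieces below are generic illustrations under those assumptions, not the paper's conversion algorithm.

```python
def prune(weights, keep_fraction=0.1):
    """Keep only the largest-magnitude fraction of weights, zeroing the
    rest (a generic sparsity sketch; the paper transfers sparsity from
    a robustly pretrained ANN rather than pruning this way)."""
    flat = sorted((abs(w) for w in weights), reverse=True)
    k = max(1, int(len(flat) * keep_fraction))
    thresh = flat[k - 1]
    return [w if abs(w) >= thresh else 0.0 for w in weights]

def spike_rate(weights, inputs, v_th=1.0, steps=100):
    """Integrate-and-fire neuron driven by a constant weighted input
    current; over many steps its firing rate approximates the ReLU
    activation of the corresponding ANN unit (rate coding)."""
    current = sum(w * x for w, x in zip(weights, inputs))
    v, spikes = 0.0, 0
    for _ in range(steps):
        v += current
        if v >= v_th:
            spikes += 1
            v -= v_th
    return spikes / steps
```

The firing rate tracks the positive part of the input current and is zero for inhibitory drive, which is why a well-calibrated conversion can preserve the ANN's function while events (spikes) stay sparse.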