Peters, Jan
Beyond the Cascade: Juggling Vanilla Siteswap Patterns
Andreu, Mario Gomez, Ploeger, Kai, Peters, Jan
Being widespread in human motor behavior, dynamic movements demonstrate higher efficiency and a greater capacity to address a broader range of skill domains than their quasi-static counterparts. Among the frequently studied dynamic manipulation problems, robotic juggling tasks stand out because their difficulty can be scaled arbitrarily, making them an excellent subject for investigation. In this study, we explore juggling patterns with mixed throw heights, following the vanilla siteswap notation that is widely adopted by jugglers to describe toss juggling patterns. This requires extending our previous analysis of the simpler cascade juggling task with a throw-height sequence planner and additional constraints on the end-effector trajectory, which are not necessary for cascade patterns but are vital for achieving patterns with mixed throw heights. Using a simulated environment, we demonstrate successful juggling of the most common 3-9 ball siteswap patterns up to 9 ball height, transitions between these patterns, and random sequences covering all possible vanilla siteswap patterns with throws between 2 and 9 ball height. https://kai-ploeger.com/beyond-cascades
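For readers unfamiliar with the notation, the sketch below illustrates the two standard properties of vanilla siteswap that any throw-height sequence planner relies on: the permutation test for validity and the average theorem for the number of balls. This is a minimal Python illustration of the notation itself, not code from the paper.

```python
# Minimal sketch of standard vanilla siteswap theory (not from the paper):
# a throw sequence is valid iff all landing beats are distinct, and the
# number of balls equals the average throw height.

def is_valid_siteswap(throws):
    """Valid vanilla siteswap iff (i + t_i) mod n are all distinct."""
    n = len(throws)
    landings = {(i + t) % n for i, t in enumerate(throws)}
    return len(landings) == n

def number_of_balls(throws):
    """Average theorem: the number of balls is the mean throw height."""
    assert sum(throws) % len(throws) == 0, "not a valid siteswap"
    return sum(throws) // len(throws)

print(is_valid_siteswap([5, 3, 1]), number_of_balls([5, 3, 1]))  # True, 3 balls
print(is_valid_siteswap([5, 4, 3]))                              # False: all throws land on the same beat
```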
Learning Force Distribution Estimation for the GelSight Mini Optical Tactile Sensor Based on Finite Element Analysis
Helmut, Erik, Dziarski, Luca, Funk, Niklas, Belousov, Boris, Peters, Jan
Contact-rich manipulation remains a major challenge in robotics. Optical tactile sensors like GelSight Mini offer a low-cost solution for contact sensing by capturing soft-body deformations of the silicone gel. However, accurately inferring shear and normal force distributions from these gel deformations has yet to be fully addressed. In this work, we propose a machine learning approach using a U-net architecture to predict force distributions directly from the sensor's raw images. Our model, trained on force distributions inferred from Finite Element Analysis (FEA), demonstrates promising accuracy in predicting normal and shear force distributions. It also shows potential for generalization across sensors of the same type and for enabling real-time application. The codebase, dataset and models are open-sourced and available at https://feats-ai.github.io .
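As a rough illustration of the kind of model involved, the following is a minimal U-Net-style sketch in PyTorch that maps a raw tactile image to a per-pixel 3-channel force map (shear x, shear y, normal), which would be regressed against FEA-derived labels. The channel widths, resolution, and loss are illustrative assumptions and do not reproduce the released FEATS architecture.

```python
# Minimal sketch, not the released FEATS model: a small U-Net-style encoder-decoder
# mapping a raw GelSight Mini image to a 3-channel force-distribution map.
import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(),
    )

class TinyForceUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(3, 16), block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = block(32, 16)          # 16 upsampled + 16 skip channels
        self.head = nn.Conv2d(16, 3, 1)    # per-pixel (shear_x, shear_y, normal)

    def forward(self, x):
        e1 = self.enc1(x)                  # full-resolution features
        e2 = self.enc2(self.pool(e1))      # half-resolution features
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))  # skip connection
        return self.head(d1)               # force map, same H x W as the input

img = torch.rand(1, 3, 240, 320)           # dummy tactile image (assumed resolution)
force_map = TinyForceUNet()(img)           # -> (1, 3, 240, 320)
```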
Unsupervised Skill Discovery for Robotic Manipulation through Automatic Task Generation
Jansonnie, Paul, Wu, Bingbing, Perez, Julien, Peters, Jan
Learning skills that interact with objects is of major importance for robotic manipulation. These skills can indeed serve as an efficient prior for solving various manipulation tasks. We propose a novel Skill Learning approach that discovers composable behaviors by solving a large and diverse number of autonomously generated tasks. Our method learns skills allowing the robot to consistently and robustly interact with objects in its environment. The discovered behaviors are embedded in primitives which can be composed with Hierarchical Reinforcement Learning to solve unseen manipulation tasks. In particular, we leverage Asymmetric Self-Play to discover behaviors and Multiplicative Compositional Policies to embed them. We compare our method to Skill Learning baselines and find that our skills are more interactive. Furthermore, the learned skills can be used to solve a set of unseen manipulation tasks, in simulation as well as on a real robotic platform.
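The composition step can be pictured with the standard multiplicative-composition rule for Gaussian primitives used by Multiplicative Compositional Policies: a gate assigns weights to the primitives and the composite policy is their weighted product. The minimal NumPy sketch below shows only this composition rule; the gate, the primitives, and the self-play task generation in the paper are learned components.

```python
# Minimal sketch of the multiplicative composition of diagonal Gaussian primitives
# (illustrative only; in the paper both primitives and gating weights are learned).
import numpy as np

def compose_primitives(mus, sigmas, weights):
    """mus, sigmas: (K, action_dim); weights: (K,) non-negative gating weights."""
    w = weights[:, None]
    precision = np.sum(w / sigmas**2, axis=0)            # composite precision
    mu = np.sum(w * mus / sigmas**2, axis=0) / precision  # precision-weighted mean
    return mu, 1.0 / np.sqrt(precision)                   # composite mean and std

# Two toy primitives on a 2-D action space, gated 70/30:
mus = np.array([[0.5, -0.2], [-0.1, 0.4]])
sigmas = np.array([[0.2, 0.2], [0.3, 0.3]])
mu, sigma = compose_primitives(mus, sigmas, np.array([0.7, 0.3]))
```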
Uncertainty Representations in State-Space Layers for Deep Reinforcement Learning under Partial Observability
Luis, Carlos E., Bottero, Alessandro G., Vinogradska, Julia, Berkenkamp, Felix, Peters, Jan
Optimal decision-making under partial observability requires reasoning about the uncertainty of the environment's hidden state. However, most reinforcement learning architectures handle partial observability with sequence models that have no internal mechanism to incorporate uncertainty in their hidden state representation, such as recurrent neural networks, deterministic state-space models and transformers. Inspired by advances in probabilistic world models for reinforcement learning, we propose a standalone Kalman filter layer that performs closed-form Gaussian inference in linear state-space models and train it end-to-end within a model-free architecture to maximize returns. Similar to efficient linear recurrent layers, the Kalman filter layer processes sequential data using a parallel scan, which scales logarithmically with the sequence length. By design, Kalman filter layers are a drop-in replacement for other recurrent layers in standard model-free architectures, but importantly they include an explicit mechanism for probabilistic filtering of the latent state representation. Experiments in a wide variety of tasks with partial observability show that Kalman filter layers excel in problems where uncertainty reasoning is key for decision-making, outperforming other stateful models.
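For intuition, the following is a minimal NumPy sketch of the closed-form Gaussian recursion such a layer performs at each step, written sequentially; the paper's layer runs the same predict/update recursion as a parallel scan inside a model-free architecture. The matrices A, C, Q, R and all shapes are illustrative assumptions.

```python
# Minimal sketch of one Kalman filter predict/update step for a linear-Gaussian
# state-space model x_t = A x_{t-1} + w,  y_t = C x_t + v (illustrative only).
import numpy as np

def kf_step(mu, P, y, A, C, Q, R):
    # Predict: propagate the Gaussian belief through the linear dynamics.
    mu_pred = A @ mu
    P_pred = A @ P @ A.T + Q
    # Update: condition on the observation y_t in closed form.
    S = C @ P_pred @ C.T + R                      # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)           # Kalman gain
    mu_new = mu_pred + K @ (y - C @ mu_pred)
    P_new = (np.eye(len(mu)) - K @ C) @ P_pred
    return mu_new, P_new                          # filtered belief (mean, covariance)
```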
Exciting Action: Investigating Efficient Exploration for Learning Musculoskeletal Humanoid Locomotion
Geiß, Henri-Jacques, Al-Hafez, Firas, Seyfarth, Andre, Peters, Jan, Tateo, Davide
Learning a locomotion controller for a musculoskeletal system is challenging due to over-actuation and a high-dimensional action space. While many reinforcement learning methods attempt to address this issue, they often struggle to learn human-like gaits because of the complexity involved in engineering an effective reward function. In this paper, we demonstrate that adversarial imitation learning can address this issue by analyzing key problems and providing solutions using both current literature and novel techniques.
ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning
Mower, Christopher E., Wan, Yuhui, Yu, Hongzhan, Grosnit, Antoine, Gonzalez-Billandon, Jonas, Zimmer, Matthieu, Wang, Jinlong, Zhang, Xinyu, Zhao, Yao, Zhai, Anbang, Liu, Puze, Palenicek, Daniel, Tateo, Davide, Cadena, Cesar, Hutter, Marco, Peters, Jan, Tian, Guangjian, Zhuang, Yuzheng, Shao, Kun, Quan, Xingyue, Hao, Jianye, Wang, Jun, Bou-Ammar, Haitham
We present a framework for intuitive robot programming by non-experts, leveraging natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface. Key features of the framework include: integration of ROS with an AI agent connected to a plethora of open-source and commercial LLMs, automatic extraction of a behavior from the LLM output and execution of ROS actions/services, support for three behavior modes (sequence, behavior tree, state machine), imitation learning for adding new robot actions to the library of possible actions, and LLM reflection via human and environment feedback. Extensive experiments validate the framework, showcasing robustness, scalability, and versatility in diverse scenarios, including long-horizon tasks, tabletop rearrangements, and remote supervisory control. To facilitate the adoption of our framework and support the reproduction of our results, we have made our code open-source. You can access it at: https://github.com/huawei-noah/HEBO/tree/master/ROSLLM.
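To make the behavior-extraction step concrete, here is a hypothetical sketch of how a "sequence" behavior emitted by an LLM could be parsed and dispatched to a library of robot actions. The JSON schema, action names, and dispatch code are illustrative assumptions and are not the API of the ROSLLM repository.

```python
# Hypothetical sketch (not the ROSLLM repository's actual API): extracting a
# "sequence" behavior from an LLM reply and dispatching it to an action library.
import json

ACTION_LIBRARY = {
    "move_to": lambda pose: print(f"moving to {pose}"),   # stand-ins for real
    "grasp":   lambda obj:  print(f"grasping {obj}"),     # ROS action/service calls
}

def execute_sequence(llm_reply: str):
    """Expects JSON like
    {"behavior": "sequence", "steps": [{"action": "move_to", "args": ["table"]}, ...]}."""
    plan = json.loads(llm_reply)
    assert plan["behavior"] == "sequence"
    for step in plan["steps"]:
        ACTION_LIBRARY[step["action"]](*step["args"])      # one robot action per step

execute_sequence('{"behavior": "sequence", "steps": ['
                 '{"action": "move_to", "args": ["table"]},'
                 '{"action": "grasp", "args": ["cup"]}]}')
```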
MoVEInt: Mixture of Variational Experts for Learning Human-Robot Interactions from Demonstrations
Prasad, Vignesh, Kshirsagar, Alap, Koert, Dorothea, Stock-Homburg, Ruth, Peters, Jan, Chalvatzaki, Georgia
Shared dynamics models are important for capturing the complexity and variability inherent in Human-Robot Interaction (HRI). Therefore, learning such shared dynamics models can enhance coordination and adaptability to enable successful reactive interactions with a human partner. In this work, we propose a novel approach for learning a shared latent space representation for HRIs from demonstrations in a Mixture of Experts fashion for reactively generating robot actions from human observations. We train a Variational Autoencoder (VAE) to learn robot motions regularized using an informative latent space prior that captures the multimodality of the human observations via a Mixture Density Network (MDN). We show how our formulation derives from a Gaussian Mixture Regression formulation that is typically used in approaches for learning HRI from demonstrations, such as using an HMM/GMM to learn a joint distribution over the actions of the human and the robot. We further incorporate an additional regularization to prevent "mode collapse", a common phenomenon when using latent space mixture models with VAEs. We find that our approach of using an informative MDN prior from human observations for a VAE generates more accurate robot motions compared to previous HMM-based or recurrent approaches for learning shared latent representations, which we validate on various HRI datasets involving interactions such as handshakes, fistbumps, waving, and handovers. Further experiments in a real-world human-to-robot handover scenario show the efficacy of our approach for generating successful interactions with four different human interaction partners.
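As a rough illustration of the training objective this implies, the sketch below writes a VAE loss whose KL term pulls the robot-motion posterior toward an MDN prior conditioned on the human observation, estimated by Monte Carlo. Network interfaces and shapes are assumptions, and the mode-collapse regularization from the paper is omitted.

```python
# Minimal sketch (illustrative, not the MoVEInt code): a VAE over robot motions
# whose latent prior is a Mixture Density Network conditioned on the human observation.
import torch
import torch.distributions as D

def vae_mdn_loss(x_robot, x_human, encoder, decoder, mdn, n_samples=8):
    mu, log_std = encoder(x_robot)                     # q(z | robot motion)
    q = D.Normal(mu, log_std.exp())
    z = q.rsample((n_samples,))                        # reparameterized latent samples
    recon = decoder(z).mean(0)                         # reconstructed robot motion
    recon_loss = ((recon - x_robot) ** 2).mean()

    w, mu_p, std_p = mdn(x_human)                      # MDN prior p(z | human obs)
    prior = D.MixtureSameFamily(
        D.Categorical(probs=w),
        D.Independent(D.Normal(mu_p, std_p), 1),
    )
    kl = (q.log_prob(z).sum(-1) - prior.log_prob(z)).mean()   # Monte Carlo KL estimate
    return recon_loss + kl
```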
Energy-based Contact Planning under Uncertainty for Robot Air Hockey
Jankowski, Julius, Marić, Ante, Liu, Puze, Tateo, Davide, Peters, Jan, Calinon, Sylvain
Planning robot contact often requires reasoning over a horizon to anticipate outcomes, making such planning problems computationally expensive. In this letter, we propose a learning framework for efficient contact planning in real-time subject to uncertain contact dynamics. We implement our approach for the example task of robot air hockey. Based on a learned stochastic model of puck dynamics, we formulate contact planning for shooting actions as a stochastic optimal control problem with a chance constraint on hitting the goal. To achieve online re-planning capabilities, we propose to train an energy-based model to generate optimal shooting plans in real time. The performance of the trained policy is validated both in simulation and on a real-robot setup. Furthermore, our approach was tested in a competitive setting as part of the NeurIPS 2023 Robot Air Hockey Challenge.
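As a minimal illustration of how a learned energy-based model can support online re-planning, the sketch below scores candidate shooting plans with an energy network and keeps the lowest-energy one; the plan parameterization, sampler, and chance-constraint handling in the paper are more involved, so treat this as an assumption-laden toy.

```python
# Minimal sketch of querying a learned energy-based model at planning time
# (illustrative only; names, shapes and the sampling scheme are assumptions).
import torch

def plan_shot(energy_net, puck_state, n_candidates=256, plan_dim=4):
    plans = torch.rand(n_candidates, plan_dim)                # candidate hitting plans
    state = puck_state.expand(n_candidates, -1)               # broadcast current puck state
    energies = energy_net(torch.cat([state, plans], dim=-1))  # E(state, plan)
    return plans[energies.squeeze(-1).argmin()]               # keep the lowest-energy plan
```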
Iterated $Q$-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning
Vincent, Théo, Palenicek, Daniel, Belousov, Boris, Peters, Jan, D'Eramo, Carlo
The vast majority of Reinforcement Learning methods are largely impacted by the computational effort and data requirements needed to obtain effective estimates of action-value functions, which in turn determine the quality of the overall performance and the sample-efficiency of the learning procedure. Typically, action-value functions are estimated through an iterative scheme that alternates the application of an empirical approximation of the Bellman operator and a subsequent projection step onto a considered function space. It has been observed that this scheme can be potentially generalized to carry out multiple iterations of the Bellman operator at once, benefiting the underlying learning algorithm. However, until now, it has been challenging to effectively implement this idea, especially in high-dimensional problems. In this paper, we introduce iterated $Q$-Network (iQN), a novel principled approach that enables multiple consecutive Bellman updates by learning a tailored sequence of action-value functions where each serves as the target for the next. We show that iQN is theoretically grounded and that it can be seamlessly used in value-based and actor-critic methods. We empirically demonstrate the advantages of iQN in Atari $2600$ games and MuJoCo continuous control problems.
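The core idea can be sketched as a chain of Q-networks in which each one regresses onto the Bellman backup of its predecessor, so a single gradient step advances several consecutive Bellman updates. The PyTorch sketch below is illustrative only; the target-network handling and update schedule of the actual iQN algorithm differ.

```python
# Minimal sketch of an iterated-target chain of Q-networks (not the authors' code):
# Q_k regresses onto the Bellman backup of a frozen copy of Q_{k-1}.
import torch
import torch.nn.functional as F

def chained_q_loss(q_nets, target_nets, batch, gamma=0.99):
    s, a, r, s_next, done = batch
    loss = 0.0
    for k in range(len(q_nets)):
        # Frozen copy of the predecessor in the chain (Q_0 uses its own frozen copy).
        prev = target_nets[k - 1] if k > 0 else target_nets[0]
        with torch.no_grad():
            target = r + gamma * (1 - done) * prev(s_next).max(dim=-1).values
        q_sa = q_nets[k](s).gather(-1, a.unsqueeze(-1)).squeeze(-1)
        loss = loss + F.mse_loss(q_sa, target)
    return loss
```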
Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning
Vincent, Théo, Wahren, Fabian, Peters, Jan, Belousov, Boris, D'Eramo, Carlo
Deep Reinforcement Learning (RL) is well known for being highly sensitive to hyperparameters, requiring substantial effort from practitioners to optimize them for the problem at hand. In recent years, the field of automated Reinforcement Learning (AutoRL) has grown in popularity by trying to address this issue. However, these approaches typically hinge on additional samples to select well-performing hyperparameters, hindering sample-efficiency and practicality in RL. Furthermore, most AutoRL methods are heavily based on already existing AutoML methods, which were originally developed without considering the additional challenges that RL poses due to its non-stationarities. In this work, we propose a new approach for AutoRL, called Adaptive $Q$-Network (AdaQN), that is tailored to RL to take into account the non-stationarity of the optimization procedure without requiring additional samples. AdaQN learns several $Q$-functions, each one trained with different hyperparameters, which are updated online using the $Q$-function with the smallest approximation error as a shared target. Our selection scheme simultaneously handles different hyperparameters while coping with the non-stationarity induced by the RL optimization procedure and being orthogonal to any critic-based RL algorithm. We demonstrate that AdaQN is theoretically sound and empirically validate it in MuJoCo control problems, showing benefits in sample-efficiency, overall performance, training stability, and robustness to stochasticity.
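The selection rule described above can be sketched as follows: among Q-functions trained with different hyperparameters, pick as the new shared target the one with the smallest empirical Bellman error on a batch. The error estimate, schedule, and interfaces in this PyTorch sketch are assumptions, not the authors' implementation.

```python
# Minimal sketch of an online target-selection rule (illustrative only): the
# Q-function closest to the current Bellman backup becomes the shared target.
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_shared_target(q_nets, target_net, batch, gamma=0.99):
    s, a, r, s_next, done = batch
    backup = r + gamma * (1 - done) * target_net(s_next).max(dim=-1).values
    errors = []
    for q in q_nets:
        q_sa = q(s).gather(-1, a.unsqueeze(-1)).squeeze(-1)
        errors.append(F.mse_loss(q_sa, backup))               # empirical approximation error
    best = int(torch.stack(errors).argmin())
    return q_nets[best]                                       # new shared target network
```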