MuTT: A Multimodal Trajectory Transformer for Robot Skills
Claudius Kienle, Benjamin Alt, Onur Celik, Philipp Becker, Darko Katic, Rainer Jäkel, Gerhard Neumann
arXiv.org Artificial Intelligence
High-level robot skills represent an increasingly popular paradigm in robot programming. However, configuring the skills' parameters for a specific task remains a manual and time-consuming endeavor. Existing approaches for learning or optimizing these parameters often require numerous real-world executions or do not work in dynamic environments. To address these challenges, we propose MuTT, a novel encoder-decoder transformer architecture designed to predict environment-aware executions of robot skills by integrating vision, trajectory, and robot skill parameters. Notably, we pioneer the fusion of vision and trajectory, introducing a novel trajectory projection. Furthermore, we illustrate MuTT's efficacy as a predictor when combined with a model-based robot skill optimizer. This approach facilitates the optimization of robot skill parameters for the current environment, without the need for real-world executions during optimization. Designed for compatibility with any representation of robot skills, MuTT demonstrates its versatility across three comprehensive experiments, showcasing superior performance across two different skill representations.
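The idea of optimizing skill parameters against a predictor, rather than against real-world executions, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_predictor` is a hand-written stand-in for a learned model (it is not MuTT), the skill has just two parameters, the environment is reduced to a single 2D goal position, and the optimizer is plain random-search hill climbing rather than the optimizer used in the paper.

```python
import random


def toy_predictor(params, goal_position):
    """Hypothetical stand-in for a learned predictor.

    Maps skill parameters (a 2D target point) and an environment
    observation to a predicted final end-effector position. A real
    predictor would be a trained network that also uses goal_position;
    this toy version ignores it.
    """
    x, y = params
    # Assumed toy dynamics: the skill overshoots by 10% along x.
    return (1.1 * x, y)


def predicted_cost(params, goal_position):
    """Squared distance between the predicted end position and the goal."""
    px, py = toy_predictor(params, goal_position)
    gx, gy = goal_position
    return (px - gx) ** 2 + (py - gy) ** 2


def optimize_params(initial, goal_position, steps=2000, sigma=0.05, seed=0):
    """Random-search hill climbing on the predicted cost.

    Only the predictor is queried inside the loop; no real-world
    execution takes place during optimization.
    """
    rng = random.Random(seed)
    best = tuple(initial)
    best_cost = predicted_cost(best, goal_position)
    for _ in range(steps):
        candidate = tuple(p + rng.gauss(0.0, sigma) for p in best)
        cost = predicted_cost(candidate, goal_position)
        if cost < best_cost:  # keep only improving candidates
            best, best_cost = candidate, cost
    return best, best_cost


best_params, best_cost = optimize_params((0.0, 0.0), goal_position=(0.55, 0.20))
print(best_params, best_cost)
```

Because the predictor takes the environment observation as an input, the same loop can in principle be rerun whenever the environment changes. A differentiable predictor would also allow gradient-based updates in place of random search.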
Aug-22-2024
- Country:
  - Asia > Japan
    - Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
  - Europe > Germany
    - Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
  - North America > United States
    - Massachusetts > Middlesex County > Cambridge (0.14)
- Genre:
  - Research Report (0.52)
- Technology:
  - Information Technology > Artificial Intelligence
    - Machine Learning
      - Evolutionary Systems (0.68)
      - Neural Networks > Deep Learning (0.34)
      - Reinforcement Learning (0.69)
    - Representation & Reasoning > Optimization (0.67)
    - Robots (1.00)