AITopics | Grotz, Markus

Collaborating Authors

Grotz, Markus


TetraGrip: Sensor-Driven Multi-Suction Reactive Object Manipulation in Cluttered Scenes

Torrado, Paolo, Levin, Joshua, Grotz, Markus, Smith, Joshua

arXiv.org Artificial Intelligence, Mar-11-2025

Warehouse robotic systems equipped with vacuum grippers must reliably grasp a diverse range of objects from densely packed shelves. However, these environments present significant challenges, including occlusions, diverse object orientations, stacked and obstructed items, and surfaces that are difficult to suction. We introduce TetraGrip, a novel vacuum-based grasping strategy featuring four suction cups mounted on linear actuators. Each actuator is equipped with an optical time-of-flight (ToF) proximity sensor, enabling reactive grasping. We evaluate TetraGrip in a warehouse-style setting, demonstrating its ability to manipulate objects in stacked and obstructed configurations. Our results show that our RL-based policy improves picking success in stacked-object scenarios by 22.86% compared to a single-suction gripper. Additionally, we demonstrate that TetraGrip can successfully grasp objects in scenarios where a single-suction gripper fails due to physical limitations, specifically in two cases: (1) picking an object occluded by another object and (2) retrieving an object in a complex scenario. These findings highlight the advantages of multi-actuated, suction-based grasping in unstructured warehouse environments. The project website is available at: https://tetragrip.github.io/

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2503.08978

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation

Fang, Haoquan, Grotz, Markus, Pumacay, Wilbert, Wang, Yi Ru, Fox, Dieter, Krishna, Ranjay, Duan, Jiafei

arXiv.org Artificial Intelligence, Feb-11-2025

Robotic manipulation systems operating in diverse, dynamic environments must exhibit three critical abilities: multitask interaction, generalization to unseen scenarios, and spatial memory. While significant progress has been made in robotic manipulation, existing approaches often fall short in generalizing to complex environmental variations and in addressing memory-dependent tasks. To bridge this gap, we introduce SAM2Act, a multi-view robotic transformer-based policy that leverages multi-resolution upsampling with visual representations from a large-scale foundation model. SAM2Act achieves a state-of-the-art average success rate of 86.8% across 18 tasks in the RLBench benchmark, and demonstrates robust generalization on The Colosseum benchmark, with only a 4.3% performance gap under diverse environmental perturbations. Building on this foundation, we propose SAM2Act+, a memory-based architecture inspired by SAM2, which incorporates a memory bank, an encoder, and an attention mechanism to enhance spatial memory. To address the need for evaluating memory-dependent tasks, we introduce MemoryBench, a novel benchmark designed to assess spatial memory and action recall in robotic manipulation. SAM2Act+ achieves competitive performance on MemoryBench, significantly outperforming existing approaches and pushing the boundaries of memory-enabled robotic systems. Project page: https://sam2act.github.io/

machine learning, reinforcement learning, sam2act, (17 more...)

arXiv.org Artificial Intelligence

2501.18564

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)
(2 more...)


OptiGrasp: Optimized Grasp Pose Detection Using RGB Images for Warehouse Picking Robots

Atar, Soofiyan, Li, Yi, Grotz, Markus, Wolf, Michael, Fox, Dieter, Smith, Joshua

arXiv.org Artificial Intelligence, Sep-28-2024

In warehouse environments, robots require robust picking capabilities to manage a wide variety of objects. Effective deployment demands minimal hardware, strong generalization to new products, and resilience in diverse settings. Current methods often rely on depth sensors for structural information, which suffer from high costs, complex setups, and technical limitations. Inspired by recent advancements in computer vision, we propose an innovative approach that leverages foundation models to enhance suction grasping using only RGB images. Trained solely on a synthetic dataset, our method generalizes its grasp prediction capabilities to real-world robots and a diverse range of novel objects not included in the training set. Our network achieves an 82.3% success rate in real-world applications. The project website with code and data will be available at http://optigrasp.github.io.

artificial intelligence, optigrasp, survey article, (16 more...)

arXiv.org Artificial Intelligence

2409.19494

Genre:

Overview > Innovation (0.48)
Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Robots > Manipulation (0.68)


PerAct2: A Perceiver Actor Framework for Bimanual Manipulation Tasks

Grotz, Markus, Shridhar, Mohit, Asfour, Tamim, Fox, Dieter

arXiv.org Artificial Intelligence, Jun-28-2024

Humans seamlessly manipulate and interact with their environment using both hands. With both hands, humans achieve greater efficiency through enhanced reachability and can solve more sophisticated tasks. Despite the recent advances in grasping and manipulation planning [3, 4], the investigation of bimanual manipulation remains an under-explored area, especially in terms of learning a manipulation policy. Unlike tasks that require grasping or manipulation with a single hand, bimanual manipulation introduces a layer of complexity due to the need for spatial and temporal coordination and a deep understanding of the task at hand. This complexity is compounded by the dynamic nature of real-world tasks, where the state of the environment and the objects within it are constantly changing, demanding continuous adjustment and coordination between both arms.

Figure 1: Selected bimanual tasks from the benchmark as well as real-world examples. Due to the architecture design, the method can easily be transferred to other robots as the policy outputs a 6-D pose and is agnostic to the underlying controller.

artificial intelligence, manipulation, robot, (16 more...)

arXiv.org Artificial Intelligence

2407.00278

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.46)


STOW: Discrete-Frame Segmentation and Tracking of Unseen Objects for Warehouse Picking Robots

Li, Yi, Zhang, Muru, Grotz, Markus, Mo, Kaichun, Fox, Dieter

arXiv.org Artificial Intelligence, Nov-4-2023

Segmentation and tracking of unseen object instances in discrete frames pose a significant challenge in dynamic industrial robotic contexts, such as distribution warehouses. Here, robots must handle object rearrangement, including shifting, removal, and partial occlusion by new items, and track these items after substantial temporal gaps. The task is further complicated when robots encounter objects not learned in their training sets, which requires the ability to segment and track previously unseen items. Considering that continuous observation is often inaccessible in such settings, our task involves working with a discrete set of frames separated by indefinite periods during which substantial changes to the scene may occur. This task also translates to domestic robotic applications, such as rearrangement of objects on a table. To address these demanding challenges, we introduce new synthetic and real-world datasets that replicate these industrial and household scenarios. We also propose a novel paradigm for joint segmentation and tracking in discrete frames along with a transformer module that facilitates efficient inter-frame communication. The experiments we conduct show that our approach significantly outperforms recent methods. For additional results and videos, please visit https://sites.google.com/view/stow-corl23. Code and dataset will be released.

artificial intelligence, machine learning, segmentation, (15 more...)

arXiv.org Artificial Intelligence

2311.02337

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)


A Memory System of a Robot Cognitive Architecture and its Implementation in ArmarX

Peller-Konrad, Fabian, Kartmann, Rainer, Dreher, Christian R. G., Meixner, Andre, Reister, Fabian, Grotz, Markus, Asfour, Tamim

arXiv.org Artificial Intelligence, Jan-31-2023

Cognitive agents such as humans and robots perceive their environment through an abundance of sensors producing streams of data that need to be processed to generate intelligent behavior. A key question of cognition-enabled and AI-driven robotics is how to organize and manage knowledge efficiently in a cognitive robot control architecture. We argue that memory is a central active component of such architectures that mediates between semantic and sensorimotor representations, orchestrates the flow of data streams and events between different processes, and provides the components of a cognitive architecture with data-driven services for the abstraction of semantics from sensorimotor data, the parametrization of symbolic plans for execution, and the prediction of action effects. Based on related work and the experience gained in developing our ARMAR humanoid robot systems, we identified conceptual and technical requirements of a memory system as a central component of a cognitive robot control architecture that facilitate the realization of high-level cognitive abilities such as explaining, reasoning, prospection, simulation and augmentation. Conceptually, a memory should be active, support multi-modal data representations, associate knowledge, be introspective, and have an inherently episodic structure. Technically, the memory should support a distributed design, be access-efficient and capable of long-term data storage. We introduce the memory system for our cognitive robot control architecture and its implementation in the robot software framework ArmarX. We evaluate the efficiency of the memory system with respect to transfer speeds, compression, reproduction and prediction capabilities.

artificial intelligence, information, survey article, (10 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.robot.2023.104415

2206.02241

Country: Europe > Germany (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (1.00)
