Kragic, Danica
Grasping a Handful: Sequential Multi-Object Dexterous Grasp Generation
Lu, Haofei, Dong, Yifei, Weng, Zehang, Lundell, Jens, Kragic, Danica
We introduce the sequential multi-object robotic grasp sampling algorithm SeqGrasp, which robustly synthesizes stable grasps on diverse objects using only a subset of the robotic hand's Degrees of Freedom (DoF) at a time. We use SeqGrasp to construct the large-scale Allegro Hand sequential grasping dataset SeqDataset and use it to train the diffusion-based sequential grasp generator SeqDiffuser. We experimentally evaluate SeqGrasp and SeqDiffuser against the state-of-the-art non-sequential multi-object grasp generation method MultiGrasp in simulation and on a real robot. Furthermore, SeqDiffuser is approximately 1000 times faster at generating grasps than SeqGrasp and MultiGrasp. Generation of dexterous grasps has been studied for a long time, both from a technical perspective on generating grasps with robots [1]-[11] and from the perspective of understanding human grasping [12]-[15]. Most of these methods rely on bringing the robotic hand close to the object and then simultaneously enveloping it with all fingers. While this strategy often results in efficient and successful grasp generation, it simplifies dexterous grasping to resemble parallel-jaw grasping, thereby underutilizing the many DoF of multi-fingered robotic hands [10]. In contrast, grasping multiple objects with a robotic hand, particularly in a sequential manner that mirrors human-like dexterity, as shown in Figure 1, remains an unsolved problem. In this work, we introduce SeqGrasp, a novel hand-agnostic algorithm for generating sequential multi-object grasps.
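As a rough illustration of the diffusion-based generation step (not the paper's implementation), the following Python sketch shows DDPM-style reverse sampling over a vector of hand joint angles; the denoiser is a hypothetical placeholder for a trained network conditioned on the scene and on previously placed fingers.

import numpy as np

# Minimal DDPM-style reverse sampling loop over a hand joint configuration.
# The denoiser below is a hypothetical placeholder standing in for a trained
# network conditioned on the object and previously placed fingers.
N_DOF = 16          # e.g., Allegro Hand joint angles
T = 50              # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def denoiser(q_t, t, context):
    """Hypothetical noise predictor eps_theta(q_t, t | context)."""
    return np.zeros_like(q_t)  # placeholder: a trained model would go here

def sample_grasp(context, rng=np.random.default_rng(0)):
    q = rng.standard_normal(N_DOF)          # start from Gaussian noise
    for t in reversed(range(T)):
        eps = denoiser(q, t, context)
        # posterior mean of the reverse diffusion step
        q = (q - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            q += np.sqrt(betas[t]) * rng.standard_normal(N_DOF)
    return q  # candidate joint configuration for the next grasp in the sequence

grasp = sample_grasp(context={"object_points": None})
print(grasp.shape)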
Pushing Everything Everywhere All At Once: Probabilistic Prehensile Pushing
Perugini, Patrizio, Lundell, Jens, Friedl, Katharina, Kragic, Danica
We address prehensile pushing, the problem of manipulating a grasped object by pushing it against the environment. Our solution is an efficient nonlinear trajectory optimization obtained by relaxing an exact mixed-integer nonlinear trajectory optimization formulation. The critical insight is recasting the external pushers (environment) as a discrete probability distribution instead of binary variables and minimizing the entropy of that distribution. The probabilistic reformulation allows all pushers to be used simultaneously, but at the optimum the probability mass concentrates onto one due to the entropy minimization. We numerically compare our method against a state-of-the-art sampling-based baseline on a prehensile pushing task. The results demonstrate that our method finds trajectories 8 times faster and at a 20 times lower cost than the baseline. Finally, we demonstrate that a simulated and a real Franka Panda robot can successfully manipulate different objects following the trajectories proposed by our method. Supplementary materials are available at https://probabilistic-prehensile-pushing.github.io/.
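A toy Python sketch of the core relaxation idea, under simplified assumptions (a single decision over a handful of hypothetical pushers rather than a full trajectory): the binary pusher-selection variable is replaced by a softmax-parameterized distribution, and an entropy penalty drives the probability mass onto a single pusher at the optimum.

import numpy as np

pusher_cost = np.array([3.0, 1.2, 2.5, 4.0])   # hypothetical per-pusher costs
lam = 0.5                                      # entropy penalty weight
theta = np.zeros_like(pusher_cost)             # softmax logits over pushers

def softmax(t):
    e = np.exp(t - t.max())
    return e / e.sum()

for _ in range(500):
    p = softmax(theta)
    # gradient of  E_p[cost] + lam * H(p)  with respect to the logits
    grad_p = pusher_cost - lam * (np.log(p + 1e-12) + 1.0)
    theta -= 0.5 * p * (grad_p - p @ grad_p)

print(np.round(softmax(theta), 3))   # mass concentrates on the cheapest pusher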
FLAME: A Federated Learning Benchmark for Robotic Manipulation
Betran, Santiago Bou, Longhini, Alberta, Vasco, Miguel, Zhang, Yuchong, Kragic, Danica
Recent progress in robotic manipulation has been fueled by large-scale datasets collected across diverse environments. Training robotic manipulation policies on these datasets is traditionally performed in a centralized manner, raising concerns regarding scalability, adaptability, and data privacy. While federated learning enables decentralized, privacy-preserving training, its application to robotic manipulation remains largely unexplored. We introduce FLAME (Federated Learning Across Manipulation Environments), the first benchmark designed for federated learning in robotic manipulation. FLAME consists of: (i) a set of large-scale datasets of over 160,000 expert demonstrations of multiple manipulation tasks, collected across a wide range of simulated environments; (ii) a training and evaluation framework for robotic policy learning in a federated setting. We evaluate standard federated learning algorithms in FLAME, showing their potential for distributed policy learning and highlighting key challenges. Our benchmark establishes a foundation for scalable, adaptive, and privacy-aware robotic learning.
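As an illustration of the kind of standard federated learning algorithm such a benchmark evaluates, here is a minimal Federated Averaging (FedAvg) sketch; the linear "policy", synthetic client data, and function names are assumptions for illustration, not FLAME's actual API.

import numpy as np

def local_update(weights, data, targets, lr=0.1, epochs=5):
    # Plain gradient descent on a quadratic surrogate loss at one client.
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * data.T @ (data @ w - targets) / len(targets)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
n_clients, dim = 4, 8
global_w = np.zeros(dim)
clients = [(rng.standard_normal((50, dim)), rng.standard_normal(50))
           for _ in range(n_clients)]

for communication_round in range(20):
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    # weighted average of client models, proportional to local dataset size
    global_w = np.average(local_ws, axis=0, weights=sizes)

print(global_w.round(3))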
S$^2$-Diffusion: Generalizing from Instance-level to Category-level Skills in Robot Manipulation
Yang, Quantao, Welle, Michael C., Kragic, Danica, Andersson, Olov
Recent advances in skill learning have propelled robot manipulation to new heights by enabling robots to learn complex manipulation tasks from a practical number of demonstrations. However, these skills are often limited to the particular action, object, and environment \textit{instances} shown in the training data and have trouble transferring to other instances of the same category. In this work we present an open-vocabulary Spatial-Semantic Diffusion policy (S$^2$-Diffusion) that enables generalization from instance-level training data to the category level, making skills transferable between instances of the same category. We show that functional aspects of skills can be captured via a promptable semantic module combined with a spatial representation. We further propose leveraging depth estimation networks to allow the use of only a single RGB camera. Our approach is evaluated and compared on a diverse set of robot manipulation tasks, both in simulation and in the real world. Our results show that S$^2$-Diffusion is invariant to changes in category-irrelevant factors and achieves satisfactory performance on other instances within the same category, even when not trained on those specific instances. Full videos of all real-world experiments are available in the supplementary material.
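A minimal sketch of the general idea of combining a promptable 2D semantic mask with estimated depth to obtain a 3D spatial-semantic representation from a single RGB camera; the mask, depth map, and camera intrinsics below are synthetic stand-ins, not the paper's modules.

import numpy as np

H, W = 48, 64
fx = fy = 60.0                       # hypothetical camera intrinsics
cx, cy = W / 2.0, H / 2.0

depth = np.full((H, W), 0.8)         # metres, stand-in for a depth network output
mask = np.zeros((H, W), dtype=bool)  # stand-in for a text-prompted segmentation
mask[20:30, 25:40] = True            # pixels belonging to the prompted category

# Back-project masked pixels into camera-frame 3D points.
v, u = np.nonzero(mask)
z = depth[v, u]
x = (u - cx) / fx * z
y = (v - cy) / fy * z
points = np.stack([x, y, z], axis=1)  # (N, 3) spatial-semantic point set

print(points.shape, points.mean(axis=0).round(3))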
Early Detection of Human Handover Intentions in Human-Robot Collaboration: Comparing EEG, Gaze, and Hand Motion
Khanna, Parag, Rajabi, Nona, Kanik, Sumeyra U. Demir, Kragic, Danica, Björkman, Mårten, Smith, Christian
Human-robot collaboration (HRC) relies on accurate and timely recognition of human intentions to ensure seamless interactions. Among common HRC tasks, human-to-robot object handovers have been studied extensively for planning the robot's actions during object reception, under the assumption that the human intends a handover. However, distinguishing handover intentions from other actions has received limited attention. Most research on handovers has focused on visually detecting motion trajectories, which often results in delays or false detections when trajectories overlap. This paper investigates whether human intentions for object handovers are reflected in non-movement-based physiological signals. We conduct a multimodal analysis comparing three data modalities: electroencephalogram (EEG), gaze, and hand-motion signals. Our study aims to distinguish between handover-intended and non-handover motions in an HRC setting, evaluating each modality's performance in predicting and classifying these actions before and after human movement initiation. We develop and evaluate human intention detectors based on these modalities, comparing their accuracy and timing in identifying handover intentions. To the best of our knowledge, this is the first study to systematically develop and test intention detectors across multiple modalities within the same experimental context of human-robot handovers. Our analysis reveals that handover intention can be detected from all three modalities; nevertheless, gaze signals provide both the earliest and the most accurate classification of whether a motion is intended as a handover.
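A minimal sketch of a per-modality intention detector of the kind described above: a classifier over fixed-length feature windows labelled handover vs. non-handover. The synthetic features and the logistic-regression choice are assumptions for illustration, not the study's actual EEG/gaze/motion pipelines.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_class, n_features = 200, 12
# Synthetic window-level features for handover vs. non-handover motions.
handover = rng.normal(0.5, 1.0, size=(n_per_class, n_features))
non_handover = rng.normal(-0.5, 1.0, size=(n_per_class, n_features))
X = np.vstack([handover, non_handover])
y = np.concatenate([np.ones(n_per_class), np.zeros(n_per_class)])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"window-level accuracy: {clf.score(X_te, y_te):.2f}")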
LLM-Driven Augmented Reality Puppeteer: Controller-Free Voice-Commanded Robot Teleoperation
Zhang, Yuchong, Orthmann, Bastian, Welle, Michael C., Van Haastregt, Jonne, Kragic, Danica
The integration of robotics and augmented reality (AR) presents transformative opportunities for advancing human-robot interaction (HRI) by improving usability, intuitiveness, and accessibility. This work introduces a controller-free, LLM-driven voice-commanded AR puppeteering system, enabling users to teleoperate a robot by manipulating its virtual counterpart in real time. By leveraging natural language processing (NLP) and AR technologies, our system -- prototyped using Meta Quest 3 -- eliminates the need for physical controllers, enhancing ease of use while minimizing potential safety risks associated with direct robot operation. A preliminary user demonstration successfully validated the system's functionality, demonstrating its potential for safer, more intuitive, and immersive robotic control.
Humans Co-exist, So Must Embodied Artificial Agents
Kuehn, Hannah, La Delfa, Joseph, Vasco, Miguel, Kragic, Danica, Leite, Iolanda
Modern embodied artificial agents excel in static, predefined tasks but fall short in dynamic and long-term interactions with humans. Humans, on the other hand, can adapt and evolve continuously, exploiting the situated knowledge embedded in their environment and other agents, thus contributing to meaningful interactions. We introduce the concept of co-existence for embodied artificial agents and argue that it is a prerequisite for meaningful, long-term interaction with humans. We take inspiration from biology and design theory to understand how human and non-human organisms foster entities that co-exist within their specific niches. Finally, we propose key research directions for the machine learning community to foster co-existing embodied agents, focusing on the principles, hardware, and learning methods responsible for shaping them.
Human-Aligned Image Models Improve Visual Decoding from the Brain
Rajabi, Nona, Ribeiro, Antônio H., Vasco, Miguel, Taleb, Farzaneh, Björkman, Mårten, Kragic, Danica
Decoding visual images from brain activity has significant potential for advancing brain-computer interaction and enhancing the understanding of human perception. Recent approaches align the representation spaces of images and brain activity to enable visual decoding. In this paper, we introduce the use of human-aligned image encoders to map brain signals to images. We hypothesize that these models more effectively capture perceptual attributes associated with the rapid visual stimuli presentations commonly used in visual brain data recording experiments. Our empirical results support this hypothesis, demonstrating that this simple modification improves image retrieval accuracy by up to 21% compared to state-of-the-art methods. Comprehensive experiments confirm consistent performance improvements across diverse EEG architectures, image encoders, alignment methods, participants, and brain imaging modalities.
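A minimal sketch of retrieval in an aligned brain-image embedding space: rank candidate image embeddings by cosine similarity to a brain embedding and score top-k accuracy. The synthetic embeddings below stand in for the outputs of a brain-signal encoder aligned to a (human-aligned) image encoder.

import numpy as np

rng = np.random.default_rng(0)
n_items, dim = 100, 64
image_emb = rng.standard_normal((n_items, dim))
# Simulate brain embeddings as noisy versions of the matching image embeddings.
brain_emb = image_emb + 0.5 * rng.standard_normal((n_items, dim))

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

sims = normalize(brain_emb) @ normalize(image_emb).T      # cosine similarities
ranks = np.argsort(-sims, axis=1)
top5 = np.mean([i in ranks[i, :5] for i in range(n_items)])
print(f"top-5 retrieval accuracy: {top5:.2f}")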
Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision
Longhini, Alberta, Büsching, Marcel, Duisterhof, Bardienus P., Lundell, Jens, Ichnowski, Jeffrey, Björkman, Mårten, Kragic, Danica
Teaching robots to fold, drape, or otherwise manipulate deformable objects such as cloth is fundamental to unlocking a variety of applications ranging from healthcare to domestic and industrial environments [1]. While considerable progress has been made in rigid-object manipulation, manipulating deformables poses unique challenges, including infinite-dimensional state spaces, complex physical dynamics, and state estimation of self-occluded configurations [2]. Specifically, the problem of state estimation has led existing works on visual manipulation to either rely exclusively on 2D images, overlooking the cloth's 3D structure [3, 4, 5], or to use 3D representations that neglect valuable information in RGB observations [6, 7, 8]. Prior work on cloth state estimation often relies on 3D particle-based representations derived from depth sensors, including graphs [9, 10] and point clouds [11]. While point clouds effectively capture the object's observable state, they lack comprehensive structural information [6].
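For concreteness, a minimal sketch of the graph-based particle representation mentioned above: a rectangular grid of cloth particles connected by neighbour edges. Grid size and spacing are arbitrary assumptions; a real system would estimate and track these particles from visual observations.

import numpy as np

rows, cols, spacing = 10, 8, 0.02
# Particle positions on a flat cloth (z = 0), shape (rows*cols, 3).
ii, jj = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
positions = np.stack([ii.ravel() * spacing, jj.ravel() * spacing,
                      np.zeros(rows * cols)], axis=1)

def node_id(i, j):
    return i * cols + j

edges = []
for i in range(rows):
    for j in range(cols):
        if i + 1 < rows:
            edges.append((node_id(i, j), node_id(i + 1, j)))  # vertical edge
        if j + 1 < cols:
            edges.append((node_id(i, j), node_id(i, j + 1)))  # horizontal edge
edges = np.array(edges)

print(positions.shape, edges.shape)   # (80, 3) nodes, (142, 2) edges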
A Riemannian Framework for Learning Reduced-order Lagrangian Dynamics
Friedl, Katharina, Jaquier, Noémie, Lundell, Jens, Asfour, Tamim, Kragic, Danica
By incorporating physical consistency as inductive bias, deep neural networks display increased generalization capabilities and data efficiency in learning nonlinear dynamic models. However, the complexity of these models generally increases with the system dimensionality, requiring larger datasets, more complex deep networks, and significant computational effort. We propose a novel geometric network architecture to learn physically-consistent reduced-order dynamic parameters that accurately describe the original high-dimensional system behavior. This is achieved by building on recent advances in model-order reduction and by adopting a Riemannian perspective to jointly learn a nonlinear structure-preserving latent space and the associated low-dimensional dynamics. Our approach enables accurate long-term predictions of the high-dimensional dynamics of rigid and deformable systems with increased data efficiency by inferring interpretable and physically plausible reduced Lagrangian models.
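A toy sketch of rolling out low-dimensional Lagrangian dynamics L(z, zdot) = 0.5 * zdot^T M zdot - V(z); in the paper, the mass matrix, potential, and the latent space itself are learned, whereas here they are fixed to a small mass-spring system purely to illustrate long-horizon prediction from a reduced model.

import numpy as np

dim = 2
M = np.diag([1.0, 2.0])            # latent mass matrix (constant in this toy)
K = np.diag([4.0, 1.0])            # stiffness defining V(z) = 0.5 * z^T K z
M_inv = np.linalg.inv(M)

def acceleration(z):
    # Euler-Lagrange with constant M:  M zdd = -dV/dz  =>  zdd = -M^{-1} K z
    return -M_inv @ (K @ z)

z, zdot, dt = np.array([1.0, 0.5]), np.zeros(dim), 0.01
trajectory = []
for _ in range(2000):              # 20 s rollout with semi-implicit Euler
    zdot = zdot + dt * acceleration(z)
    z = z + dt * zdot
    trajectory.append(z.copy())

trajectory = np.array(trajectory)
print(trajectory.shape, trajectory[-1].round(3))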