AITopics | Garg, Animesh

Collaborating Authors

Garg, Animesh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Scaling Laws in Scientific Discovery with AI and Robot Scientists

Zhang, Pengsong, Zhang, Heng, Xu, Huazhe, Xu, Renjun, Wang, Zhenting, Wang, Cong, Garg, Animesh, Li, Zhibin, Ajoudani, Arash, Liu, Xinyu

arXiv.org Artificial IntelligenceApr-3-2025

Scientific discovery is poised for rapid advancement through advanced robotics and artificial intelligence. Current scientific practices face substantial limitations as manual experimentation remains time-consuming and resource-intensive, while multidisciplinary research demands knowledge integration beyond individual researchers' expertise boundaries. Here, we envision an autonomous generalist scientist (AGS) concept combines agentic AI and embodied robotics to automate the entire research lifecycle. This system could dynamically interact with both physical and virtual environments while facilitating the integration of knowledge across diverse scientific disciplines. By deploying these technologies throughout every research stage -- spanning literature review, hypothesis generation, experimentation, and manuscript writing -- and incorporating internal reflection alongside external feedback, this system aims to significantly reduce the time and resources needed for scientific discovery. Building on the evolution from virtual AI scientists to versatile generalist AI-based robot scientists, AGS promises groundbreaking potential. As these autonomous systems become increasingly integrated into the research process, we hypothesize that scientific discovery might adhere to new scaling laws, potentially shaped by the number and capabilities of these autonomous systems, offering novel perspectives on how knowledge is generated and evolves. The adaptability of embodied robots to extreme environments, paired with the flywheel effect of accumulating scientific knowledge, holds the promise of continually pushing beyond both physical and intellectual frontiers.

artificial intelligence, scientific discovery, survey article, (15 more...)

arXiv.org Artificial Intelligence

2503.22444

Genre:

Research Report (0.50)
Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.87)

Add feedback

Adapt3R: Adaptive 3D Scene Representation for Domain Transfer in Imitation Learning

Wilcox, Albert, Ghanem, Mohamed, Moghani, Masoud, Barroso, Pierre, Joffe, Benjamin, Garg, Animesh

arXiv.org Artificial IntelligenceMar-6-2025

Imitation Learning (IL) has been very effective in training robots to perform complex and diverse manipulation tasks. However, its performance declines precipitously when the observations are out of the training distribution. 3D scene representations that incorporate observations from calibrated RGBD cameras have been proposed as a way to improve generalizability of IL policies, but our evaluations in cross-embodiment and novel camera pose settings found that they show only modest improvement. To address those challenges, we propose Adaptive 3D Scene Representation (Adapt3R), a general-purpose 3D observation encoder which uses a novel architecture to synthesize data from one or more RGBD cameras into a single vector that can then be used as conditioning for arbitrary IL algorithms. The key idea is to use a pretrained 2D backbone to extract semantic information about the scene, using 3D only as a medium for localizing this semantic information with respect to the end-effector. We show that when trained end-to-end with several SOTA multi-task IL algorithms, Adapt3R maintains these algorithms' multi-task learning capacity while enabling zero-shot transfer to novel embodiments and camera poses. Furthermore, we provide a detailed suite of ablation and sensitivity experiments to elucidate the design space for point cloud observation encoders.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.04877

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
(2 more...)

Add feedback

AnyPlace: Learning Generalized Object Placement for Robot Manipulation

Zhao, Yuchi, Bogdanovic, Miroslav, Luo, Chengyuan, Tohme, Steven, Darvish, Kourosh, Aspuru-Guzik, Alán, Shkurti, Florian, Garg, Animesh

arXiv.org Artificial IntelligenceFeb-6-2025

Object placement in robotic tasks is inherently challenging due to the diversity of object geometries and placement configurations. To address this, we propose AnyPlace, a two-stage method trained entirely on synthetic data, capable of predicting a wide range of feasible placement poses for real-world tasks. Our key insight is that by leveraging a Vision-Language Model (VLM) to identify rough placement locations, we focus only on the relevant regions for local placement, which enables us to train the low-level placement-pose-prediction model to capture diverse placements efficiently. For training, we generate a fully synthetic dataset of randomly generated objects in different placement configurations (insertion, stacking, hanging) and train local placement-prediction models. We conduct extensive evaluations in simulation, demonstrating that our method outperforms baselines in terms of success rate, coverage of possible placement modes, and precision. In real-world experiments, we show how our approach directly transfers models trained purely on synthetic data to the real world, where it successfully performs placements in scenarios where other models struggle -- such as with varying object geometries, diverse placement modes, and achieving high precision for fine placement. More at: https://any-place.github.io.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2502.04531

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Robots > Manipulation (0.41)

Add feedback

Accelerating Discovery in Natural Science Laboratories with AI and Robotics: Perspectives and Challenges from the 2024 IEEE ICRA Workshop, Yokohama, Japan

Cooper, Andrew I., Courtney, Patrick, Darvish, Kourosh, Eckhoff, Moritz, Fakhruldeen, Hatem, Gabrielli, Andrea, Garg, Animesh, Haddadin, Sami, Harada, Kanako, Hein, Jason, Hübner, Maria, Knobbe, Dennis, Pizzuto, Gabriella, Shkurti, Florian, Shrestha, Ruja, Thurow, Kerstin, Vescovi, Rafael, Vogel-Heuser, Birgit, Wolf, Ádám, Yoshikawa, Naruki, Zeng, Yan, Zhou, Zhengxue, Zwirnmann, Henning

arXiv.org Artificial IntelligenceJan-12-2025

Fundamental breakthroughs across many scientific disciplines are becoming increasingly rare (1). At the same time, challenges related to the reproducibility and scalability of experiments, especially in the natural sciences (2,3), remain significant obstacles. For years, automating scientific experiments has been viewed as the key to solving this problem. However, existing solutions are often rigid and complex, designed to address specific experimental tasks with little adaptability to protocol changes. With advancements in robotics and artificial intelligence, new possibilities are emerging to tackle this challenge in a more flexible and human-centric manner.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.06847

Country:

North America (1.00)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.40)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

RoCoDA: Counterfactual Data Augmentation for Data-Efficient Robot Learning from Demonstrations

Ameperosa, Ezra, Collins, Jeremy A., Jain, Mrinal, Garg, Animesh

arXiv.org Artificial IntelligenceNov-25-2024

Imitation learning in robotics faces significant challenges in generalization due to the complexity of robotic environments and the high cost of data collection. We introduce RoCoDA, a novel method that unifies the concepts of invariance, equivariance, and causality within a single framework to enhance data augmentation for imitation learning. RoCoDA leverages causal invariance by modifying task-irrelevant subsets of the environment state without affecting the policy's output. Simultaneously, we exploit SE(3) equivariance by applying rigid body transformations to object poses and adjusting corresponding actions to generate synthetic demonstrations. We validate RoCoDA through extensive experiments on five robotic manipulation tasks, demonstrating improvements in policy performance, generalization, and sample efficiency compared to state-of-the-art data augmentation methods. Our policies exhibit robust generalization to unseen object poses, textures, and the presence of distractors. Furthermore, we observe emergent behavior such as re-grasping, indicating policies trained with RoCoDA possess a deeper understanding of task dynamics. By leveraging invariance, equivariance, and causality, RoCoDA provides a principled approach to data augmentation in imitation learning, bridging the gap between geometric symmetries and causal reasoning.

artificial intelligence, augmentation, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2411.16959

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

DiffSim2Real: Deploying Quadrupedal Locomotion Policies Purely Trained in Differentiable Simulation

Bagajo, Joshua, Schwarke, Clemens, Klemm, Victor, Georgiev, Ignat, Sleiman, Jean-Pierre, Tordesillas, Jesus, Garg, Animesh, Hutter, Marco

arXiv.org Artificial IntelligenceNov-4-2024

Abstract-- Differentiable simulators provide analytic gradients, enabling more sample-efficient learning algorithms and paving the way for data intensive learning tasks such as learning from images. In this work, we demonstrate that locomotion policies trained with analytic gradients from a differentiable simulator can be successfully transferred to the real world. Typically, simulators that offer informative gradients lack the physical accuracy needed for sim-to-real transfer, and viceversa. A key factor in our success is a smooth contact model that combines informative gradients with physical accuracy, ensuring effective transfer of learned behaviors. To the best of our knowledge, this is the first time a real quadrupedal robot is able to locomote after training exclusively in a differentiable simulation. The majority of Reinforcement Learning (RL) algorithms rely on Zeroth-order Gradient (ZoG) estimates during optimization, allowing the use of conventional physics simulators that are typically non-differentiable.

machine learning, reinforcement learning, simulation, (16 more...)

arXiv.org Artificial Intelligence

2411.02189

Country: Europe > Switzerland (0.29)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Add feedback

Discovering Robotic Interaction Modes with Discrete Representation Learning

Wang, Liquan, Goyal, Ankit, Xu, Haoping, Garg, Animesh

arXiv.org Artificial IntelligenceOct-26-2024

Human actions manipulating articulated objects, such as opening and closing a drawer, can be categorized into multiple modalities we define as interaction modes. Traditional robot learning approaches lack discrete representations of these modes, which are crucial for empirical sampling and grounding. In this paper, we present ActAIM2, which learns a discrete representation of robot manipulation interaction modes in a purely unsupervised fashion, without the use of expert labels or simulator-based privileged information. Utilizing novel data collection methods involving simulator rollouts, ActAIM2 consists of an interaction mode selector and a low-level action predictor. The selector generates discrete representations of potential interaction modes with self-supervision, while the predictor outputs corresponding action trajectories. Our method is validated through its success rate in manipulating articulated objects and its robustness in sampling meaningful actions from the discrete representation. Extensive experiments demonstrate ActAIM2's effectiveness in enhancing manipulability and generalizability over baselines and ablation studies. For videos and additional results, see our website: https://actaim2.github.io/.

artificial intelligence, interaction mode, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.20258

Country: North America (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation

Zhou, Zihan, Garg, Animesh, Fox, Dieter, Garrett, Caelan, Mandlekar, Ajay

arXiv.org Artificial IntelligenceOct-23-2024

Robot learning has proven to be a general and effective technique for programming manipulators. Imitation learning is able to teach robots solely from human demonstrations but is bottlenecked by the capabilities of the demonstrations. Reinforcement learning uses exploration to discover better behaviors; however, the space of possible improvements can be too large to start from scratch. And for both techniques, the learning difficulty increases proportional to the length of the manipulation task. Accounting for this, we propose SPIRE, a system that first uses Task and Motion Planning (TAMP) to decompose tasks into smaller learning subproblems and second combines imitation and reinforcement learning to maximize their strengths. We develop novel strategies to train learning agents when deployed in the context of a planning system. We evaluate SPIRE on a suite of long-horizon and contact-rich robot manipulation problems. We find that SPIRE outperforms prior approaches that integrate imitation learning, reinforcement learning, and planning by 35% to 50% in average task performance, is 6 times more data efficient in the number of human demonstrations needed to train proficient agents, and learns to complete tasks nearly twice as efficiently. View https://sites.google.com/view/spire-corl-2024 for more details.

handoff section, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2410.18065

Genre: Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

CLIMB: Language-Guided Continual Learning for Task Planning with Iterative Model Building

Byrnes, Walker, Bogdanovic, Miroslav, Balakirsky, Avi, Balakirsky, Stephen, Garg, Animesh

arXiv.org Artificial IntelligenceOct-17-2024

Intelligent and reliable task planning is a core capability for generalized robotics, requiring a descriptive domain representation that sufficiently models all object and state information for the scene. We present CLIMB, a continual learning framework for robot task planning that leverages foundation models and execution feedback to guide domain model construction. CLIMB can build a model from a natural language description, learn non-obvious predicates while solving tasks, and store that information for future problems. We demonstrate the ability of CLIMB to improve performance in common planning environments compared to baseline methods. We also develop the BlocksWorld++ domain, a simulated environment with an easily usable real counterpart, together with a curriculum of tasks with progressing difficulty for evaluating continual learning. Additional details and demonstrations for this system can be found at https://plan-with-climb.github.io/ .

large language model, natural language, predicate, (17 more...)

arXiv.org Artificial Intelligence

2410.13756

Country: North America (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

PWM: Policy Learning with Large World Models

Georgiev, Ignat, Giridhar, Varun, Hansen, Nicklas, Garg, Animesh

arXiv.org Artificial IntelligenceJul-3-2024

Reinforcement Learning (RL) has achieved impressive results on complex tasks but struggles in multi-task settings with different embodiments. World models offer scalability by learning a simulation of the environment, yet they often rely on inefficient gradient-free optimization methods. We introduce Policy learning with large World Models (PWM), a novel model-based RL algorithm that learns continuous control policies from large multi-task world models. By pre-training the world model on offline data and using it for first-order gradient policy learning, PWM effectively solves tasks with up to 152 action dimensions and outperforms methods using ground-truth dynamics. Additionally, PWM scales to an 80-task setting, achieving up to 27% higher rewards than existing baselines without the need for expensive online planning. Visualizations and code available at https://www.imgeorgiev.com/pwm

machine learning, reinforcement learning, world model, (16 more...)

arXiv.org Artificial Intelligence

2407.02466

Country:

North America (0.14)
Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback