AITopics

2211.09786

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.68)

arXiv.org Artificial IntelligenceDec-9-2021

Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation

Simeonov, Anthony, Du, Yilun, Tagliasacchi, Andrea, Tenenbaum, Joshua B., Rodriguez, Alberto, Agrawal, Pulkit, Sitzmann, Vincent

We present Neural Descriptor Fields (NDFs), an object representation that encodes both points and relative poses between an object and a target (such as a robot gripper or a rack used for hanging) via category-level descriptors. We employ this representation for object manipulation, where given a task demonstration, we want to repeat the same task on a new object instance from the same category. We propose to achieve this objective by searching (via optimization) for the pose whose descriptor matches that observed in the demonstration. NDFs are conveniently trained in a self-supervised fashion via a 3D auto-encoding task that does not rely on expert-labeled keypoints. Further, NDFs are SE(3)-equivariant, guaranteeing performance that generalizes across all possible 3D object translations and rotations. We demonstrate learning of manipulation tasks from few (5-10) demonstrations both in simulation and on a real robot. Our performance generalizes across both object instances and 6-DoF object poses, and significantly outperforms a recent baseline that relies on 2D descriptors. Project website: https://yilundu.github.io/ndf/.

artificial intelligence, demonstration, machine learning, (18 more...)

2112.05124

Country:

North America > United States (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Artificial IntelligenceNov-9-2021

A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation

Aceituno, Bernardo, Rodriguez, Alberto, Tulsiani, Shubham, Gupta, Abhinav, Mukadam, Mustafa

Specifying tasks with videos is a powerful technique towards acquiring novel and general robot skills. However, reasoning over mechanics and dexterous interactions can make it challenging to scale learning contact-rich manipulation. In this work, we focus on the problem of visual non-prehensile planar manipulation: given a video of an object in planar motion, find contact-aware robot actions that reproduce the same object motion. We propose a novel architecture, Differentiable Learning for Manipulation (\ours), that combines video decoding neural models with priors from contact mechanics by leveraging differentiable optimization and finite difference based simulation. Through extensive simulated experiments, we investigate the interplay between traditional model-based techniques and modern deep learning approaches. We find that our modular and fully differentiable architecture performs better than learning-only methods on unseen objects and motions. \url{https://github.com/baceituno/dlm}.

artificial intelligence, machine learning, manipulation, (19 more...)

2111.05318

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceDec-30-2020

Robotic Grasping of Fully-Occluded Objects using RF Perception

Boroushaki, Tara, Leng, Junshan, Clester, Ian, Rodriguez, Alberto, Adib, Fadel

We present the design, implementation, and evaluation of RF-Grasp, a robotic system that can grasp fully-occluded objects in unknown and unstructured environments. Unlike prior systems that are constrained by the line-of-sight perception of vision and infrared sensors, RF-Grasp employs RF (Radio Frequency) perception to identify and locate target objects through occlusions, and perform efficient exploration and complex manipulation tasks in non-line-of-sight settings. RF-Grasp relies on an eye-in-hand camera and batteryless RFID tags attached to objects of interest. It introduces two main innovations: (1) an RF-visual servoing controller that uses the RFID's location to selectively explore the environment and plan an efficient trajectory toward an occluded target, and (2) an RF-visual deep reinforcement learning network that can learn and execute efficient, complex policies for decluttering and grasping. We implemented and evaluated an end-to-end physical prototype of RF-Grasp and a state-of-the-art baseline. We demonstrate it improves success rate and efficiency by up to 40-50% in cluttered settings. We also demonstrate RF-Grasp in novel tasks such mechanical search of fully-occluded objects behind obstacles, opening up new possibilities for robotic manipulation.

artificial intelligence, reinforcement learning, rf-grasp, (19 more...)

2012.15436

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

arXiv.org Machine LearningApr-18-2019

Graph Element Networks: adaptive, structured computation and memory

Alet, Ferran, Jeewajee, Adarsh K., Bauza, Maria, Rodriguez, Alberto, Lozano-Perez, Tomas, Kaelbling, Leslie Pack

We explore the use of graph neural networks (GNNs) to model spatial processes in which there is a priori graphical structure. Similar to finite element analysis, we assign nodes of a GNN to spatial locations and use a computational process defined on the graph to model the relationship between an initial function defined over a space and a resulting function in the same space. We use GNNs as a computational substrate, and show that the locations of the nodes in space as well as their connectivity can be optimized to focus on the most complex parts of the space. Moreover, this representational strategy allows the learned input-output relationship to generalize over the size of the underlying space and run the same model at different levels of precision, trading computation for accuracy. We demonstrate this method on a traditional PDE problem, a physical prediction problem from robotics, and a problem of learning to predict scene images from novel viewpoints.

deep learning, neural network, node, (16 more...)

1904.09019

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceMar-27-2019

TossingBot: Learning to Throw Arbitrary Objects with Residual Physics

Zeng, Andy, Song, Shuran, Lee, Johnny, Rodriguez, Alberto, Funkhouser, Thomas

We investigate whether a robot arm can learn to pick and throw arbitrary objects into selected boxes quickly and accurately. Throwing has the potential to increase the physical reachability and picking speed of a robot arm. However, precisely throwing arbitrary objects in unstructured settings presents many challenges: from acquiring reliable pre-throw conditions (e.g. initial pose of object in manipulator) to handling varying object-centric properties (e.g. mass distribution, friction, shape) and dynamics (e.g. aerodynamics). In this work, we propose an end-to-end formulation that jointly learns to infer control parameters for grasping and throwing motion primitives from visual observations (images of arbitrary objects in a bin) through trial and error. Within this formulation, we investigate the synergies between grasping and throwing (i.e., learning grasps that enable more accurate throws) and between simulation and deep learning (i.e., using deep networks to predict residuals on top of control parameters predicted by a physics simulator). The resulting system, TossingBot, is able to grasp and throw arbitrary objects into boxes located outside its maximum reach range at 500+ mean picks per hour (600+ grasps per hour with 85% throwing accuracy); and generalizes to new objects and target locations. Videos are available at https://tossingbot.cs.princeton.edu

deep learning, neural network, tossingbot, (19 more...)

1903.11239

Genre: Research Report > New Finding (0.68)

Industry: Education (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

arXiv.org Machine LearningDec-19-2018

Modular meta-learning in abstract graph networks for combinatorial generalization

Alet, Ferran, Bauza, Maria, Rodriguez, Alberto, Lozano-Perez, Tomas, Kaelbling, Leslie P.

Modular meta-learning is a new framework that generalizes to unseen datasets by combining a small set of neural modules in different ways. In this work we propose abstract graph networks: using graphs as abstractions of a system's subparts without a fixed assignment of nodes to system subparts, for which we would need supervision. We combine this idea with modular meta-learning to get a flexible framework with combinatorial generalization to new tasks built in. We then use it to model the pushing of arbitrarily shaped objects from little or no training data.

artificial intelligence, arxiv preprint arxiv, neural network, (15 more...)

1812.07768

Country:

North America > United States > Massachusetts (0.14)
North America > Canada (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)

arXiv.org Machine LearningMar-27-2018

Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning

Zeng, Andy, Song, Shuran, Welker, Stefan, Lee, Johnny, Rodriguez, Alberto, Funkhouser, Thomas

Skilled robotic manipulation benefits from complex synergies between non-prehensile (e.g. pushing) and prehensile (e.g. grasping) actions: pushing can help rearrange cluttered objects to make space for arms and fingers; likewise, grasping can help displace objects to make pushing movements more precise and collision-free. In this work, we demonstrate that it is possible to discover and learn these synergies from scratch through model-free deep reinforcement learning. Our method involves training two fully convolutional networks that map from visual observations to actions: one infers the utility of pushes for a dense pixel-wise sampling of end effector orientations and locations, while the other does the same for grasping. Both networks are trained jointly in a Q-learning framework and are entirely self-supervised by trial and error, where rewards are provided from successful grasps. In this way, our policy learns pushing motions that enable future grasps, while learning grasps that can leverage past pushes. During picking experiments in both simulation and real-world scenarios, we find that our system quickly learns complex behaviors amid challenging cases of clutter, and achieves better grasping success rates and picking efficiencies than baseline alternatives after only a few hours of training. We further demonstrate that our method is capable of generalizing to novel objects. Qualitative results (videos), code, pre-trained models, and simulation environments are available at http://vpg.cs.princeton.edu

artificial intelligence, machine learning, reinforcement learning, (19 more...)

1803.09956

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Machine LearningFeb-7-2018

GP-SUM. Gaussian Processes Filtering of non-Gaussian Beliefs

Bauza, Maria, Rodriguez, Alberto

This work studies the problem of stochastic dynamic filtering and state propagation with complex beliefs. The main contribution is GP-SUM, a filtering algorithm tailored to dynamic systems and observation models expressed as Gaussian processes (GP), that does not rely on linearizations or unimodal Gaussian approximations of the belief. The algorithm can be seen as a combination of a sampling-based filter and a probabilistic Bayes filter. GP-SUM operates by sampling the state distribution and propagating each sample through the dynamic system and observation models. Effective sampling and accurate probabilistic propagation are possible by relying on the GP form of the system, and a Gaussian mixture form of the belief. We show that GP-SUM outperforms several GP-Bayes and Particle Filters on a standard benchmark. We also illustrate its practical use in a pushing task, and demonstrate that it predicts heteroscedasticity, i.e., different amounts of uncertainty, and multi-modality when naturally occurring in pushing.

artificial intelligence, gp-sum, machine learning, (19 more...)

1709.0812

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.35)

arXiv.org Machine LearningSep-23-2017

A probabilistic data-driven model for planar pushing

Bauza, Maria, Rodriguez, Alberto

This paper presents a data-driven approach to model planar pushing interaction to predict both the most likely outcome of a push and its expected variability. The learned models rely on a variation of Gaussian processes with input-dependent noise called Variational Heteroscedastic Gaussian processes (VHGP) that capture the mean and variance of a stochastic function. We show that we can learn accurate models that outperform analytical models after less than 100 samples and saturate in performance with less than 1000 samples. We validate the results against a collected dataset of repeated trajectories, and use the learned models to study questions such as the nature of the variability in pushing, and the validity of the quasi-static assumption.

artificial intelligence, machine learning, variance, (19 more...)

1704.03033

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)