Goto

Collaborating Authors: Lee, Alex


LGMCTS: Language-Guided Monte-Carlo Tree Search for Executable Semantic Object Rearrangement

arXiv.org Artificial Intelligence

We introduce a novel approach to the executable semantic object rearrangement problem, in which a robot must produce an actionable plan that rearranges objects in a scene according to a pattern described in natural language. Unlike existing methods such as StructFormer and StructDiffusion, which tackle the problem in two stages by first generating goal poses and then invoking a task planner to formulate an action plan, our method addresses pose generation and action planning concurrently. We achieve this integration using a Language-Guided Monte-Carlo Tree Search (LGMCTS). Quantitative evaluations on two simulation datasets are complemented by qualitative tests with a real robot.
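
The core idea, interleaving pose sampling with action-sequence search in a single tree, can be illustrated with a short Monte-Carlo Tree Search sketch in Python. This is a minimal toy, not the paper's implementation: `sample_pose` and `pattern_score` below are hypothetical stand-ins for the language-conditioned sampler and goal score that LGMCTS actually uses.

```python
# Minimal MCTS sketch: each tree node is a partial arrangement, each edge
# places one object at a sampled pose, so the best path is simultaneously
# a pose assignment and an ordered pick-and-place plan.
import math
import random

class Node:
    def __init__(self, placed, remaining, parent=None):
        self.placed = placed        # object name -> sampled (x, y) pose
        self.remaining = remaining  # objects not yet placed
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

def sample_pose():
    # Stand-in sampler; LGMCTS draws poses from a language-guided prior.
    return (random.uniform(0.0, 1.0), random.uniform(0.0, 1.0))

def pattern_score(placed):
    # Toy stand-in for a language-described goal ("arrange in a row"):
    # reward arrangements whose objects lie close to the line y = 0.5.
    if not placed:
        return 0.0
    return -sum(abs(y - 0.5) for _, y in placed.values()) / len(placed)

def ucb(child, parent, c=1.4):
    if child.visits == 0:
        return float("inf")
    return (child.value / child.visits
            + c * math.sqrt(math.log(parent.visits) / child.visits))

def plan_rearrangement(objects, iters=2000, max_children=8):
    root = Node({}, list(objects))
    for _ in range(iters):
        node = root
        # Selection: descend by UCB while the node is fully expanded.
        while node.remaining and len(node.children) >= max_children:
            node = max(node.children, key=lambda ch: ucb(ch, node))
        # Expansion: place the next object at a freshly sampled pose.
        if node.remaining:
            placed = dict(node.placed)
            placed[node.remaining[0]] = sample_pose()
            child = Node(placed, node.remaining[1:], parent=node)
            node.children.append(child)
            node = child
        # Rollout: place leftover objects at random, score the arrangement.
        rollout = dict(node.placed)
        for obj in node.remaining:
            rollout[obj] = sample_pose()
        reward = pattern_score(rollout)
        # Backpropagation.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    # The most-visited root-to-leaf path doubles as the action plan.
    plan, node = [], root
    while node.children:
        child = max(node.children, key=lambda ch: ch.visits)
        obj = next(o for o in child.placed if o not in node.placed)
        plan.append((obj, child.placed[obj]))
        node = child
    return plan

print(plan_rearrangement(["mug", "plate", "fork"]))
```

Because the tree branches over both object order and sampled poses, infeasible or low-scoring placements are pruned by the search itself, which is what removes the need for a separate downstream task planner.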


Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control

arXiv.org Artificial Intelligence

Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains. We present a deep RL method that is practical for real-world robotics tasks, such as robotic manipulation, and generalizes effectively to never-before-seen tasks and objects. In these settings, ground-truth reward signals are typically unavailable, so we propose a self-supervised model-based approach in which a predictive model learns to predict the future directly from raw sensory readings, such as camera images. At test time, we explore three distinct goal specification methods: designated pixels, where a user specifies a desired object manipulation task by selecting particular pixels in an image and corresponding goal positions; goal images, where the desired goal state is specified with an image; and image classifiers, which define spaces of goal states. Our deep predictive models are trained on data collected autonomously and continuously by a robot interacting with hundreds of objects, without human supervision. We demonstrate that visual MPC can generalize to never-before-seen objects, both rigid and deformable, and solve a range of user-defined object manipulation tasks using the same model.
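
The visual MPC loop under the designated-pixel goal can be sketched as sampling-based planning against the predictive model. The snippet below is a toy illustration, not the paper's system: `predict_pixel` is a hypothetical linear stand-in for the learned video-prediction model (which predicts full future images and pixel motion), and simple random shooting replaces the paper's sampling-based optimizer.

```python
# Random-shooting visual MPC: sample action sequences, roll each out
# through the (stand-in) predictive model, score by distance between the
# predicted position of the designated pixel and its goal position, and
# execute only the first action of the best sequence before replanning.
import numpy as np

rng = np.random.default_rng(0)

def predict_pixel(pixel, actions):
    # Placeholder model: each 2-D action simply translates the tracked
    # pixel. The real learned model predicts future camera images.
    return pixel + actions.sum(axis=0)

def plan_action(current_pixel, goal_pixel, horizon=5, n_samples=256):
    candidates = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, 2))
    costs = [np.linalg.norm(predict_pixel(current_pixel, seq) - goal_pixel)
             for seq in candidates]
    return candidates[int(np.argmin(costs))][0]  # first action only (MPC)

# Toy closed-loop run: drive a pixel from (0, 0) toward the goal (10, 4).
pixel, goal = np.array([0.0, 0.0]), np.array([10.0, 4.0])
for _ in range(20):
    pixel = pixel + plan_action(pixel, goal)  # environment step (toy)
print(pixel)  # ends near the goal
```

Replanning at every step is what makes the approach robust to model error: only the first action of each optimized sequence is ever executed.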


Spectral Clustering with Brainstorming Process for Multi-View Data

AAAI Conferences

Clustering tasks often require multiple views rather than a single view to correctly reflect the diverse characteristics of cluster boundaries. Cluster boundaries estimated from a single view are generally inaccurate, and those incorrect estimates should be compensated with the help of other views. If each view were independent of the others, incorrect estimates would mostly be revised as the number of views grows. However, as the number of views grows, it becomes almost impossible to avoid dependencies among views, and such dependencies often mislead the estimation. Thus, dependencies among views should be carefully considered in multi-view clustering. This paper proposes a new spectral clustering method that handles multi-view data and the dependencies among views. The proposed method is motivated by the brainstorming process: an instance is regarded as an agenda to be discussed, while each view is considered a brainstormer. Through the discussion step of the brainstorming process, each brainstormer iteratively suggests its opinions and accepts others' differing opinions. To compensate for the biases caused by information sharing between brainstormers with dependent opinions, brainstormers with independent opinions are encouraged to discuss together more than those with dependent opinions. The conclusion step reaches a compromise by merging or concatenating all opinions, and the clustering is finally performed after this conclusion. Experimental results on three tasks show the effectiveness of the proposed method compared with ordinary single-view and multi-view spectral clustering.
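
A loose sketch of the multi-view intuition, not the paper's iterative brainstorming algorithm: build one affinity matrix per view, weight each view by how independent it is from the others, and run spectral clustering on the combined affinity. The independence weighting via affinity-matrix correlation and all function names below are illustrative assumptions.

```python
# Dependency-aware multi-view spectral clustering sketch: views whose
# affinity matrices are highly correlated with the rest (dependent
# "opinions") are down-weighted before the affinities are combined.
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

def combine_views(views, gamma=0.1):
    affinities = [rbf_kernel(X, gamma=gamma) for X in views]
    flat = np.array([A.ravel() for A in affinities])
    corr = np.corrcoef(flat)  # view-to-view dependency estimate
    # Independent views (low mean correlation with the rest) get more weight.
    weights = 1.0 - (corr.sum(axis=1) - 1.0) / (len(views) - 1)
    weights = np.clip(weights, 1e-3, None)
    weights /= weights.sum()
    return sum(w * A for w, A in zip(weights, affinities))

# Toy data: two views of the same 60 instances from three clusters.
rng = np.random.default_rng(0)
labels_true = np.repeat([0, 1, 2], 20)
view1 = rng.normal(labels_true[:, None] * 3.0, 1.0, size=(60, 5))
view2 = rng.normal(labels_true[:, None] * 3.0, 1.0, size=(60, 8))

affinity = combine_views([view1, view2])
pred = SpectralClustering(n_clusters=3, affinity="precomputed",
                          random_state=0).fit_predict(affinity)
print(pred)
```

The paper's method goes further by iterating this exchange of opinions before the final conclusion step; the one-shot weighted combination above only captures the idea of letting independent views speak louder than dependent ones.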