DASH: Modularized Human Manipulation Simulation with Vision and Language for Embodied AI

Jiang, Yifeng; Guo, Michelle; Li, Jiangshan; Exarchos, Ioannis; Wu, Jiajun; Liu, C. Karen

Aug-27-2021–arXiv.org Artificial Intelligence

Creating virtual humans with embodied, human-like perceptual and actuation constraints has the promise to provide an integrated simulation platform for many scientific and engineering applications. We present Dynamic and Autonomous Simulated Human (DASH), an embodied virtual human that, given natural language commands, performs grasp-and-stack tasks in a physically-simulated cluttered environment solely using its own visual perception, proprioception, and touch, without requiring human motion data. By factoring the DASH system into a vision module, a language module, and manipulation modules of two skill categories, we can mix and match analytical and machine learning techniques for different modules so that DASH is able to not only perform randomly arranged tasks with a high success rate, but also do so under anthropomorphic constraints and with fluid and diverse motions. The modular design also favors analysis and extensibility to more complex manipulation skills.

Figure 1: Our system, dynamic and autonomous simulated human (DASH), is an embodied virtual human modeled off of a child. DASH is able to manipulate tabletop objects with a ...
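The modular factoring described in the abstract (a vision module, a language module, and manipulation modules that can each be analytical or learned) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the DASH implementation: every name here (`SceneObject`, `Goal`, `run_task`, the `"grasp"` and `"stack"` skill keys, and the toy modules) is hypothetical, and the choice of "grasp" and "stack" as the two skills is merely inferred from the phrase "grasp-and-stack tasks".

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# All names below are hypothetical and do not come from the DASH codebase.


@dataclass
class SceneObject:
    name: str
    position: Tuple[float, float, float]


@dataclass
class Goal:
    source: str  # object to pick up
    target: str  # object to stack it on


# Each module is just a callable, so an analytical routine and a learned
# model can be swapped freely as long as they share the same signature.
VisionModule = Callable[[object], List[SceneObject]]
LanguageModule = Callable[[str], Goal]
Skill = Callable[[SceneObject, Dict[str, SceneObject]], bool]


def run_task(
    command: str,
    observation: object,
    vision: VisionModule,
    language: LanguageModule,
    skills: Dict[str, Skill],
) -> List[str]:
    """Chain perception, command parsing, and two skills; return a step log."""
    scene = {obj.name: obj for obj in vision(observation)}
    goal = language(command)
    log: List[str] = []
    # Assumed ordering: grasp the source object, then stack it on the target.
    for skill_name, obj_name in (("grasp", goal.source), ("stack", goal.target)):
        ok = skills[skill_name](scene[obj_name], scene)
        log.append(f"{skill_name}:{obj_name}:{'ok' if ok else 'fail'}")
        if not ok:
            break  # stop at the first failed skill
    return log


def toy_vision(observation: object) -> List[SceneObject]:
    """Stand-in for a perception module: returns a fixed two-object scene."""
    return [
        SceneObject("red box", (0.1, 0.0, 0.0)),
        SceneObject("blue box", (0.3, 0.0, 0.0)),
    ]


def toy_language(command: str) -> Goal:
    """Stand-in for a language module: parses 'put the X on the Y'."""
    match = re.fullmatch(r"put the (.+) on the (.+)", command.strip().lower())
    if match is None:
        raise ValueError(f"unsupported command: {command!r}")
    return Goal(source=match.group(1), target=match.group(2))


def always_succeed(obj: SceneObject, scene: Dict[str, SceneObject]) -> bool:
    return True


def always_fail(obj: SceneObject, scene: Dict[str, SceneObject]) -> bool:
    return False
```

Because `run_task` depends only on the callables' signatures, replacing `toy_vision` with a learned detector or `always_succeed` with a trained policy would not change the surrounding code, which is the kind of mix-and-match the abstract attributes to the modular design.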

deep learning, module, neural network, (18 more...)

Country:
- North America > United States > California (0.46)

Genre:
- Research Report (0.50)

Industry:
- Leisure & Entertainment > Games (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Cognitive Science > Simulation of Human Behavior (1.00)
  - Machine Learning
    - Learning Graphical Models > Undirected Networks
      - Markov Models (0.46)
    - Neural Networks > Deep Learning (0.68)
  - Natural Language (1.00)