A Testbed for Learning by Demonstration from Natural Language and RGB-Depth Video

Song, Young Chol (University of Rochester) | Kautz, Henry (University of Rochester)

Jul-21-2012–AAAI Conferences

We are developing a testbed for learning by demonstration combining spoken language and sensor data in a natural real-world environment. Microsoft Kinect RGB-Depth cameras allow us to infer high-level visual features, such as the relative position of objects in space, with greater precision and less training than required by traditional systems. Speech is recognized and parsed using a “deep” parsing system, so that language features are available at the word, syntactic, and semantic levels. We collected an initial data set of 10 episodes of 7 individuals demonstrating how to “make tea”, and created a “gold standard” hand annotation of the actions performed in each. Finally, we are constructing “baseline” HMM-based activity recognition models using the visual and language features, in order to be ready to evaluate the performance of our future work on deeper and more structured models.
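The abstract does not say how the baseline HMMs are parameterized, so the following is only a minimal sketch of how an HMM can recognize activities: Viterbi decoding recovers the most likely sequence of actions from a sequence of discretized observations. The state names, observation symbols, and all probabilities below are hypothetical illustrations, not values from the paper.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for a discrete HMM."""

    def log(p):
        # Work in log space to avoid underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical two-action model of part of a tea-making episode.
states = ["fill_kettle", "pour_water"]
start_p = {"fill_kettle": 0.8, "pour_water": 0.2}
trans_p = {
    "fill_kettle": {"fill_kettle": 0.7, "pour_water": 0.3},
    "pour_water": {"fill_kettle": 0.1, "pour_water": 0.9},
}
# Observation symbols stand in for discretized visual features
# (for example, the relative position of two tracked objects).
emit_p = {
    "fill_kettle": {"kettle_near_faucet": 0.9, "kettle_near_cup": 0.1},
    "pour_water": {"kettle_near_faucet": 0.2, "kettle_near_cup": 0.8},
}
observations = [
    "kettle_near_faucet",
    "kettle_near_faucet",
    "kettle_near_cup",
    "kettle_near_cup",
]

decoded = viterbi(observations, states, start_p, trans_p, emit_p)
print(decoded)
```

In a setup like the one described, language features (words or semantic labels from the parser) could be added as further observation symbols alongside the visual ones; how the authors combine the two is not stated in the abstract.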

artificial intelligence, machine learning, natural language

Country:
- North America > United States > New York > Monroe County > Rochester (0.05)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Grammars & Parsing (0.70)
  - Machine Learning > Learning Graphical Models
    - Undirected Networks > Markov Models (0.48)
