Written instructions are a common way of teaching people how to accomplish tasks on the web. However, studies have shown that written instructions are difficult to follow, even for experienced users. A system that understands human-written instructions could guide users through the process of following the directions, improving completion rates and enhancing the user experience. While general natural language understanding is extremely difficult, we believe that in the limited domain of how-to instructions it should be possible to understand enough to provide guided help in a mixed-initiative environment. Based on a qualitative analysis of instructions gathered for 43 web-based tasks, we have formalized the problem of understanding and interpreting how-to instructions. We compare three different approaches to interpreting instructions: a keyword-based interpreter, a grammar-based interpreter, and an interpreter based on machine learning and information extraction. Our empirical results demonstrate the feasibility of automated how-to instruction understanding.
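To make the comparison concrete, the simplest of the three approaches can be sketched as a keyword-based interpreter: scan an instruction for an action keyword and treat the remainder as the target phrase. This is an illustrative toy, not the paper's actual implementation; the keyword table and action names are assumptions.

```python
# Hypothetical sketch of a keyword-based instruction interpreter:
# it looks for an action keyword in the instruction and maps it,
# together with the trailing phrase, to a candidate operation.

ACTION_KEYWORDS = {
    "click": "CLICK",
    "press": "CLICK",
    "type": "ENTER_TEXT",
    "enter": "ENTER_TEXT",
    "select": "SELECT",
    "choose": "SELECT",
}

def interpret(instruction: str) -> dict:
    """Return a crude (action, target) interpretation of one instruction."""
    tokens = instruction.lower().split()
    for i, tok in enumerate(tokens):
        word = tok.strip(".,\"'")
        if word in ACTION_KEYWORDS:
            # Everything after the keyword is treated as the target phrase.
            target = " ".join(t.strip(".,\"'") for t in tokens[i + 1:])
            return {"action": ACTION_KEYWORDS[word], "target": target}
    return {"action": "UNKNOWN", "target": instruction}
```

Such an interpreter is brittle, which is precisely why the abstract contrasts it with grammar-based and machine-learning approaches: it cannot resolve ambiguity or handle instructions that paraphrase the action.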
The ability to understand natural-language instructions is critical to building intelligent agents that interact with humans. We present a system that learns to transform natural-language navigation instructions into executable formal plans. Given no prior linguistic knowledge, the system learns by simply observing how humans follow navigation instructions. The system is evaluated in three complex virtual indoor environments with numerous objects and landmarks. A previously collected realistic corpus of complex English navigation instructions for these environments is used for training and testing data. By using a learned lexicon to refine inferred plans and a supervised learner to induce a semantic parser, the system is able to automatically learn to correctly interpret a reasonable fraction of the complex instructions in this corpus.
An instruction is adequate if its action(s) and objects are identified sufficiently and unambiguously, given the instruction's context. For instance, the instruction Turn the knob would be inadequate if, in the context, more than one knob or more than one way of turning a knob were salient. However, even if the knob and the manner of turning were uniquely identifiable, the instruction could still be inadequate, since it does not tell the reader when to stop turning the knob. What is missing here is the termination information for the action, that is, when the performance of the action is to end. Conveying such information in automated text generation is the focus of my research.
Lego has developed a set of building instructions for the visually impaired, in an attempt to make the experience "more accessible". The guides use AI technology to translate visual information into voice-commanded and Braille instructions. The Lego Audio & Braille Building Instructions were inspired by Matthew Shifrin, who was born blind. While Shifrin always enjoyed playing with the building blocks, he needed assistance when it came to specific instructions. His friend, Lilya, wrote down the building steps so that he could upload them into a system which allowed him to read the instructions on a Braille reader.
Machine translation (MT) was one of the first applications of artificial intelligence technology that was deployed to solve real-world problems. Since the early 1960s, researchers have been building and utilizing computer systems that can translate from one language to another without requiring extensive human intervention. In the late 1990s, Ford Vehicle Operations began working with Systran Software Inc. to adapt and customize its machine-translation technology in order to translate Ford's vehicle assembly build instructions from English to German, Spanish, Dutch, and Portuguese. The use of machine translation was made necessary by the vast amount of dynamic information that needed to be translated in a timely fashion. The assembly build instructions at Ford contain text written in a controlled language as well as unstructured remarks and comments.