This paper argues that complex, embedded software agent systems are best constructed with parallel, layered architectures. These systems resemble Minskian Societies of Mind and Brooksian subsumption controllers for robots, and they demonstrate that complex behavior can arise from the aggregate of relatively simple interacting agents. We illustrate this principle with a distributed software agent system that controls the behavior of our laboratory's Intelligent Room.
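The layering principle the abstract describes can be sketched in a few lines. The behaviors and sensor fields below are illustrative assumptions, not taken from the paper: each simple layer proposes an action, and a higher-priority layer subsumes the ones below it.

```python
# Minimal sketch of a subsumption-style layered controller.
# Layer names and sensor keys ("range", "light_left", ...) are hypothetical.

def avoid_obstacle(sensors):
    # Highest-priority layer: back away if something is too close.
    if sensors.get("range", 10.0) < 0.5:
        return "reverse"
    return None  # defer to lower layers

def seek_light(sensors):
    # Middle layer: steer toward the brighter side, if any.
    if sensors.get("light_left", 0) > sensors.get("light_right", 0):
        return "turn_left"
    return None

def wander(sensors):
    # Default layer: always proposes something.
    return "forward"

LAYERS = [avoid_obstacle, seek_light, wander]  # priority order, highest first

def control(sensors):
    """Return the action of the highest-priority layer that fires."""
    for layer in LAYERS:
        action = layer(sensors)
        if action is not None:
            return action

print(control({"range": 0.2}))                       # avoid layer subsumes the rest
print(control({"light_left": 5, "light_right": 1}))  # seek layer fires
print(control({}))                                   # falls through to wander
```

No single layer is complex, yet the controller's overall behavior (avoid, then seek, then wander) emerges from their prioritized interaction, which is the point of the architecture.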
Farrell, Robert G. (IBM Research) | Lenchner, Jonathan (IBM Research) | Kephart, Jeffrey O. (IBM Research) | Webb, Alan M. (IBM Research) | Muller, Michael J. (IBM Research) | Erickson, Thomas D. (IBM Research) | Melville, David O. (IBM Research) | Bellamy, Rachel K.E. (IBM Research) | Gruen, Daniel M. (IBM Research) | Connell, Jonathan H. (IBM Research) | Soroker, Danny (IBM Research) | Aaron, Andy (IBM Research) | Trewin, Shari M. (IBM Research) | Ashoori, Maryam (IBM Research) | Ellis, Jason B. (IBM Research) | Gaucher, Brian P. (IBM Research) | Gil, Dario (IBM Research)
IBM Research is engaged in a research program in symbiotic cognitive computing to investigate how to embed cognitive computing in physical spaces. This article proposes five key principles of symbiotic cognitive computing. We describe how these principles are applied in a particular symbiotic cognitive computing environment and in an illustrative application.
AI researchers are interested in building intelligent machines that can interact with them as they interact with each other. Science fiction writers have given us these goals in the form of HAL in 2001: A Space Odyssey and Commander Data in Star Trek: The Next Generation. However, at present, our computers are deaf, dumb, and blind, almost unaware of the environment they are in and of the user who interacts with them. In this article, I present the current state of the art in machines that can see people, recognize them, determine their gaze, understand their facial expressions and hand gestures, and interpret their activities. I believe that building machines with such abilities for perceiving people will take us one step closer to building HAL and Commander Data.
The Visualization Space ("VizSpace") is a visual computing system created as a testbed for deviceless multimodal user interfaces. Continuous voice recognition and passive machine vision provide two channels of interaction with computer graphics imagery on a wall-sized display. Users gesture (e.g., point) and speak commands to manipulate and navigate through virtual objects and worlds. Voiced commands are combined with several types of gestures (full-body, deictic, symbolic, and iconic) to allow users to interact using only these natural human-to-human communication skills. The system is implemented on a single (high-end) IBM PC, yet provides comfortably interactive rates. It allows for rapid testing of voice/vision multimodal input and rapid prototyping of specific multimodal applications for natural interaction.

Introduction

Humans discover and understand their world through interactive visual sensations.
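The voice-plus-gesture combination the VizSpace abstract describes can be sketched as a simple fusion step. The command phrases, the `Pointing` structure, and the `fuse` function below are illustrative assumptions, not VizSpace's actual interface: a deictic gesture from the vision channel resolves a demonstrative ("that") in the speech channel.

```python
# Minimal sketch of voice/gesture fusion; all names here are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Pointing:
    """A deictic gesture reported by the vision system (normalized coords)."""
    x: float
    y: float

def fuse(utterance: str, gesture: Optional[Pointing]):
    """Combine a recognized utterance with a concurrent pointing gesture."""
    if "move that" in utterance and gesture is not None:
        # Resolve the demonstrative "that" to the pointed-at location.
        return ("move", gesture.x, gesture.y)
    if "zoom in" in utterance:
        # Purely spoken command; no gesture needed.
        return ("zoom", 2.0)
    return ("noop",)

print(fuse("move that here", Pointing(0.4, 0.7)))  # -> ('move', 0.4, 0.7)
print(fuse("zoom in", None))                       # -> ('zoom', 2.0)
```

The design choice worth noting is that neither channel alone suffices for the first command: speech supplies the verb, vision supplies the referent, which is what makes the interface deviceless yet precise.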