Plotting

 Media


Constructionist Design Methodology for Interactive Intelligences

AI Magazine

We present a methodology for designing and implementing interactive intelligences. The constructionist design methodology (CDM) -- so called because it advocates modular building blocks and incorporation of prior work -- addresses factors that we see as key to future advances in AI, including support for interdisciplinary collaboration, coordination of teams, and large-scale systems integration. We test the methodology by building an interactive multifunctional system with a real-time perception-action loop. The system, whose construction relied entirely on the methodology, consists of an embodied virtual agent that can perceive both real and virtual objects in an augmented-reality room and interact with a user through coordinated gestures and speech. Wireless tracking technologies give the agent awareness of the environment and the user's speech and communicative acts. User and agent can communicate about things in the environment, their placement, and their function, as well as about more abstract topics, such as current news, through situated multimodal dialogue. The results demonstrate the CDM's strength in simplifying the modeling of complex, multifunctional systems that require architectural experimentation and exploration of unclear subsystem boundaries, undefined variables, and tangled data flow and control hierarchies.


Beating Common Sense into Interactive Applications

AI Magazine

A long-standing dream of artificial intelligence has been to put commonsense knowledge into computers -- enabling machines to reason about everyday life. Some projects, such as Cyc, have begun to amass large collections of such knowledge. However, it is widely assumed that the use of common sense in interactive applications will remain impractical for years, until these collections can be considered sufficiently complete and commonsense reasoning sufficiently robust. Recently, at the Massachusetts Institute of Technology's Media Laboratory, we have had some success in applying commonsense knowledge in a number of intelligent interface agents, despite the admittedly spotty coverage and unreliable inference of today's commonsense knowledge systems. This article surveys several of these applications and reflects on interface design principles that enable successful use of commonsense knowledge.


On Prediction Using Variable Order Markov Models

Journal of Artificial Intelligence Research

This paper is concerned with algorithms for prediction of discrete sequences over a finite alphabet, using variable order Markov models. The class of such algorithms is large and in principle includes any lossless compression algorithm. We focus on six prominent prediction algorithms, including Context Tree Weighting (CTW), Prediction by Partial Match (PPM) and Probabilistic Suffix Trees (PSTs). We discuss the properties of these algorithms and compare their performance using real-life sequences from three domains: proteins, English text and music pieces. The comparison is made with respect to prediction quality as measured by the average log-loss. We also compare classification algorithms based on these predictors with respect to a number of large protein classification tasks. Our results indicate that a "decomposed" CTW (a variant of the CTW algorithm) and PPM outperform all other algorithms in sequence prediction tasks. Somewhat surprisingly, a different algorithm, which is a modification of the Lempel-Ziv compression algorithm, significantly outperforms all algorithms on the protein classification problems.
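As a rough illustration of the evaluation metric named above, the following sketch computes the average log-loss (in bits per symbol) of a toy predictor. The predictor is a simple back-off scheme chosen for brevity: it is not CTW, PPM, or a PST, and the class name, `max_order` parameter, and add-one smoothing are all assumptions made for this example.

```python
import math
from collections import defaultdict


class BackoffMarkovPredictor:
    """Toy variable-order predictor (hypothetical, for illustration only).

    It predicts from the longest context, up to max_order symbols, that was
    seen in training, and applies add-one smoothing over the alphabet.
    """

    def __init__(self, alphabet, max_order=3):
        self.alphabet = sorted(set(alphabet))
        self.max_order = max_order
        # counts[context][symbol] = number of times symbol followed context
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequence):
        for i, symbol in enumerate(sequence):
            for order in range(self.max_order + 1):
                if i - order < 0:
                    break
                context = sequence[i - order:i]
                self.counts[context][symbol] += 1

    def prob(self, symbol, history):
        # Back off from the longest usable context down to the empty one.
        for order in range(min(self.max_order, len(history)), -1, -1):
            context = history[len(history) - order:]
            if context in self.counts:
                seen = self.counts[context]
                total = sum(seen.values())
                return (seen.get(symbol, 0) + 1) / (total + len(self.alphabet))
        # Nothing has been observed yet: fall back to a uniform guess.
        return 1.0 / len(self.alphabet)


def average_log_loss(model, sequence):
    """Return -(1/T) * sum_t log2 P(x_t | x_1 .. x_{t-1})."""
    total = 0.0
    for i, symbol in enumerate(sequence):
        total -= math.log2(model.prob(symbol, sequence[:i]))
    return total / len(sequence)


if __name__ == "__main__":
    model = BackoffMarkovPredictor("ab", max_order=3)
    model.train("ab" * 50)
    print(average_log_loss(model, "abab"))
```

A lower average log-loss means the predictor assigned higher probability to the symbols that actually occurred; an untrained model over a two-symbol alphabet scores exactly one bit per symbol.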


Say Cheese! Experiences with a Robot Photographer

AI Magazine

We have developed an autonomous robot system that takes well-composed photographs of people at social events, such as weddings and conference receptions. In this article, we outline the overall architecture of the system and describe how the various components interrelate. We also describe our experiences deploying the robot photographer at a number of real-world events.


Say Cheese! Experiences with a Robot Photographer

AI Magazine

This model makes system debugging significantly easier, because we know exactly what each sensor reading is at every point in the computation, something that would not be the case if we were reading from the sensors every time a reading was used in a calculation. This model also allows us to inject modified sensor readings into the system, as described in the next section.

We introduced a sensor abstraction layer to separate the task layer from concerns about physical sensing devices. We process the sensor information (from the laser rangefinder in this application) into distance measurements from the center of the robot, thus allowing consideration of sensor error models and performance
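The excerpt describes a layer that converts laser rangefinder readings into distances from the robot's center, holds readings fixed during a computation, and accepts injected readings. A minimal sketch of that idea follows. The class names, the `(angle, range)` reading format, and the `laser_offset` parameter are assumptions made for illustration and are not taken from the article.

```python
import math
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple

Reading = Tuple[float, float]  # (angle in radians, range in meters)


@dataclass(frozen=True)
class SensorSnapshot:
    """Immutable (bearing, distance) pairs measured from the robot center."""
    readings: Tuple[Reading, ...]


class SensorAbstractionLayer:
    """Hypothetical layer between the task code and the physical laser."""

    def __init__(self, read_laser: Callable[[], Iterable[Reading]],
                 laser_offset: Tuple[float, float] = (0.0, 0.0)):
        self._read_laser = read_laser
        self._offset = laser_offset  # assumed laser position in the robot frame
        self._snapshot = SensorSnapshot(())

    def update(self, injected: Optional[Iterable[Reading]] = None) -> SensorSnapshot:
        """Take one snapshot; use injected readings instead of hardware if given."""
        raw = injected if injected is not None else self._read_laser()
        ox, oy = self._offset
        converted = []
        for angle, rng in raw:
            # Express the hit point in the robot-centered frame.
            x = ox + rng * math.cos(angle)
            y = oy + rng * math.sin(angle)
            converted.append((math.atan2(y, x), math.hypot(x, y)))
        self._snapshot = SensorSnapshot(tuple(converted))
        return self._snapshot

    @property
    def snapshot(self) -> SensorSnapshot:
        """Task code reads this; it stays fixed until the next update()."""
        return self._snapshot


if __name__ == "__main__":
    layer = SensorAbstractionLayer(lambda: [(0.0, 1.0)], laser_offset=(0.1, 0.0))
    print(layer.update().readings)
```

Because task code only ever sees the stored snapshot, every calculation within one cycle uses the same values, and a test harness can pass `injected` readings without touching the hardware.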


AI in the News

AI Magazine

This eclectic keepsake provides a sampling of what can be found (with links to the full [...] the articles were initially available online and without charge, few things that good last forever; and (4) the AI in the News collection--updated, hyperlinked, and archived--can be found by going to www.aaai.org/aitopics/html/current.html. [...] June 10, "In the war on terror, University [...] professor Robin Murphy finds herself a [...]

[...] was initially inspired by science fiction, [...] the movie may influence a new generation [...] inventions were predicted by those sort of writers. In terms of the capabilities that we get in modern computers, they could see some of that. What I find so interesting is that we start with these ideas which [...] about robots programmed to think on [...]

"[iRobot Chairman Helen] Greiner believes 'One [...] She said the R2D2 robot's humanlike [...] She went on to MIT where she earned undergraduate and graduate degrees in mechanical engineering, electrical engineering and computer science. 'It takes all three (disciplines) and they must all come together in robotics,' [...] Breazeal of the Massachusetts Institute of [...] New Jersey.


A Comprehensive Trainable Error Model for Sung Music Queries

Journal of Artificial Intelligence Research

We propose a model for errors in sung queries, a variant of the hidden Markov model (HMM). This is a solution to the problem of identifying the degree of similarity between a (typically error-laden) sung query and a potential target in a database of musical works, an important problem in the field of music information retrieval. Similarity metrics are a critical component of 'query-by-humming' (QBH) applications which search audio and multimedia databases for strong matches to oral queries. Our model comprehensively expresses the types of error or variation between target and query: cumulative and non-cumulative local errors, transposition, tempo and tempo changes, insertions, deletions and modulation. The model is not only expressive, but automatically trainable, or able to learn and generalize from query examples. We present results of simulations, designed to assess the discriminatory potential of the model, and tests with real sung queries, to demonstrate relevance to real-world applications.


Real Time Voice Processing with Audiovisual Feedback: Toward Autonomous Agents with Perfect Pitch

Neural Information Processing Systems

We have implemented a real-time front end for detecting voiced speech and estimating its fundamental frequency. The front end performs the signal processing for voice-driven agents that attend to the pitch contours of human speech and provide continuous audiovisual feedback. The algorithm we use for pitch tracking has several distinguishing features: it makes no use of FFTs or autocorrelation at the pitch period; it updates the pitch incrementally on a sample-by-sample basis; it avoids peak picking and does not require interpolation in time or frequency to obtain high-resolution estimates; and it works reliably over a four-octave range, in real time, without the need for postprocessing to produce smooth contours. The algorithm is based on two simple ideas in neural computation: the introduction of a purposeful nonlinearity, and the error signal of a least-squares fit.
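To make the idea of a least-squares fit and its error signal concrete, here is a generic textbook sketch, not the authors' incremental algorithm and without their nonlinearity. It relies on the identity that a sampled sinusoid satisfies s[n-1] + s[n+1] = 2 cos(w) s[n], solves for cos(w) by least squares, and reports the relative residual as a measure of how sinusoidal the signal is. The function name and return convention are assumptions made for this example.

```python
import math


def fit_sinusoid_frequency(samples, sample_rate):
    """Estimate the frequency of a near-sinusoidal signal by least squares.

    Returns (frequency_hz, relative_error). The relative error is close to
    zero for a pure sinusoid and grows as the signal departs from one.
    """
    num = 0.0
    den = 0.0
    for n in range(1, len(samples) - 1):
        num += samples[n] * (samples[n - 1] + samples[n + 1])
        den += samples[n] * samples[n]
    if den == 0.0:
        raise ValueError("signal has no energy")

    # Least-squares solution of s[n-1] + s[n+1] = 2*c*s[n] for c = cos(w).
    c = max(-1.0, min(1.0, num / (2.0 * den)))

    # Residual of the fit, normalized by the energy of the left-hand side.
    err = 0.0
    norm = 0.0
    for n in range(1, len(samples) - 1):
        lhs = samples[n - 1] + samples[n + 1]
        err += (lhs - 2.0 * c * samples[n]) ** 2
        norm += lhs * lhs
    relative_error = err / norm if norm > 0.0 else 0.0

    frequency_hz = math.acos(c) * sample_rate / (2.0 * math.pi)
    return frequency_hz, relative_error


if __name__ == "__main__":
    rate = 8000
    tone = [math.sin(2.0 * math.pi * 200.0 * n / rate) for n in range(400)]
    print(fit_sinusoid_frequency(tone, rate))
```

Note that this fit needs no FFT, autocorrelation, or peak picking, and the two sums could be maintained as running totals updated once per sample.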


"Name That Song!" A Probabilistic Approach to Querying on Music and Text

Neural Information Processing Systems

We present a novel, flexible statistical approach for modelling music and text jointly. The approach is based on multi-modal mixture models and maximum a posteriori estimation using EM. The learned models can be used to browse databases with documents containing music and text, to search for music using queries consisting of music and text (lyrics and other contextual information), to annotate text documents with music, and to automatically recommend or identify similar songs.