Information Technology
An Asynchronous Hidden Markov Model for Audio-Visual Speech Recognition
Hidden Markov models (HMMs) are very well suited to handling discrete or continuous sequences of varying sizes. Moreover, an efficient training algorithm (EM) is available, as well as an efficient decoding algorithm (Viterbi), which provides the optimal sequence of states (and the corresponding sequence of high-level events) associated with a given sequence of low-level data. On the other hand, multimodal information processing is currently a very challenging framework, with applications including multimodal person authentication, multimodal speech recognition, multimodal event analyzers, etc. In that framework, the same sequence of events is represented not by a single sequence of data but by a series of sequences of data, each of them possibly coming from a different modality: video streams with various viewpoints, audio stream(s), etc. One such task, which will be presented in this paper, is multimodal speech recognition using both a microphone and a camera recording a speaker simultaneously while he or she speaks.
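As a rough illustration of the Viterbi decoding step mentioned in the abstract above, the following is a minimal sketch for a plain, single-stream HMM with discrete observations. It is not the asynchronous model the paper proposes. The function name `viterbi`, the toy health/fever model, and all probabilities are assumptions chosen for illustration only.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_path) for a discrete-observation HMM.

    All probabilities are passed as natural logarithms to avoid underflow.
    """
    if not observations:
        raise ValueError("observations must be non-empty")

    # delta[s]: best log-probability of any state path ending in s so far.
    delta = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []  # one dict per time step after the first

    for obs in observations[1:]:
        new_delta, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximises the path score into s.
            best_prev = max(states, key=lambda p: delta[p] + log_trans[p][s])
            new_delta[s] = delta[best_prev] + log_trans[best_prev][s] + log_emit[s][obs]
            pointers[s] = best_prev
        delta = new_delta
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return delta[last], path


# Hypothetical toy model (illustrative numbers, not taken from the paper).
states = ["Healthy", "Fever"]
start = {"Healthy": 0.6, "Fever": 0.4}
trans = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
         "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
        "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}

log_start = {s: math.log(p) for s, p in start.items()}
log_trans = {s: {t: math.log(p) for t, p in row.items()} for s, row in trans.items()}
log_emit = {s: {o: math.log(p) for o, p in row.items()} for s, row in emit.items()}

log_prob, path = viterbi(["normal", "cold", "dizzy"], states,
                         log_start, log_trans, log_emit)
print(path, math.exp(log_prob))
```

The sketch works in log space so that long sequences do not underflow, and its cost grows linearly with the sequence length and quadratically with the number of states.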
Application of Variational Bayesian Approach to Speech Recognition
Watanabe, Shinji, Minami, Yasuhiro, Nakamura, Atsushi, Ueda, Naonori
NTT Communication Science Laboratories, NTT Corporation, 2-4, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan. {watanabe,minami,ats,ueda}@cslab.kecl.ntt.co.jp. Abstract: In this paper, we propose a Bayesian framework that constructs shared-state triphone HMMs based on a variational Bayesian approach and recognizes speech based on Bayesian prediction classification: variational Bayesian estimation and clustering for speech recognition (VBEC). An appropriate model structure with high recognition performance can be found within the VBEC framework. Unlike conventional methods, such as the BIC or MDL criterion based on the maximum likelihood approach, the proposed model selection is valid in principle even when there are insufficient amounts of data, because it does not use an asymptotic assumption. In acoustic modeling, the triphone-based hidden Markov model (triphone HMM) has been widely employed. The triphone is a context-dependent phoneme unit that considers both the preceding and following phonemes.
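For context on the asymptotic assumption mentioned in the abstract above, the BIC in its standard textbook form can be written as follows. The symbols are assumptions introduced here, not the paper's notation: $O$ is the training data, $m$ a candidate model structure, $\hat{\theta}_m$ its maximum likelihood parameters, $k_m$ its number of free parameters, and $N$ the number of training samples.

```latex
\mathrm{BIC}(m) = \ln p(O \mid \hat{\theta}_m, m) - \frac{k_m}{2} \ln N
```

The penalty term comes from a large-sample approximation to the log marginal likelihood $\ln p(O \mid m)$, so the criterion is only justified as $N$ grows large. This is the limitation the abstract contrasts with the variational Bayesian approach.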
IJCAI-03 Conference Highlights
This summer's AI conference in Acapulco offered attendees a wide variety of program choices as well as ample time to catch up with friends and colleagues. For many, scheduling time was probably the biggest challenge because the conference included numerous invited speakers, 189 technical paper presentations, 93 posters, a Mobile Robot Competition, 19 Innovative Applications of AI (IAAI) award-winning paper presentations, a Trading Agents Competition, a special track on AI and the web, and the vendor exhibit.
Qualitative Spatial Reasoning: Extracting and Reasoning with Spatial Aggregates
Bailey-Kellogg, Chris, Zhao, Feng
Reasoning about spatial data is a key task in many applications, including geographic information systems, meteorological and fluid-flow analysis, computer-aided design, and protein structure databases. Such applications often require the identification and manipulation of qualitative spatial representations, for example, to detect whether one object will soon occlude another in a digital image or efficiently determine relationships between a proposed road and wetland regions in a geographic data set. Qualitative spatial reasoning (QSR) provides representational primitives (a spatial "vocabulary") and inference mechanisms for these tasks. This article first reviews representative work on QSR for data-poor scenarios, where the goal is to design representations that can answer qualitative queries without much numeric information. It then turns to the data-rich case, where the goal is to derive and manipulate qualitative spatial representations that efficiently and correctly abstract important spatial aspects of the underlying data for use in subsequent tasks. This article focuses on how a particular QSR system, SPATIAL AGGREGATION, can help answer spatial queries for scientific and engineering data sets. A case study application of weather analysis illustrates the effective representation and reasoning supported by both data-poor and data-rich forms of QSR.
The Process Specification Language (PSL): Theory and Applications
Gruninger, Michael, Menzel, Christopher
The PROCESS SPECIFICATION LANGUAGE (PSL) has been designed to facilitate correct and complete exchange of process information among manufacturing systems, such as scheduling, process modeling, process planning, production planning, simulation, project management, work flow, and business-process reengineering. We give an overview of the theories within the PSL ontology, discuss some of the design principles for the ontology, and finish with examples of process specifications that are based on the ontology.
2003 AAAI Spring Symposium Series
Abecker, Andreas, Antonsson, Erik K., Callaway, Charles B., Dignum, Virginia, Doherty, Patrick, Elst, Ludger van, Freed, Michael, Freedman, Reva, Guesgen, Hans, Jones, Gareth, Koza, John, Kortenkamp, David, Maybury, Mark, McCarthy, John, Mitra, Debasis, Renz, Jochen, Schreckenghost, Debra, Williams, Mary-Anne
The Association for the Advancement of Artificial Intelligence, in cooperation with Stanford University's Department of Computer Science, presented the 2003 Spring Symposium Series, Monday through Wednesday, 24-26 March 2003, at Stanford University. The titles of the eight symposia were Agent-Mediated Knowledge Management, Computational Synthesis: From Basic Building Blocks to High-Level Functions, Foundations and Applications of Spatiotemporal Reasoning (FASTR), Human Interaction with Autonomous Systems in Complex Environments, Intelligent Multimedia Knowledge Management, Logical Formalization of Commonsense Reasoning, Natural Language Generation in Spoken and Written Dialogue, and New Directions in Question-Answering.
Ontologies for Corporate Web Applications
Obrst, Leo, Liu, Howard, Wray, Robert
In this article, we discuss some issues that arise when ontologies are used to support corporate application domains such as electronic commerce (e-commerce) and some technical problems in deploying ontologies for real-world use. In particular, we focus on issues of ontology integration and the related problem of semantic mapping, that is, the mapping of ontologies and taxonomies to reference ontologies to preserve semantics. Along the way, we discuss what typically constitutes an ontology architecture. We situate the discussion in the domain of business-to-business (B2B) e-commerce. By its very nature, B2B e-commerce must try to interlink buyers and sellers from multiple companies with disparate product-description terminologies and meanings, thus serving as a paradigmatic case for the use of ontologies to support corporate applications.
A Framework for the Development of Personalized, Distributed Web-Based Configuration Systems
Ardissono, Liliana, Felfernig, Alexander, Friedrich, Gerhard, Goy, Anna, Jannach, Dietmar, Petrone, Giovanna, Schafer, Ralph, Zanker, Markus
For the last two decades, configuration systems relying on AI techniques have successfully been applied in industrial environments. These systems support the configuration of complex products and services in shorter time with fewer errors and, therefore, reduce the costs of a mass-customization business model. The European Union-funded project entitled CUSTOMER-ADAPTIVE WEB INTERFACE FOR THE CONFIGURATION OF PRODUCTS AND SERVICES WITH MULTIPLE SUPPLIERS (CAWICOMS) aims at the next generation of web-based configuration applications that cope with two challenges of today's open, networked economy: (1) the support for heterogeneous user groups in an open-market environment and (2) the integration of configurable subproducts provided by specialized suppliers. This article describes the CAWICOMS WORKBENCH for the development of configuration services, offering personalized user interaction as well as distributed configuration of products and services in a supply chain. The developed tools and techniques rely on a harmonized knowledge representation and knowledge-acquisition mechanism, open XML-based protocols, and advanced personalization and distributed reasoning techniques. We exploited the workbench based on the real-world business scenario of distributed configuration of services in the domain of information processing-based virtual private networks.
GRACE: An Autonomous Robot for the AAAI Robot Challenge
Simmons, Reid, Goldberg, Dani, Goode, Adam, Montemerlo, Michael, Roy, Nicholas, Sellner, Brennan, Urmson, Chris, Schultz, Alan, Abramson, Myriam, Adams, William, Atrash, Amin, Bugajska, Magda, Coblenz, Michael, MacMahon, Matt, Perzanowski, Dennis, Horswill, Ian, Zubek, Robert, Kortenkamp, David, Wolfe, Bryn, Milam, Tod, Maxwell, Bruce
In an attempt to solve as much of the AAAI Robot Challenge as possible, five research institutions representing academia, industry, and government integrated their research into a single robot named GRACE. This article describes this first-year effort by the GRACE team, including not only the various techniques each participant brought to GRACE but also the difficult integration effort itself.