Andrist, Sean
SIGMA: An Open-Source Interactive System for Mixed-Reality Task Assistance Research
Bohus, Dan, Andrist, Sean, Saw, Nick, Paradiso, Ann, Chakraborty, Ishani, Rad, Mahdi
We introduce an open-source system called SIGMA (short for "Situated Interactive Guidance, Monitoring, and Assistance") as a platform for conducting research on task-assistive agents in mixed-reality scenarios. The system leverages the sensing and rendering affordances of a head-mounted mixed-reality device in conjunction with large language and vision models to guide users step by step through procedural tasks. We present the system's core capabilities, discuss its overall design and implementation, and outline directions for future research enabled by the system. SIGMA is easily extensible and provides a useful basis for future research at the intersection of mixed reality and AI. By open-sourcing an end-to-end implementation, we aim to lower the barrier to entry, accelerate research in this space, and chart a path towards community-driven end-to-end evaluation of large language, vision, and multimodal models in the context of real-world interactive applications.
Platform for Situated Intelligence
Bohus, Dan, Andrist, Sean, Feniello, Ashley, Saw, Nick, Jalobeanu, Mihai, Sweeney, Patrick, Thompson, Anne Loomis, Horvitz, Eric
We introduce Platform for Situated Intelligence, an open-source framework created to support the rapid development and study of multimodal, integrative-AI systems. The framework provides infrastructure for sensing, fusing, and making inferences from temporal streams of data across different modalities, a set of tools that enable visualization and debugging, and an ecosystem of components that encapsulate a variety of perception and processing technologies. These assets jointly provide the means for rapidly constructing and refining multimodal, integrative-AI systems, while retaining the efficiency and performance characteristics required for deployment in open-world settings.
Accelerating the Development of Multimodal, Integrative-AI Systems with Platform for Situated Intelligence
Andrist, Sean, Bohus, Dan
We describe Platform for Situated Intelligence, an open-source framework for multimodal, integrative-AI systems. The framework provides infrastructure, tools, and components that enable and accelerate the development of applications that process multimodal streams of data and in which timing is critical. The framework is particularly well-suited for developing physically situated interactive systems that perceive and reason about their surroundings in order to better interact with people, such as social robots, virtual assistants, smart meeting rooms, etc. In this paper, we provide a brief, high-level overview of the framework and its main affordances, and discuss its implications for HRI.
Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations
Modi, Aditya, Dey, Debadeepta, Agarwal, Alekh, Swaminathan, Adith, Nushi, Besmira, Andrist, Sean, Horvitz, Eric
Assemblies of modular subsystems are being pressed into service to perform sensing, reasoning, and decision making in high-stakes, time-critical tasks in such areas as transportation, healthcare, and industrial automation. We address the opportunity to maximize the utility of an overall computing system by employing reinforcement learning to guide the configuration of the set of interacting modules that comprise the system. The challenge of doing system-wide optimization is a combinatorial problem. Local attempts to boost the performance of a specific module by modifying its configuration often lead to losses in overall utility of the system's performance as the distribution of inputs to downstream modules changes drastically. We present metareasoning techniques which consider a rich representation of the input, monitor the state of the entire pipeline, and adjust the configuration of modules on-the-fly so as to maximize the utility of a system's operation. We show significant improvement in both real-world and synthetic pipelines across a variety of reinforcement learning techniques.
Turn-Taking and Coordination in Human-Machine Interaction
Andrist, Sean (University of Wisconsin-Madison) | Bohus, Dan (Microsoft) | Mutlu, Bilge (University of Wisconsin-Madison) | Schlangen, David (Bielefeld University)
This issue of AI Magazine brings together a collection of articles on challenges, mechanisms, and research progress in turn-taking and coordination between humans and machines. The contributing authors work in interrelated fields of spoken dialog systems, intelligent virtual agents, human-computer interaction, human-robot interaction, and semiautonomous collaborative systems and explore core concepts in coordinating speech and actions with virtual agents, robots, and other autonomous systems. Several of the contributors participated in the AAAI Spring Symposium on Turn-Taking and Coordination in Human-Machine Interaction, held in March 2015, and several articles in this issue are extensions of work presented at that symposium. The articles in the collection address key modeling, methodological, and computational challenges in achieving effective coordination with machines, propose solutions that overcome these challenges under sensory, cognitive, and resource restrictions, and illustrate how such solutions can facilitate coordination across diverse and challenging domains. The contributions highlight turn-taking and coordination in human-machine interaction as an emerging and evolving research area with important implications for future applications of AI.
Reports on the 2015 AAAI Spring Symposium Series
Agarwal, Nitin (University of Arkansas at Little Rock) | Andrist, Sean (University of Wisconsin-Madison) | Bohus, Dan (Microsoft Research) | Fang, Fei (University of Southern California) | Fenstermacher, Laurie (Wright-Patterson Air Force Base) | Kagal, Lalana (Massachusetts Institute of Technology) | Kido, Takashi (Rikengenesis) | Kiekintveld, Christopher (University of Texas at El Paso) | Lawless, W. F. (Paine College) | Liu, Huan (Arizona State University) | McCallum, Andrew (University of Massachusetts) | Purohit, Hemant (Wright State University) | Seneviratne, Oshani (Massachusetts Institute of Technology) | Takadama, Keiki (University of Electro-Communications) | Taylor, Gavin (US Naval Academy)
The AAAI 2015 Spring Symposium Series was held Monday through Wednesday, March 23-25, at Stanford University near Palo Alto, California. The titles of the seven symposia were Ambient Intelligence for Health and Cognitive Enhancement, Applied Computational Game Theory, Foundations of Autonomy and Its (Cyber) Threats: From Individuals to Interdependence, Knowledge Representation and Reasoning: Integrating Symbolic and Neural Approaches, Logical Formalizations of Commonsense Reasoning, Socio-Technical Behavior Mining: From Data to Decisions, Structured Data for Humanitarian Technologies: Perfect Fit or Overkill? and Turn-Taking and Coordination in Human-Machine Interaction. The highlights of each symposium are presented in this report.