 Summers-Stay, Douglas


Towards a Holodeck-style Simulation Game

arXiv.org Artificial Intelligence

We introduce Infinitia, a simulation game system that uses generative image and language models at play time to reshape all aspects of the setting and NPCs based on a short description from the player, in a way similar to how settings are created on the fictional Holodeck. Building off the ideas of the Generative Agents paper, our system introduces gameplay elements, such as infinite generated fantasy worlds, controllability of NPC behavior, humorous dialogue, cost & time efficiency, collaboration between players and elements of non-determinism among in-game events. Infinitia is implemented in the Unity engine with a server-client architecture, facilitating the addition of exciting features by community developers in the future. Furthermore, it uses a multiplayer framework to allow humans to be present and interact in the simulation. The simulation will be available in open-alpha shortly at https://infinitia.ai/ and we are looking forward to building upon it with the community.


Representing Sets as Summed Semantic Vectors

arXiv.org Artificial Intelligence

Representing meaning in the form of high dimensional vectors is a common and powerful tool in biologically inspired architectures. While the meaning of a set of concepts can be summarized by taking a (possibly weighted) sum of their associated vectors, this has generally been treated as a one-way operation. In this paper we show how a technique built to aid sparse vector decomposition allows in many cases the exact recovery of the inputs and weights to such a sum, allowing a single vector to represent an entire set of vectors from a dictionary. We characterize the number of vectors that can be recovered under various conditions, and explore several ways such a tool can be used for vector-based reasoning.
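As a rough illustration of the idea in this abstract, the sketch below recovers the members and weights of a weighted sum of dictionary vectors. It rests on assumptions that are not in the abstract: the dictionary is random unit vectors, and the recovery step is orthogonal matching pursuit, a standard sparse-decomposition method swapped in here for illustration and not necessarily the technique the paper uses. The names `make_dictionary` and `recover_set`, the dimensions, and the example weights are all hypothetical.

```python
import numpy as np


def make_dictionary(num_words, dim, seed=0):
    """Build a hypothetical dictionary of random unit vectors, one per row."""
    rng = np.random.default_rng(seed)
    vectors = rng.standard_normal((num_words, dim))
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)


def recover_set(summed, dictionary, max_items, tol=1e-8):
    """Greedy recovery of the members and weights of a summed vector.

    Orthogonal matching pursuit: repeatedly pick the dictionary row most
    correlated with the residual, then refit all weights by least squares.
    """
    residual = summed.copy()
    chosen = []
    weights = np.zeros(0)
    for _ in range(max_items):
        if np.linalg.norm(residual) <= tol:
            break  # the sum is fully explained by the rows chosen so far
        scores = np.abs(dictionary @ residual)
        if chosen:
            scores[chosen] = 0.0  # never pick the same row twice
        chosen.append(int(np.argmax(scores)))
        basis = dictionary[chosen].T  # shape: (dim, number chosen)
        weights, *_ = np.linalg.lstsq(basis, summed, rcond=None)
        residual = summed - basis @ weights
    return {index: float(w) for index, w in zip(chosen, weights)}


# Usage: encode a small weighted set as one vector, then decode it.
dictionary = make_dictionary(num_words=1000, dim=512)
members = [3, 141, 592, 653, 897]
true_weights = [1.0, 2.0, 0.5, 1.5, 3.0]
summed = sum(w * dictionary[i] for i, w in zip(members, true_weights))
recovered = recover_set(summed, dictionary, max_items=10)
```

In this toy setting the set is small relative to the vector dimension, so the greedy search is expected to find the right members and the least-squares refit then pins down their weights. How many vectors can be recovered as the set grows is the question the paper itself characterizes.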


A Computational Theory for Life-Long Learning of Semantics

arXiv.org Artificial Intelligence

Semantic vectors are learned from data to express semantic relationships between elements of information, for the purpose of solving and informing downstream tasks. Other models exist that learn to map and classify supervised data. However, the two worlds of learning rarely interact to inform one another dynamically, whether across types of data or levels of semantics, in order to form a unified model. We explore the research problem of learning these vectors and propose a framework for learning the semantics of knowledge incrementally and online, across multiple mediums of data, via binary vectors. We discuss the aspects of this framework to spur future research on this approach and problem.


Turn-Taking in Commander-Robot Navigator Dialog (Video Abstract)

AAAI Conferences

The accompanying video captures the multi-modal data displays and speech dialogue of a human Commander (C) and a human Robot Navigator (RN) tele-operating a mobile robot (R) in a remote, previously unexplored area. We describe unique challenges for automation of turn-taking and coordination processes observed in the data.


Turn-Taking in Commander-Robot Navigator Dialog

AAAI Conferences

We seek to develop a robot that will be capable of teaming with humans to accomplish physical exploration tasks that would not otherwise be possible in dynamic, dangerous environments. For such tasks, a human commander needs to be able to communicate with a robot that moves out of sight and relays information back to the commander. What is the best way to determine how a human commander would interact in a multi-modal spoken dialog with such a robot to accomplish tasks? In this paper, we describe our initial approach to discovering a principled basis for coordinating turn-taking, perception, and navigational behavior of a robot in communication with a commander, by identifying decision phases in dialogs collected in a Wizard-of-Oz (WoZ) framework. We present two types of utterance annotation with examples applied to task-oriented dialog between a human commander and a human "robot navigator" who controls the physical robot in a realistic environment similar to expected actual conditions. We discuss core robot capabilities that bear on the robot navigator's ability to take turns while performing the task at hand, "find the building doors." The paper concludes with a brief overview of ongoing work to implement these decision phases within an open-source dialog management framework, constructing a task tree specification and dialog control logic for our application domain.