Goto

Collaborating Authors

 pustejovsky


Computational Thought Experiments for a More Rigorous Philosophy and Science of the Mind

arXiv.org Artificial Intelligence

We offer philosophical motivations for a method we call Virtual World Cognitive Science (VW CogSci), in which researchers use virtual embodied agents that are embedded in virtual worlds to explore questions in the field of Cognitive Science. We focus on questions about mental and linguistic representation and the ways that such computational modeling can add rigor to philosophical thought experiments, as well as the terminology used in the scientific study of such representations. We find that this method forces researchers to take a god's-eye view when describing dynamical relationships between entities in minds and entities in an environment in a way that eliminates the need for problematic talk of belief and concept types, such as the belief that cats are silly, and the concept CAT, while preserving belief and concept tokens in individual cognizers' minds. We conclude with some further key advantages of VW CogSci for the scientific study of mental and linguistic representation and for Cognitive Science more broadly.


Exploring Affordance and Situated Meaning in Image Captions: A Multimodal Analysis

arXiv.org Artificial Intelligence

This paper explores the grounding issue regarding multimodal semantic representation from a computational cognitive-linguistic view. We annotate images from the Flickr30k dataset with five perceptual properties: Affordance, Perceptual Salience, Object Number, Gaze Cueing, and Ecological Niche Association (ENA), and examine their association with textual elements in the image captions. Our findings reveal that images with Gibsonian affordance show a higher frequency of captions containing 'holding-verbs' and 'container-nouns' compared to images displaying telic affordance. Perceptual Salience, Object Number, and ENA are also associated with the choice of linguistic expressions. Our study demonstrates that comprehensive understanding of objects or events requires cognitive attention, semantic nuances in language, and integration across multiple modalities. We highlight the vital importance of situated meaning and affordance grounding in natural language understanding, with the potential to advance human-like interpretation in various scenarios.


Dense Paraphrasing for Textual Enrichment

arXiv.org Artificial Intelligence

Understanding inferences and answering questions from text requires more than merely recovering surface arguments, adjuncts, or strings associated with the query terms. As humans, we interpret sentences as contextualized components of a narrative or discourse, by both filling in missing information, and reasoning about event consequences. In this paper, we define the process of rewriting a textual expression (lexeme or phrase) such that it reduces ambiguity while also making explicit the underlying semantics that is not (necessarily) expressed in the economy of sentence structure as Dense Paraphrasing (DP). We build the first complete DP dataset, provide the scope and design of the annotation task, and present results demonstrating how this DP process can enrich a source text to improve inferencing and QA task performance. The data and the source code will be publicly available.


Extracting Temporal Event Relation with Syntactic-Guided Temporal Graph Transformer

arXiv.org Artificial Intelligence

Extracting temporal relations (e.g., before, after, concurrent) among events is crucial to natural language understanding. Previous studies mainly rely on neural networks to learn effective features or manual-crafted linguistic features for temporal relation extraction, which usually fail when the context between two events is complex or wide. Inspired by the examination of available temporal relation annotations and human-like cognitive procedures, we propose a new Temporal Graph Transformer network to (1) explicitly find the connection between two events from a syntactic graph constructed from one or two continuous sentences, and (2) automatically locate the most indicative temporal cues from the path of the two event mentions as well as their surrounding concepts in the syntactic graph with a new temporal-oriented attention mechanism. Experiments on MATRES and TB-Dense datasets show that our approach significantly outperforms previous state-of-the-art methods on both end-to-end temporal relation extraction and temporal relation classification.


Hallmarks of Human-Machine Collaboration: A framework for assessment in the DARPA Communicating with Computers Program

arXiv.org Artificial Intelligence

There is a growing desire to create computer systems that can communicate effectively to collaborate with humans on complex, open-ended activities. Assessing these systems presents significant challenges. We describe a framework for evaluating systems engaged in open-ended complex scenarios where evaluators do not have the luxury of comparing performance to a single right answer. This framework has been used to evaluate human-machine creative collaborations across story and music generation, interactive block building, and exploration of molecular mechanisms in cancer. These activities are fundamentally different from the more constrained tasks performed by most contemporary personal assistants as they are generally open-ended, with no single correct solution, and often no obvious completion criteria. We identified the Key Properties that must be exhibited by successful systems. From there we identified "Hallmarks" of success -- capabilities and features that evaluators can observe that would be indicative of progress toward achieving a Key Property. In addition to being a framework for assessment, the Key Properties and Hallmarks are intended to serve as goals in guiding research direction.


Neurosymbolic AI for Situated Language Understanding

arXiv.org Artificial Intelligence

In recent years, data-intensive AI, particularly the domain of natural language processing and understanding, has seen significant progress driven by the advent of large datasets and deep neural networks that have sidelined more classic AI approaches to the field. These systems can apparently demonstrate sophisticated linguistic understanding or generation capabilities, but often fail to transfer their skills to situations they have not encountered before. We argue that computational situated grounding provides a solution to some of these learning challenges by creating situational representations that both serve as a formal model of the salient phenomena, and contain rich amounts of exploitable, task-appropriate data for training new, flexible computational models. Our model reincorporates some ideas of classic AI into a framework of neurosymbolic intelligence, using multimodal contextual modeling of interactive situations, events, and object properties. We discuss how situated grounding provides diverse data and multiple levels of modeling for a variety of AI learning challenges, including learning how to interact with object affordances, learning semantics for novel structures and configurations, and transferring such learned knowledge to new objects and situations.


Q&A: Artificial intelligence and the classroom of the future

#artificialintelligence

Imagine a classroom in the future where teachers are working alongside artificial intelligence partners to ensure no student gets left behind. The AI partner's careful monitoring picks up on a student in the back who has been quiet and still for the whole class and the AI partner prompts the teacher to engage the student. When called on, the student asks a question. The teacher clarifies the material that has been presented and every student comes away with a better understanding of the lesson. This is part of a larger vision of future classrooms where human instruction and AI technology interact to improve educational environments and the learning experience.


Artificial intelligence and the classroom of the future

#artificialintelligence

Imagine a classroom in the future where teachers are working alongside artificial intelligence partners to ensure no student gets left behind. The AI partner’s careful monitoring picks up on a student in the back who has been quiet and still for the whole class and the AI partner prompts the teacher to engage the student. When called on, the student asks a question. The teacher clarifies the material that has been presented and every student comes away with a better understanding of the lesson. This is part of a larger vision of future classrooms where human instruction and AI technology interact to improve educational environments and the learning experience. James Pustejovsky, the TJX Feldberg Professor of Computer Science, is working towards that vision with a team led by the University of Colorado Boulder, as part of the new $20 million National Science Foundation-funded AI Institute for Student-AI Teaming. The research will play a critical role in helping ensure the AI agent is a natural partner in the classroom, with language and vision capabilities, allowing it to not only hear what the teacher and each student is saying, but also notice gestures (pointing, shrugs, shaking a head), eye gaze, and facial expressions (student attitudes and emotions).


A Formal Analysis of Multimodal Referring Strategies Under Common Ground

arXiv.org Artificial Intelligence

In this paper, we present an analysis of computationally generated mixed-modality definite referring expressions using combinations of gesture and linguistic descriptions. In doing so, we expose some striking formal semantic properties of the interactions between gesture and language, conditioned on the introduction of content into the common ground between the (computational) speaker and (human) viewer, and demonstrate how these formal features can contribute to training better models to predict viewer judgment of referring expressions, and potentially to the generation of more natural and informative referring expressions.


A Survey on Temporal Reasoning for Temporal Information Extraction from Text

Journal of Artificial Intelligence Research

Time is deeply woven into how people perceive, and communicate about the world. Almost unconsciously, we provide our language utterances with temporal cues, like verb tenses, and we can hardly produce sentences without such cues. Extracting temporal cues from text, and constructing a global temporal view about the order of described events is a major challenge of automatic natural language understanding. Temporal reasoning, the process of combining different temporal cues into a coherent temporal view, plays a central role in temporal information extraction. This article presents a comprehensive survey of the research from the past decades on temporal reasoning for automatic temporal information extraction from text, providing a case study on how combining symbolic reasoning with machine learning-based information extraction systems can improve performance. It gives a clear overview of the used methodologies for temporal reasoning, and explains how temporal reasoning can be, and has been successfully integrated into temporal information extraction systems. Based on the distillation of existing work, this survey also suggests currently unexplored research areas. We argue that the level of temporal reasoning that current systems use is still incomplete for the full task of temporal information extraction, and that a deeper understanding of how the various types of temporal information can be integrated into temporal reasoning is required to drive future research in this area.