AITopics | Kennington, Casey

Incremental Dialogue Management: Survey, Discussion, and Implications for HRI

Kennington, Casey, Lison, Pierre, Schlangen, David

arXiv.org Artificial Intelligence, Jan-1-2025

Efforts towards endowing robots with the ability to speak have benefited from recent advancements in NLP, in particular large language models. However, as powerful as current models have become, they still operate on sentence or multi-sentence level input, not on the word-by-word input that humans operate on, affecting the degree of responsiveness that they offer, which is critical in situations where humans interact with robots using speech. In this paper, we review the literature on interactive systems that operate incrementally (i.e., at the word level or below it). We motivate the need for incremental systems and survey incremental modeling of important aspects of dialogue, such as speech recognition and language generation. The primary focus is on the part of the system that makes decisions, known as the dialogue manager. We find that there is very little research on incremental dialogue management, offer some requirements for practical incremental dialogue management, and discuss the implications of incremental dialogue for embodied, robotic platforms.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv: 2501.00953

Country:

Asia (1.00)
North America > United States (0.46)
Europe > Germany (0.28)
Europe > United Kingdom > England (0.14)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)

Renaissance: Investigating the Pretraining of Vision-Language Encoders

Fields, Clayton, Kennington, Casey

arXiv.org Artificial Intelligence, Nov-10-2024

In the past several years there has been an explosion of available models for vision-language tasks. Unfortunately, the literature still leaves open a number of questions related to best practices in designing and training such models. In this paper we seek to answer several questions related to the pretraining of vision-language encoders through meta-analysis. In our first set of experiments, we show that we can save significant compute at no cost to downstream performance by freezing large parts of vision-language models during pretraining. In our second set of experiments, we examine the effect of basing a VL transformer on a vision model versus a text model. Additionally, we introduce a VL modeling platform called Renaissance that we use to conduct all of the experiments. This program offers a great deal of flexibility in creating, training, and evaluating transformer encoders for VL modeling. The source code for Renaissance can be found at https://github.com/bsu-slim/renaissance.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2411.06657

Country: North America (0.46)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Unsupervised, Bottom-up Category Discovery for Symbol Grounding with a Curious Robot

Henry, Catherine, Kennington, Casey

arXiv.org Artificial Intelligence, Apr-3-2024

Towards addressing the Symbol Grounding Problem and motivated by early childhood language development, we leverage a robot which has been equipped with an approximate model of curiosity, with particular focus on bottom-up building of unsupervised categories grounded in the physical world. That is, rather than starting with a top-down symbol (e.g., a word referring to an object) and providing meaning through the application of predetermined samples, the robot autonomously and gradually breaks up its exploration space into a series of increasingly specific unlabeled categories, at which point an external expert may optionally provide a symbol association. We extend prior work by using a robot that can observe the visual world, introducing a higher dimensional sensory space, and using a more generalizable method of category building. Our experiments show that the robot learns categories based on actions and what it visually observes, and that those categories can be symbolically grounded.

artificial intelligence, category, machine learning, (18 more...)

arXiv: 2404.03092

Country:

North America > United States > Idaho (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community

Kennington, Casey, Alikhani, Malihe, Pon-Barry, Heather, Atwell, Katherine, Bisk, Yonatan, Fried, Daniel, Gervits, Felix, Han, Zhao, Inan, Mert, Johnston, Michael, Korpan, Raj, Litman, Diane, Marge, Matthew, Matuszek, Cynthia, Mead, Ross, Mohan, Shiwali, Mooney, Raymond, Parde, Natalie, Sinapov, Jivko, Stewart, Angela, Stone, Matthew, Tellex, Stefanie, Williams, Tom

arXiv.org Artificial Intelligence, Apr-1-2024

The ability to interact with machines using natural human language is becoming not just commonplace, but expected. The next step is not just text interfaces, but speech interfaces and not just with computers, but with all machines including robots. In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots and offer the community three proposals, the first focused on education, the second on benchmarks, and the third on the modeling of language when it comes to spoken interaction with robots. The three proposals should act as white papers for any researcher to take and build upon.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv: 2404.01158

Country:

North America > United States > New York (0.14)
North America > United States > Maryland (0.14)
North America > United States > Illinois (0.14)

Genre:

Instructional Material > Course Syllabus & Notes (0.68)
Research Report (0.64)

Industry: Education > Curriculum (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Understanding Survey Paper Taxonomy about Large Language Models via Graph Representation Learning

Zhuang, Jun, Kennington, Casey

arXiv.org Artificial Intelligence, Feb-15-2024

As new research on Large Language Models (LLMs) continues, it is difficult to keep up with new research and models. To help researchers synthesize the new research, many have written survey papers, but even those have become numerous. In this paper, we develop a method to automatically assign survey papers to a taxonomy. We collect the metadata of 144 LLM survey papers and explore three paradigms to classify papers within the taxonomy. Our work indicates that leveraging graph structure information on co-category graphs can significantly outperform the language models in two paradigms: fine-tuning of pre-trained language models and zero-shot/few-shot classification using LLMs. We find that our model surpasses an average human recognition level and that fine-tuning LLMs using weak labels generated by a smaller model, such as the GCN in this study, can be more effective than using ground-truth labels, revealing the potential of weak-to-strong generalization in the taxonomy classification task.

large language model, machine learning, natural language, (19 more...)

arXiv: 2402.10409

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

On the Computational Modeling of Meaning: Embodied Cognition Intertwined with Emotion

Kennington, Casey

arXiv.org Artificial Intelligence, Jul-12-2023

"How can machines understand language?" is a question that many have asked, and it represents an important facet of artificial intelligence. Large language models like ChatGPT seem to understand language, but as has been pointed out (Bender and Koller, 2020; Bisk et al., 2020), even large, powerful language models trained on huge amounts of data are likely missing key information to allow them to reach the depth of understanding that humans have. What information are they missing, and, perhaps more importantly, what information do they have that enables them to understand, to the degree that they do? Current computational models of semantic meaning can be broken down into three paradigms: distributional paradigms, where meaning is derived from how words are used in text (i.e., the notion that the meaning of a word depends on the "company it keeps," following Firth (1957)); grounded paradigms, where the meaningfulness of language lies in the fact that it is about the world (Dahlgren, 1976) and aspects of the physical world are linked to language (i.e., the symbol grounding problem following Harnad (1990)); and formal paradigms, where meaning is a logical form (e.g., first order logic as in L.T.F.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv: 2307.04518

Country:

Europe > United Kingdom > England (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Vision Language Transformers: A Survey

Fields, Clayton, Kennington, Casey

arXiv.org Artificial Intelligence, Jul-6-2023

Vision language tasks, such as answering questions about an image or generating captions that describe it, are difficult for computers to perform. A relatively recent body of research has adapted the pretrained transformer architecture introduced by Vaswani et al. (2017) to vision language modeling. Transformer models have greatly improved performance and versatility over previous vision language models. They do so by pretraining models on large generic datasets and transferring their learning to new tasks with minor changes in architecture and parameter values. This type of transfer learning has become the standard modeling practice in both natural language processing and computer vision. Vision language transformers offer the promise of producing similar advancements in tasks which require both vision and language. In this paper, we provide a broad synthesis of the currently available research on vision language transformer models and offer some analysis of their strengths, limitations, and some open questions that remain.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv: 2307.03254

Country: North America > United States (0.68)

Genre:

Research Report (1.00)
Overview (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Who's in Charge? Roles and Responsibilities of Decision-Making Components in Conversational Robots

Lison, Pierre, Kennington, Casey

arXiv.org Artificial Intelligence, Mar-15-2023

Software architectures for conversational robots typically consist of multiple modules, each designed for a particular processing task or functionality. Some of these modules are developed for the purpose of making decisions about the next action that the robot ought to perform in the current context. Those actions may relate to physical movements, such as driving forward or grasping an object, but may also correspond to communicative acts, such as asking a question to the human user. In this position paper, we reflect on the organization of those decision modules in human-robot interaction platforms. We discuss the relative benefits and limitations of modular vs. end-to-end architectures, and argue that, despite the increasing popularity of end-to-end approaches, modular architectures remain preferable when developing conversational robots designed to execute complex tasks in collaboration with human users. We also show that most practical HRI architectures tend to be either robot-centric or dialogue-centric, depending on where developers wish to place the "command center" of their system. While those design choices may be justified in some application domains, they also limit the robot's ability to flexibly interleave physical movements and conversational behaviours. We contend that architectures placing "action managers" and "interaction managers" on an equal footing may provide the best path forward for future human-robot interaction systems.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv: 2303.0847

Country: Europe (0.68)

Genre: Overview (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Evaluating Automatic Speech Recognition in an Incremental Setting

Whetten, Ryan, Imtiaz, Mir Tahsin, Kennington, Casey

arXiv.org Artificial Intelligence, Feb-23-2023

The increasing reliability of automatic speech recognition has led to its proliferation in everyday use. However, for research purposes, it is often unclear which model one should choose for a task, particularly if there is a requirement for speed as well as accuracy. In this paper, we systematically evaluate six speech recognizers on English test data using metrics including word error rate, latency, and the number of updates to already recognized words. We also propose and compare two methods for streaming audio into recognizers for incremental recognition. We further propose Revokes per Second as a new metric for evaluating incremental recognition and demonstrate that it provides insights into overall model performance. We find that, generally, local recognizers are faster and require fewer updates than cloud-based recognizers. Finally, we find Meta's Wav2Vec model to be the fastest, and find Mozilla's DeepSpeech model to be the most stable in its predictions.
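
As a rough illustration of two of the metrics named in this abstract, the sketch below computes word error rate with a standard word-level edit distance and counts how many already-output words are withdrawn across a sequence of partial hypotheses. The revoke-counting rule (any word outside the longest common prefix of two consecutive partial outputs counts as revoked) and the function names are assumptions made for this example; the paper's exact definition of Revokes per Second may differ.

```python
from typing import List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the previous ref prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1] / len(ref)


def count_revoked_words(partials: List[str]) -> int:
    """Count words withdrawn between consecutive partial hypotheses.

    Assumption (not taken from the paper): a word is revoked when it lies
    outside the longest common prefix of two consecutive partial outputs.
    """
    revoked = 0
    for before, after in zip(partials, partials[1:]):
        b, a = before.split(), after.split()
        common = 0
        while common < min(len(b), len(a)) and b[common] == a[common]:
            common += 1
        revoked += len(b) - common
    return revoked


def revokes_per_second(partials: List[str], audio_seconds: float) -> float:
    """Hypothetical rate: revoked words divided by the audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return count_revoked_words(partials) / audio_seconds
```

For example, the partial outputs "four", "forty", "forty two" contain one revoke under this rule, because "four" is replaced by "forty" before the recognizer extends its output.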

machine learning, natural language, prediction, (15 more...)

arXiv: 2302.12049

Country: Europe (0.69)

Genre: Research Report (0.50)

Industry: Information Technology > Services (0.68)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)

Conversational Agents and Children: Let Children Learn

Kennington, Casey, Fails, Jerry Alan, Wright, Katherine Landau, Pera, Maria Soledad

arXiv.org Artificial Intelligence, Feb-23-2023

Using online information discovery as a case study, in this position paper we discuss the need to design, develop, and deploy (conversational) agents that can -- non-intrusively -- guide children in their quest for online resources rather than simply finding resources for them. We argue that agents should "let children learn" and should be built to take on a teacher-facilitator function, allowing children to develop their technical and critical thinking abilities as they interact with varied technology in a broad range of use cases.

artificial intelligence, chatbot, natural language, (14 more...)

arXiv: 2302.12043

Country: North America > United States (0.52)

Genre: Research Report (0.40)

Industry: Education > Educational Setting (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.90)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.76)
