Kirk, James R.
Acquiring Grounded Representations of Words with Situated Interactive Instruction
Mohan, Shiwali, Mininger, Aaron H., Kirk, James R., Laird, John E.
We present an approach for acquiring grounded representations of words from mixed-initiative, situated interactions with a human instructor. The work focuses on the acquisition of diverse types of knowledge including perceptual, semantic, and procedural knowledge along with learning grounded meanings. Interactive learning allows the agent to control its learning by requesting instructions about unknown concepts, making learning efficient. Our approach has been instantiated in Soar and has been evaluated on a table-top robotic arm capable of manipulating small objects.
Eliciting Problem Specifications via Large Language Models
Wray, Robert E., Kirk, James R., Laird, John E.
Cognitive systems generally require a human to translate a problem definition into some specification that the cognitive system can use to attempt to solve the problem or perform the task. In this paper, we illustrate that large language models (LLMs) can be utilized to map a problem class, defined in natural language, into a semi-formal specification that can then be utilized by an existing reasoning and learning system to solve instances from the problem class. We present the design of LLM-enabled cognitive task analyst agent(s). Implemented with LLM agents, this system produces a definition of problem spaces for tasks specified in natural language. LLM prompts are derived from the definition of problem spaces in the AI literature and general problem-solving strategies (Polya's How to Solve It). A cognitive system can then use the problem-space specification, applying domain-general problem solving strategies ("weak methods" such as search), to solve multiple instances of problems from the problem class. This result, while preliminary, suggests the potential for speeding cognitive systems research via disintermediation of problem formulation while also retaining core capabilities of cognitive systems, such as robust inference and online learning.
Exploiting Language Models as a Source of Knowledge for Cognitive Agents
Kirk, James R., Wray, Robert E., Laird, John E.
Large language models (LLMs) provide capabilities far beyond sentence completion, including question answering, summarization, and natural-language inference. While many of these capabilities have potential application to cognitive systems, our research is exploiting language models as a source of task knowledge for cognitive agents, that is, agents realized via a cognitive architecture. We identify challenges and opportunities for using language models as an external knowledge source for cognitive systems and possible ways to improve the effectiveness of knowledge extraction by integrating extraction with cognitive architecture capabilities, highlighting with examples from our recent work in this area.
Improving Knowledge Extraction from LLMs for Task Learning through Agent Analysis
Kirk, James R., Wray, Robert E., Lindes, Peter
Large language models (LLMs) offer significant promise as a knowledge source for task learning. Prompt engineering has been shown to be effective for eliciting knowledge from an LLM, but alone it is insufficient for acquiring relevant, situationally grounded knowledge for an embodied agent learning novel tasks. We describe a cognitive-agent approach that extends and complements prompt engineering, mitigating its limitations and thus enabling an agent to acquire new task knowledge matched to its native language capabilities, embodiment, environment, and user preferences. The approach is to increase the response space of LLMs and deploy general strategies, embedded within the autonomous agent, to evaluate, repair, and select among candidate responses produced by the LLM. We describe the approach and experiments that show how an agent, by retrieving and evaluating a breadth of responses from the LLM, can achieve 77-94% task completion in one-shot learning without user oversight. The approach achieves 100% task completion when human oversight (such as an indication of preference) is provided. Further, the type of oversight largely shifts from explicit, natural language instruction to simple confirmation/discomfirmation of high-quality responses that have been vetted by the agent before presentation to a user.
Integrating Diverse Knowledge Sources for Online One-shot Learning of Novel Tasks
Kirk, James R., Wray, Robert E., Lindes, Peter, Laird, John E.
Autonomous agents are able to draw on a wide variety of potential sources of task knowledge; however current approaches invariably focus on only one or two. Here we investigate the challenges and impact of exploiting diverse knowledge sources to learn online, in one-shot, new tasks for a simulated office mobile robot. The resulting agent, developed in the Soar cognitive architecture, uses the following sources of domain and task knowledge: interaction with the environment, task execution and search knowledge, human natural language instruction, and responses retrieved from a large language model (GPT-3). We explore the distinct contributions of these knowledge sources and evaluate the performance of different combinations in terms of learning correct task knowledge and human workload. Results show that an agent's online integration of diverse knowledge sources improves one-shot task learning overall, reducing human feedback needed for rapid and reliable task learning.
Improving Language Model Prompting in Support of Semi-autonomous Task Learning
Kirk, James R., Wray, Robert E., Lindes, Peter, Laird, John E.
Large language models (LLMs) offer a potential source of knowledge for agents that need to acquire new task competencies within a performance environment. We describe efforts toward a novel agent capability that can construct cues (or "prompts") that result in useful LLM responses for an agent learning a new task. Importantly, responses must not only be "reasonable" (a measure used commonly in research on knowledge extraction from LLMs) but also must be specific to the agent's task context and in a form that the agent can interpret given its native language capacities. We summarize a series of empirical investigations of agent prompting strategies and evaluate LLM responses against the goals of targeted and actionable responses for task learning. Our results demonstrate that actionable task knowledge can be obtained from LLMs in support of online agent task learning.