Cognitive Architectures
Why so gloomy? A Bayesian explanation of human pessimism bias in the multi-armed bandit task
How humans make repeated choices among options with imperfectly known reward outcomes is an important problem in psychology and neuroscience. This is often studied using multi-armed bandits, which is also frequently studied in machine learning. We present data from a human stationary bandit experiment, in which we vary the average abundance and variability of reward availability (mean and variance of the reward rate distribution). Surprisingly, we find subjects significantly underestimate prior mean of reward rates - based on their self-report on their reward expectation of non-chosen arms at the end of a game. Previously, human learning in the bandit task was found to be well captured by a Bayesian ideal learning model, the Dynamic Belief Model (DBM), albeit under an incorrect generative assumption of the temporal structure - humans assume reward rates can change over time even though they are truly fixed. We find that the "pessimism bias" in the bandit task is well captured by the prior mean of DBM when fitted to human choices; but it is poorly captured by the prior mean of the Fixed Belief Model (FBM), an alternative Bayesian model that (correctly) assumes reward rates to be constants. This pessimism bias is also incompletely captured by a simple reinforcement learning model (RL) commonly used in neuroscience and psychology, in terms of fitted initial Q-values. While it seems sub-optimal, and thus mysterious, that humans have an underestimated prior reward expectation, our simulations show that an underestimated prior mean helps to maximize long-term gain, if the observer assumes volatility when reward rates are stable, and utilizes a softmax decision policy instead of the optimal one (obtainable by dynamic programming). This raises the intriguing possibility that the brain underestimates reward rates to compensate for the incorrect non-stationarity assumption in the generative model and a simplified decision policy.
Why so gloomy? A Bayesian explanation of human pessimism bias in the multi-armed bandit task
How humans make repeated choices among options with imperfectly known reward outcomes is an important problem in psychology and neuroscience. This is often studied using multi-armed bandits, which is also frequently studied in machine learning. We present data from a human stationary bandit experiment, in which we vary the average abundance and variability of reward availability (mean and variance of the reward rate distribution). Surprisingly, we find subjects significantly underestimate prior mean of reward rates - based on their self-report on their reward expectation of non-chosen arms at the end of a game. Previously, human learning in the bandit task was found to be well captured by a Bayesian ideal learning model, the Dynamic Belief Model (DBM), albeit under an incorrect generative assumption of the temporal structure - humans assume reward rates can change over time even though they are truly fixed. We find that the "pessimism bias" in the bandit task is well captured by the prior mean of DBM when fitted to human choices; but it is poorly captured by the prior mean of the Fixed Belief Model (FBM), an alternative Bayesian model that (correctly) assumes reward rates to be constants. This pessimism bias is also incompletely captured by a simple reinforcement learning model (RL) commonly used in neuroscience and psychology, in terms of fitted initial Q-values. While it seems sub-optimal, and thus mysterious, that humans have an underestimated prior reward expectation, our simulations show that an underestimated prior mean helps to maximize long-term gain, if the observer assumes volatility when reward rates are stable, and utilizes a softmax decision policy instead of the optimal one (obtainable by dynamic programming). This raises the intriguing possibility that the brain underestimates reward rates to compensate for the incorrect non-stationarity assumption in the generative model and a simplified decision policy.
major comments below. Thanks to Reviewer # 1 and # 4 for pointing out that behavioral work in cognitive science suggests that people indeed
Thank you all for your helpful comments on our Comp Neuro paper. If the results of Figure 1 are indicative, this could further improve the results. The supervised training phase is depicted in the somewhat busy Fig. S2. While we disagree with Reviewer #2's opinion that the connection between neural regression and GPs is completely
A Machine With Human-Like Memory Systems
Kim, Taewoon, Cochez, Michael, Francois-Lavet, Vincent, Neerincx, Mark, Vossen, Piek
Inspired by the cognitive science theory, we explicitly model an agent with both semantic and episodic memory systems, and show that it is better than having just one of the two memory systems. In order to show this, we have designed and released our own challenging environment, "the Room", compatible with OpenAI Gym, where an agent has to properly learn how to encode, store, and retrieve memories to maximize its rewards. The Room environment allows for a hybrid intelligence setup where machines and humans can collaborate. We show that two agents collaborating with each other results in better performance than one agent acting alone.
AIhub monthly digest: July 2024 – attending RoboCup, real-world simulators, and AI and cognitive science
Welcome to our monthly digest, where you can catch up with any AIhub stories you may have missed, peruse the latest news, recap recent events, and more. This month, we take a trip to RoboCup2024, see what the International Conference on Machine Learning had in store, and learn about interactive real-world simulators. RoboCup is an international scientific initiative with the goal of advancing the state of the art of intelligent robots. As part of this initiative, a series of competitions and meetings are held throughout the year. The showcase event is an international affair with teams travelling from far and wide to put their machines through their paces.
Qualitative Event Perception: Leveraging Spatiotemporal Episodic Memory for Learning Combat in a Strategy Game
Hancock, Will, Forbus, Kenneth D.
Event perception refers to people's ability to carve up continuous experience into meaningful discrete events. We speak of finishing our morning coffee, mowing the lawn, leaving work, etc. as singular occurrences that are localized in time and space. In this work, we analyze how spatiotemporal representations can be used to automatically segment continuous experience into structured episodes, and how these descriptions can be used for analogical learning. These representations are based on Hayes' notion of histories and build upon existing work on qualitative episodic memory. Our agent automatically generates event descriptions of military battles in a strategy game and improves its gameplay by learning from this experience. Episodes are segmented based on changing properties in the world and we show evidence that they facilitate learning because they capture event descriptions at a useful spatiotemporal grain size. This is evaluated through our agent's performance in the game. We also show empirical evidence that the perception of spatial extent of episodes affects both their temporal duration as well as the number of overall cases generated.
Knowledge Management in the Companion Cognitive Architecture
Nakos, Constantine, Forbus, Kenneth D.
One of the fundamental aspects of cognitive architectures is their ability to encode and manipulate knowledge. Without a consistent, well-designed, and scalable knowledge management scheme, an architecture will be unable to move past toy problems and tackle the broader problems of cognition. In this paper, we document some of the challenges we have faced in developing the knowledge stack for the Companion cognitive architecture and discuss the tools, representations, and practices we have developed to overcome them. We also lay out a series of potential next steps that will allow Companion agents to play a greater role in managing their own knowledge. It is our hope that these observations will prove useful to other cognitive architecture developers facing similar challenges.
Interview with Yuan Yang: working at the intersection of AI and cognitive science
In this interview series, we're meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In this latest interview, we hear from Yuan Yang, who completed his PhD in May. This autumn, Yuan will be joining the College of Information, Mechanical and Electrical Engineering, Shanghai Normal University as an associate professor. From August 2018 to May 2024, I did my PhD in the computer science department at Vanderbilt University, which is located in the famous music city – Nashville, Tennessee.
Metacognitive AI: Framework and the Case for a Neurosymbolic Approach
Wei, Hua, Shakarian, Paulo, Lebiere, Christian, Draper, Bruce, Krishnaswamy, Nikhil, Nirenburg, Sergei
Metacognition is the concept of reasoning about an agent's own internal processes and was originally introduced in the field of developmental psychology. In this position paper, we examine the concept of applying metacognition to artificial intelligence. We introduce a framework for understanding metacognitive artificial intelligence (AI) that we call TRAP: transparency, reasoning, adaptation, and perception. We discuss each of these aspects in-turn and explore how neurosymbolic AI (NSAI) can be leveraged to address challenges of metacognition.
From Manifestations to Cognitive Architectures: a Scalable Framework
Ibias, Alfredo, Ramirez-Miranda, Guillem, Guinovart, Enric, Alarcon, Eduard
The Artificial Intelligence field is flooded with optimisation methods. In this paper, we change the focus to developing modelling methods with the aim of getting us closer to Artificial General Intelligence. To do so, we propose a novel way to interpret reality as an information source, that is later translated into a computational framework able to capture and represent such information. This framework is able to build elements of classical cognitive architectures, like Long Term Memory and Working Memory, starting from a simple primitive that only processes Spatial Distributed Representations. Moreover, it achieves such level of verticality in a seamless scalable hierarchical way.