 cognitively


Simulating Society Requires Simulating Thought

arXiv.org Artificial Intelligence

Simulating society with large language models (LLMs), we argue, requires more than generating plausible behavior; it demands cognitively grounded reasoning that is structured, revisable, and traceable. LLM-based agents are increasingly used to emulate individual and group behavior, primarily through prompting and supervised fine-tuning. Yet current simulations remain grounded in a behaviorist "demographics in, behavior out" paradigm, focusing on surface-level plausibility. As a result, they often lack internal coherence, causal reasoning, and belief traceability, making them unreliable for modeling how people reason, deliberate, and respond to interventions. To address this, we present a conceptual modeling paradigm, Generative Minds (GenMinds), which draws from cognitive science to support structured belief representations in generative agents. To evaluate such agents, we introduce the RECAP (REconstructing CAusal Paths) framework, a benchmark designed to assess reasoning fidelity via causal traceability, demographic grounding, and intervention consistency. These contributions advance a broader shift: from surface-level mimicry to generative agents that simulate thought, not just language, for social simulations.


Speech-Based Cognitive Screening: A Systematic Evaluation of LLM Adaptation Strategies

arXiv.org Artificial Intelligence

Over half of US adults with Alzheimer disease and related dementias remain undiagnosed, and speech-based screening offers a scalable detection approach. We compared large language model adaptation strategies for dementia detection, evaluating nine text-only models and three multimodal audio-text models on recordings from the DementiaBank speech corpus. Adaptations included in-context learning with different demonstration selection policies, reasoning-augmented prompting, parameter-efficient fine-tuning, and multimodal integration. Results showed that class-centroid demonstrations achieved the highest in-context learning performance, reasoning improved smaller models, and token-level fine-tuning generally produced the best scores. Adding a classification head substantially improved underperforming models. Among multimodal models, fine-tuned audio-text systems performed well but did not surpass the top text-only models. These findings highlight that model adaptation strategies, including demonstration selection, reasoning design, and tuning method, critically influence speech-based dementia detection, and that properly adapted open-weight models can match or exceed commercial systems.


Cognitive Chain-of-Thought: Structured Multimodal Reasoning about Social Situations

arXiv.org Artificial Intelligence

Chain-of-Thought (CoT) prompting helps models think step by step. But what happens when they must see, understand, and judge, all at once? In visual tasks grounded in social context, where bridging perception with norm-grounded judgments is essential, flat CoT often breaks down. We introduce Cognitive Chain-of-Thought (CoCoT), a prompting strategy that scaffolds VLM reasoning through three cognitively inspired stages: perception, situation, and norm. Our experiments show that, across multiple multimodal benchmarks (including intent disambiguation, commonsense reasoning, and safety), CoCoT consistently outperforms CoT and direct prompting (+8% on average). Our findings demonstrate that cognitively grounded reasoning stages enhance interpretability and social awareness in VLMs, paving the way for safer and more reliable multimodal systems.


TACTIC: Translation Agents with Cognitive-Theoretic Interactive Collaboration

arXiv.org Artificial Intelligence

Machine translation has long been a central task in natural language processing. With the rapid advancement of large language models (LLMs), there has been remarkable progress in translation quality. However, fully realizing the translation potential of LLMs remains an open challenge. Recent studies have explored multi-agent systems to decompose complex translation tasks into collaborative subtasks, showing initial promise in enhancing translation quality through agent cooperation and specialization. Nevertheless, existing multi-agent translation frameworks largely neglect foundational insights from cognitive translation studies. These insights emphasize how human translators employ different cognitive strategies, such as balancing literal and free translation, refining expressions based on context, and iteratively evaluating outputs. To address this limitation, we propose a cognitively informed multi-agent framework called TACTIC, which stands for Translation Agents with Cognitive-Theoretic Interactive Collaboration. The framework comprises six functionally distinct agents that mirror key cognitive processes observed in human translation behavior. These include agents for drafting, refinement, evaluation, scoring, context reasoning, and external knowledge gathering. By simulating an interactive and theory-grounded translation workflow, TACTIC effectively leverages the full capacity of LLMs for high-quality translation. Experimental results on diverse language pairs from the FLORES-200 and WMT24 benchmarks show that our method consistently achieves state-of-the-art performance. Using DeepSeek-V3 as the base model, TACTIC surpasses GPT-4.1 by an average of +0.6 XCOMET and +1.18 COMETKIWI-23. Compared to DeepSeek-R1, it further improves by +0.84 XCOMET and +2.99 COMETKIWI-23. Code is available at https://github.com/weiyali126/TACTIC.


Automated Extraction of Spatio-Semantic Graphs for Identifying Cognitive Impairment

arXiv.org Artificial Intelligence

Existing methods for analyzing linguistic content from picture descriptions for assessment of cognitive-linguistic impairment often overlook the participant's visual narrative path, which typically requires eye tracking to assess. Spatio-semantic graphs are a useful tool for analyzing this narrative path from transcripts alone; however, they are limited by the need for manual tagging of content information units (CIUs). In this paper, we propose an automated approach for estimation of spatio-semantic graphs (via automated extraction of CIUs) from the Cookie Theft picture commonly used in cognitive-linguistic analyses. The method enables the automatic characterization of the visual semantic path during picture description. Experiments demonstrate that the automatic spatio-semantic graphs effectively differentiate between cognitively impaired and unimpaired speakers. Statistical analyses reveal that the features derived by the automated method produce comparable results to the manual method, with even greater group differences between clinical groups of interest. These results highlight the potential of the automated approach for extracting spatio-semantic features in developing clinical speech models for cognitive impairment assessment.


Cognitively Inspired Components for Social Conversational Agents

arXiv.org Artificial Intelligence

Current conversational agents (CAs) have seen improvement in conversational quality in recent years due to the influence of large language models (LLMs) like GPT-3. However, two key categories of problems remain. First, there are the unique technical problems resulting from the approach taken in creating the CA, such as scope with retrieval agents and the often nonsensical answers of former generative agents. Second, humans perceive CAs as social actors, and as a result expect the CA to adhere to social convention. Failure on the part of the CA in this respect can lead to a poor interaction and even the perception of threat by the user. As such, this paper presents a survey highlighting a potential solution to both categories of problems through the introduction of cognitively inspired additions to the CA. Through computational facsimiles of semantic and episodic memory, emotion, working memory, and the ability to learn, it is possible to address both the technical and social problems encountered by CAs.


Cognitive Design for Artificial Minds

#artificialintelligence

Cognitive Design for Artificial Minds explains the crucial role that human cognition research plays in the design and realization of artificial intelligence systems, illustrating the steps necessary for the design of artificial models of cognition. It bridges the gap between the theoretical, experimental, and technological issues addressed in the context of AI of cognitive inspiration and computational cognitive science. Beginning with an overview of the historical, methodological, and technical issues in the field of cognitively inspired artificial intelligence, Lieto illustrates how the cognitive design approach has an important role to play in the development of intelligent AI technologies and plausible computational models of cognition. Introducing a unique perspective that draws upon Cybernetics and early AI principles, Lieto emphasizes the need for an equivalence between cognitive processes and implemented AI procedures, in order to realize biologically and cognitively inspired artificial minds. He also introduces the Minimal Cognitive Grid, a pragmatic method to rank the different degrees of biological and cognitive accuracy of artificial systems in order to project and predict their explanatory power with respect to the natural systems taken as a source of inspiration. Providing a comprehensive overview of cognitive design principles in constructing artificial minds, this text will be essential reading for students and researchers of artificial intelligence and cognitive science.


Five minute AI test could diagnose Alzheimer's up to 15 years early

#artificialintelligence

The NHS has introduced a revolutionary new app to help diagnose Alzheimer's Disease. It takes only five minutes to complete and is more accurate than established pen-and-paper tests. The test is currently done on iPads at a general practice or hospital ward but it could soon be conducted at home on a smart phone – paving the way for the nation's first widespread screening programme for Alzheimer's and other forms of dementia within the next few years. It is hoped it will identify people at high risk of developing the disease up to 15 years before symptoms appear, so that steps can be taken to slow its progression. The test uses artificial intelligence to assess a person's brain function by showing them large numbers of black and white photographs and asking them to identify which ones contain an animal.


Adolescents with autism may engage neural control systems differently, study finds: Researchers used brain scans to measure proactive and reactive executive control

#artificialintelligence

Executive control difficulties are common in individuals with autism and are associated with challenges completing tasks and managing time. The study, published in Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, sought to tease out whether these difficulties represent a disruption in proactive executive control (engaged and maintained before a cognitively demanding event) or in reactive executive control (engaged as the event occurs). Using functional magnetic resonance imaging (fMRI), the researchers took brain scans of 141 adolescents and young adults ages 12-22 (64 with autism, 77 neurotypical controls) enrolled in the Cognitive Control in Autism Study. During the scan, the participants completed a task that required them to adapt their behavior. They were shown a green or red cue, followed by a white arrow (probe) pointing left or right.


Cognitively Aided Zero-Shot Automatic Essay Grading

arXiv.org Artificial Intelligence

Automatic essay grading (AEG) is a process in which machines assign a grade to an essay written in response to a topic, called the prompt. Zero-shot AEG is when we train a system to grade essays written to a new prompt which was not present in our training data. In this paper, we describe a solution to the problem of zero-shot automatic essay grading, using cognitive information, in the form of gaze behaviour. Our experiments show that using gaze behaviour helps in improving the performance of AEG systems, especially when we provide a new essay written in response to a new prompt for scoring, by an average of almost 5 percentage points of QWK.
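
For readers unfamiliar with the metric, QWK (quadratic weighted kappa) measures agreement between two sets of ordinal grades, penalizing each disagreement by the squared distance between the grades. The following is a minimal sketch of the standard definition, not the authors' evaluation code; the function name, its parameters, and the example grades are assumptions made for illustration.

```python
def quadratic_weighted_kappa(grades_a, grades_b, min_grade, max_grade):
    """Standard QWK between two equal-length lists of integer grades.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("grade lists must be non-empty and equal in length")
    k = max_grade - min_grade + 1  # number of distinct grade levels
    if k < 2:
        raise ValueError("need at least two grade levels")
    n = len(grades_a)

    # Observed confusion matrix: observed[i][j] counts essays graded i by A and j by B.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(grades_a, grades_b):
        observed[a - min_grade][b - min_grade] += 1

    # Marginal histograms for each grader.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement penalty
            weighted_observed += weight * observed[i][j]
            # Expected count if the two graders were independent.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n

    if weighted_expected == 0.0:
        return 1.0  # both graders used a single identical grade throughout
    return 1.0 - weighted_observed / weighted_expected


# Made-up grades on a 1-3 scale.
human = [1, 2, 3, 2, 1]
system = [1, 2, 3, 3, 1]
print(quadratic_weighted_kappa(human, system, 1, 3))
```

A value of 1 indicates perfect agreement, 0 indicates chance-level agreement, and negative values indicate systematic disagreement. If QWK is reported on this scale, a gain of 5 percentage points would correspond to an increase of about 0.05.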