AITopics | kahneman

Collaborating Authors

kahneman

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Capturing Large Language Human Cognitive Biases

Neural Information Processing SystemsFeb-8-2026, 19:45:00 GMT

Theproblemcontainsa prompt (blue), acanonicalsolutiontotheprompt (green), andafewtest-cases (black).

large language model, machine learning, simulation of human behavior, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.50)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

Emergent Cognitive Convergence via Implementation: A Structured Loop Reflecting Four Theories of Mind

Kim, Myung Ho

arXiv.org Artificial IntelligenceNov-13-2025

We report a structural convergence among four influential theories of mind: Kahneman's dual-system theory, Friston's predictive processing, Minsky's society of mind, and Clark's extended mind, emerging unintentionally within a practical AI architecture known as Agentic Flow. Designed to address the limitations of large language models (LLMs), Agentic Flow comprises five interlocking modules: Retrieval, Cognition, Control, Memory, and Action, organized into a repeatable cognitive loop. Although originally inspired only by Minsky and Clark, subsequent analysis revealed that its structure echoes computational motifs from all four theories, suggesting that theoretical convergence can emerge naturally from implementation demands rather than deliberate synthesis. Controlled evaluations confirmed this: the structured agent achieved 95.8% task success versus 62.3% for baseline LLMs, demonstrating robust constraint adherence and reproducible reasoning. We describe this convergence under a broader descriptive meta-architecture called PEACE, highlighting recurring design patterns such as predictive modeling, associative recall, and error-sensitive control. Later formalized as the Structured Cognitive Loop (SCL), this framework generalizes the same principles as a foundation for behavioral intelligence in LLM-based agents. Rather than claiming theoretical unification, this paper proposes that intelligent architectures may evolve toward shared structural patterns shaped by practical constraints. As a position paper, it aims to frame this convergence as an interpretive reflection rather than a finalized theory, inviting further theoretical and experimental dialogue. Agentic Flow, or equivalently the Structured Cognitive Loop, thus offers a glimpse of how a unified cognitive form can arise not from abstraction, but from the necessities of real-world reasoning.

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2507.16184

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (0.71)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Add feedback

The Bias is in the Details: An Assessment of Cognitive Bias in LLMs

Knipper, R. Alexander, Knipper, Charles S., Zhang, Kaiqi, Sims, Valerie, Bowers, Clint, Karmaker, Santu

arXiv.org Artificial IntelligenceSep-30-2025

As Large Language Models (LLMs) are increasingly embedded in real-world decision-making processes, it becomes crucial to examine the extent to which they exhibit cognitive biases. Extensively studied in the field of psychology, cognitive biases appear as systematic distortions commonly observed in human judgments. This paper presents a large-scale evaluation of eight well-established cognitive biases across 45 LLMs, analyzing over 2.8 million LLM responses generated through controlled prompt variations. To achieve this, we introduce a novel evaluation framework based on multiple-choice tasks, hand-curate a dataset of 220 decision scenarios targeting fundamental cognitive biases in collaboration with psychologists, and propose a scalable approach for generating diverse prompts from human-authored scenario templates. Our analysis shows that LLMs exhibit bias-consistent behavior in 17.8-57.3% of instances across a range of judgment and decision-making contexts targeting anchoring, availability, confirmation, framing, interpretation, overattribution, prospect theory, and representativeness biases. We find that both model size and prompt specificity play a significant role on bias susceptibility as follows: larger size (>32B parameters) can reduce bias in 39.5% of cases, while higher prompt detail reduces most biases by up to 14.9%, except in one case (Overattribution), which is exacerbated by up to 8.8%.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.22856

Country:

North America > United States (1.00)
Asia (1.00)
Europe (0.67)

Genre: Research Report (0.81)

Industry:

Health & Medicine > Therapeutic Area (0.68)
Transportation > Air (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Cognitive Decision Routing in Large Language Models: When to Think Fast, When to Think Slow

Du, Y., Guo, C., Wang, W., Tang, G.

arXiv.org Artificial IntelligenceAug-26-2025

Large Language Models (LLMs) face a fundamental challenge in deciding when to rely on rapid, intuitive responses versus engaging in slower, more deliberate reasoning. Inspired by Daniel Kahneman's dual-process theory and his insights on human cognitive biases, we propose a novel Cognitive Decision Routing (CDR) framework that dynamically determines the appropriate reasoning strategy based on query characteristics. Our approach addresses the current limitations where models either apply uniform reasoning depth or rely on computationally expensive methods for all queries. We introduce a meta-cognitive layer that analyzes query complexity through multiple dimensions: correlation strength between given information and required conclusions, domain boundary crossings, stakeholder multiplicity, and uncertainty levels. Through extensive experiments on diverse reasoning tasks, we demonstrate that CDR achieves superior performance while reducing computational costs by 34\% compared to uniform deep reasoning approaches. Our framework shows particular strength in professional judgment tasks, achieving 23\% improvement in consistency and 18\% better accuracy on expert-level evaluations. This work bridges cognitive science principles with practical AI system design, offering a principled approach to adaptive reasoning in LLMs.

artificial intelligence, large language model, natural language, (13 more...)

arXiv.org Artificial Intelligence

2508.16636

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Add feedback

Model Outputs def sum(x, y): " " "

Neural Information Processing SystemsAug-14-2025, 17:14:12 GMT

Using code generation as a case study, we find that OpenAI's

arxiv preprint arxiv, codex, language model, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
Africa (0.04)

Genre: Research Report > New Finding (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

EICAP: Deep Dive in Assessment and Enhancement of Large Language Models in Emotional Intelligence through Multi-Turn Conversations

Nazar, Nizi, Asgari, Ehsaneddin

arXiv.org Artificial IntelligenceAug-11-2025

Emotional Intelligence (EI) is a critical yet underexplored dimension in the development of human-aligned LLMs. To address this gap, we introduce a unified, psychologically grounded four-layer taxonomy of EI tailored for large language models (LLMs), encompassing emotional tracking, cause inference, appraisal, and emotionally appropriate response generation. Building on this framework, we present EICAP-Bench, a novel MCQ style multi-turn benchmark designed to evaluate EI capabilities in open-source LLMs across diverse linguistic and cultural contexts. We evaluate six LLMs: LLaMA3 (8B), LLaMA3-Instruct, Gemma (9B), Gemma-Instruct, Qwen2.5 (7B), and Qwen2.5-Instruct on EmoCap-Bench, identifying Qwen2.5-Instruct as the strongest baseline. To assess the potential for enhancing EI capabilities, we fine-tune both Qwen2.5-Base and Qwen2.5-Instruct using LoRA adapters on UltraChat (UC), a large-scale, instruction-tuned dialogue dataset, in both English and Arabic. Our statistical analysis reveals that among the five EI layers, only the Appraisal layer shows significant improvement through UC-based fine-tuning. These findings highlight the limitations of existing pretraining and instruction-tuning paradigms in equipping LLMs with deeper emotional reasoning and underscore the need for targeted data and modeling strategies for comprehensive EI alignment.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2508.06196

Country: Europe > United Kingdom > England (0.14)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Learning, Reasoning, Refinement: A Framework for Kahneman's Dual-System Intelligence in GUI Agents

Wei, Jinjie, Liu, Jiyao, Liu, Lihao, Hu, Ming, Ning, Junzhi, Li, Mingcheng, Yin, Weijie, He, Junjun, Liang, Xiao, Feng, Chao, Yang, Dingkang

arXiv.org Artificial IntelligenceJun-24-2025

Graphical User Interface (GUI) agents have made significant progress in automating digital tasks through the utilization of computer vision and language models. Nevertheless, existing agent systems encounter notable limitations. Firstly, they predominantly depend on trial and error decision making rather than progressive reasoning, thereby lacking the capability to learn and adapt from interactive encounters. Secondly, these systems are assessed using overly simplistic single step accuracy metrics, which do not adequately reflect the intricate nature of real world GUI interactions. In this paper, we present CogniGUI, a cognitive framework developed to overcome these limitations by enabling adaptive learning for GUI automation resembling human-like behavior. Inspired by Kahneman's Dual Process Theory, our approach combines two main components: (1) an omni parser engine that conducts immediate hierarchical parsing of GUI elements through quick visual semantic analysis to identify actionable components, and (2) a Group based Relative Policy Optimization (GRPO) grounding agent that assesses multiple interaction paths using a unique relative reward system, promoting minimal and efficient operational routes. This dual-system design facilitates iterative ''exploration learning mastery'' cycles, enabling the agent to enhance its strategies over time based on accumulated experience. Moreover, to assess the generalization and adaptability of agent systems, we introduce ScreenSeek, a comprehensive benchmark that includes multi application navigation, dynamic state transitions, and cross interface coherence, which are often overlooked challenges in current benchmarks. Experimental results demonstrate that CogniGUI surpasses state-of-the-art methods in both the current GUI grounding benchmarks and our newly proposed benchmark.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2506.17913

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(3 more...)

Add feedback

An Empirical Study of the Anchoring Effect in LLMs: Existence, Mechanism, and Potential Mitigations

Huang, Yiming, Bie, Biquan, Na, Zuqiu, Ruan, Weilin, Lei, Songxin, Yue, Yutao, He, Xinlei

arXiv.org Artificial IntelligenceMay-22-2025

The rise of Large Language Models (LLMs) like ChatGPT has advanced natural language processing, yet concerns about cognitive biases are growing. In this paper, we investigate the anchoring effect, a cognitive bias where the mind relies heavily on the first information as anchors to make affected judgments. We explore whether LLMs are affected by anchoring, the underlying mechanisms, and potential mitigation strategies. To facilitate studies at scale on the anchoring effect, we introduce a new dataset, SynAnchors. Combining refined evaluation metrics, we benchmark current widely used LLMs. Our findings show that LLMs' anchoring bias exists commonly with shallow-layer acting and is not eliminated by conventional strategies, while reasoning can offer some mitigation. This recontextualization via cognitive psychology urges that LLM evaluations focus not on standard benchmarks or over-optimized robustness tests, but on cognitive-bias-aware trustworthy evaluation.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.15392

Country:

Europe (0.28)
Asia > China (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Identifying Non-Replicable Social Science Studies with Language Models

Saynova, Denitsa, Hansson, Kajsa, Bruinsma, Bastiaan, Fredén, Annika, Johansson, Moa

arXiv.org Artificial IntelligenceMar-10-2025

In this study, we investigate whether LLMs can be used to indicate if a study in the behavioural social sciences is replicable. Using a dataset of 14 previously replicated studies (9 successful, 5 unsuccessful), we evaluate the ability of both open-source (Llama 3 8B, Qwen 2 7B, Mistral 7B) and proprietary (GPT-4o) instruction-tuned LLMs to discriminate between replicable and non-replicable findings. We use LLMs to generate synthetic samples of responses from behavioural studies and estimate whether the measured effects support the original findings. When compared with human replication results for these studies, we achieve F1 values of up to $77\%$ with Mistral 7B, $67\%$ with GPT-4o and Llama 3 8B, and $55\%$ with Qwen 2 7B, suggesting their potential for this task. We also analyse how effect size calculations are affected by sampling temperature and find that low variance (due to temperature) leads to biased effect estimates.

llama 0, mistral 0, qwen 0, (15 more...)

arXiv.org Artificial Intelligence

2503.10671

Country:

Asia > Middle East > Jordan (0.04)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.04)
Europe > Netherlands > Drenthe > Assen (0.04)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners

Jiang, Bowen, Xie, Yangxinyu, Hao, Zhuoqun, Wang, Xiaomeng, Mallick, Tanwi, Su, Weijie J., Taylor, Camillo J., Roth, Dan

arXiv.org Artificial IntelligenceJun-16-2024

This study introduces a hypothesis-testing framework to assess whether large language models (LLMs) possess genuine reasoning abilities or primarily depend on token bias. We go beyond evaluating LLMs on accuracy; rather, we aim to investigate their token bias in solving logical reasoning tasks. Specifically, we develop carefully controlled synthetic datasets, featuring conjunction fallacy and syllogistic problems. Our framework outlines a list of hypotheses where token biases are readily identifiable, with all null hypotheses assuming genuine reasoning capabilities of LLMs. The findings in this study suggest, with statistical guarantee, that most LLMs still struggle with logical reasoning. While they may perform well on classic problems, their success largely depends on recognizing superficial patterns with strong token bias, thereby raising concerns about their actual reasoning and generalization abilities.

experimental result, hypothesis, llm, (15 more...)

arXiv.org Artificial Intelligence

2406.1105

Country:

Europe > United Kingdom > England > Greater London > London > Wimbledon (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > New York (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.49)

Industry:

Media > News (0.68)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)
Government > Voting & Elections (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback