counterfactual thinking
Reflection-Bench: probing AI intelligence with reflection
Li, Lingyu, Wang, Yixu, Zhao, Haiquan, Kong, Shuqi, Teng, Yan, Li, Chunbo, Wang, Yingchun
The ability to adapt beliefs or behaviors in response to unexpected outcomes, reflection, is fundamental to intelligent systems' interaction with the world. From a cognitive science perspective, this serves as a core principle of intelligence applicable to both human and AI systems. To address the debate on the intelligence of large language models (LLMs), we propose Reflection-Bench, a comprehensive benchmark comprising 7 tasks spanning core cognitive functions crucial for reflection, including perception, memory, belief updating, decision-making, prediction, counterfactual thinking, and meta-reflection. We evaluate the performances of 13 prominent LLMs such as OpenAI o1, GPT-4, Claude 3.5 Sonnet, ...
Figure 1: Reflection, a fundamental process of intelligence, integrates various cognitive components. To achieve desired outcomes, an intelligent agent must predict the external world states and behavioral consequences based on prior beliefs. Post-action, discrepancies between prediction and observation are perceived, prompting an update of prior belief.
- North America > United States > Wisconsin (0.05)
- North America > United States > Iowa (0.05)
- Asia > China > Shanghai > Shanghai (0.04)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.88)
Evaluating the Impact of Pulse Oximetry Bias in Machine Learning under Counterfactual Thinking
Martins, Inês, Matos, João, Gonçalves, Tiago, Celi, Leo A., Wong, A. Ian, Cardoso, Jaime S.
Algorithmic bias in healthcare mirrors existing data biases. However, the factors driving unfairness are not always known. Medical devices capture significant amounts of data but are prone to errors; for instance, pulse oximeters overestimate the arterial oxygen saturation of darker-skinned individuals, leading to worse outcomes. The impact of this bias in machine learning (ML) models remains unclear. This study addresses the technical challenges of quantifying the impact of medical device bias in downstream ML. Our experiments compare a "perfect world", without pulse oximetry bias, using SaO2 (blood-gas), to the "actual world", with biased measurements, using SpO2 (pulse oximetry). Under this counterfactual design, two models are trained with identical data, features, and settings, except for the method of measuring oxygen saturation: models using SaO2 are a "control" and models using SpO2 a "treatment". The blood-gas oximetry linked dataset was a suitable test-bed, containing 163,396 nearly-simultaneous SpO2 - SaO2 paired measurements, aligned with a wide array of clinical features and outcomes. We studied three classification tasks: in-hospital mortality, respiratory SOFA score in the next 24 hours, and SOFA score increase by two points. Models using SaO2 instead of SpO2 generally showed better performance. Patients with overestimation of O2 by pulse oximetry of > 3% had significant decreases in mortality prediction recall, from 0.63 to 0.59, P < 0.001. This mirrors clinical processes where biased pulse oximetry readings provide clinicians with false reassurance of patients' oxygen levels. A similar degradation happened in ML models, with pulse oximetry biases leading to more false negatives in predicting adverse outcomes.
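The counterfactual comparison described above can be illustrated with a minimal sketch. It is not the study's pipeline: a fixed threshold rule stands in for the trained models, the six patients are invented toy values, and the names `flag_at_risk`, `sao2`, `spo2`, and `bias` are hypothetical. The point is only that an identical decision rule fed overestimated SpO2 readings misses positives that the same rule catches with SaO2, which lowers recall.

```python
def recall(y_true, y_pred):
    """Recall = true positives / all actual positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp / (tp + fn) if (tp + fn) else 0.0


def flag_at_risk(saturations, threshold=90.0):
    """Hypothetical stand-in for a model: flag a patient when saturation is below threshold."""
    return [s < threshold for s in saturations]


# Toy data, invented for illustration only (not from the study).
sao2 = [86.0, 88.0, 89.0, 93.0, 96.0, 98.0]   # "perfect world": blood-gas measurement
bias = [1.0, 4.0, 0.0, 2.0, 0.5, 0.0]         # assumed SpO2 overestimation, in points
spo2 = [a + b for a, b in zip(sao2, bias)]    # "actual world": pulse oximetry

# Toy outcome label derived from SaO2, so the control arm is perfect by construction.
outcome = [s < 90.0 for s in sao2]

# Same rule, same patients; only the oxygen measurement differs.
control_recall = recall(outcome, flag_at_risk(sao2))     # control arm (SaO2)
treatment_recall = recall(outcome, flag_at_risk(spo2))   # treatment arm (SpO2)

print(f"control recall:   {control_recall:.2f}")
print(f"treatment recall: {treatment_recall:.2f}")
```

In this toy setup the patient whose reading is overestimated by 4 points crosses the threshold and becomes a false negative in the treatment arm, which is the mechanism the abstract describes as false reassurance.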
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
3 Exercises to Boost Your Team's Creativity
Almost every business, of every size, across sectors, employs creativity training, from whiteboard brainstorming sessions to design thinking. It's a billion-dollar industry, and with good reason: Creativity is the main engine of innovation and entrepreneurship, and a major driver of resilience. Yet the training itself perpetuates expert bias and pseudo-innovation, and although it can temporarily boost morale, it does little over the long haul to reduce burnout. On the whole, research has shown this training to be at best inadequate and at worst counterproductive. To understand what's broken, and how to fix it, my lab partnered with teams at a variety of organizations, among them Silicon Valley startups, U.S. Special Operations, the University of Chicago Booth School of Business, and Fortune 50 companies.
- North America > United States > Illinois > Cook County > Chicago (0.25)
- North America > United States > California (0.25)
- North America > United States > New York (0.05)
- Asia > Middle East > Jordan (0.05)
Counterfactual thinking in cooperation dynamics
Pereira, Luis Moniz, Santos, Francisco C.
Counterfactual thinking is a human cognitive ability studied in a wide variety of domains. It captures the process of reasoning about a past event that did not occur, namely what would have happened had it occurred, or, conversely, about an event that did occur, namely what would have ensued had it not. Given the cognitive power that counterfactual reasoning confers on the individual, the question arises of how the presence of individuals with this capability may improve cooperation in populations of self-regarding individuals. Here we propose a mathematical model, grounded in Evolutionary Game Theory, to examine the population dynamics emerging from the interplay between counterfactual thinking and social learning (i.e., individuals who learn from the actions and success of others) whenever the individuals in the population face a collective dilemma. Our results suggest that counterfactual reasoning fosters coordination in collective action problems occurring in large populations, and has a limited impact on cooperation dilemmas in which coordination is not required. Moreover, we show that a small prevalence of individuals resorting to counterfactual thinking is enough to nudge an entire population towards highly cooperative standards.
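The contrast between social learning and counterfactual thinking can be sketched in a few lines. The abstract does not state the update rule, so this is an assumption: the sketch uses the pairwise-comparison (Fermi) rule, a common choice in evolutionary game theory, and the function names and the `beta` selection-intensity parameter are hypothetical. A social learner compares its payoff with that of another individual; a counterfactual thinker compares its payoff with the one it imagines it would have earned by acting differently.

```python
import math


def fermi(payoff_gain, beta=1.0):
    """Probability of switching strategy given the payoff gain from switching.

    beta is the selection intensity (assumed parameter): beta = 0 gives a coin
    flip, large beta makes switching nearly deterministic.
    """
    return 1.0 / (1.0 + math.exp(-beta * payoff_gain))


def social_learning_prob(own_payoff, model_payoff, beta=1.0):
    """Imitate another individual with probability increasing in their payoff advantage."""
    return fermi(model_payoff - own_payoff, beta)


def counterfactual_prob(own_payoff, imagined_payoff, beta=1.0):
    """Switch based on the payoff the individual imagines it would have obtained
    had it played the other strategy, with everyone else's behavior held fixed."""
    return fermi(imagined_payoff - own_payoff, beta)


# Hypothetical payoffs for illustration only.
p_imitate = social_learning_prob(own_payoff=1.0, model_payoff=3.0)
p_rethink = counterfactual_prob(own_payoff=3.0, imagined_payoff=1.0)
print(f"imitation probability: {p_imitate:.3f}")
print(f"counterfactual switch probability: {p_rethink:.3f}")
```

Both rules share the same functional form here; they differ only in where the comparison payoff comes from, which is what distinguishes the two kinds of individuals in this sketch.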
- Europe > Portugal > Lisbon > Lisbon (0.04)
- North America > United States (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Leisure & Entertainment > Games (0.88)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)