AITopics | decision task

Collaborating Authors

decision task

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Large Language Models Discriminate Against Speakers of German Dialects

Bui, Minh Duc, Holtermann, Carolin, Hofmann, Valentin, Lauscher, Anne, von der Wense, Katharina

arXiv.org Artificial IntelligenceSep-18-2025

Dialects represent a significant component of human culture and are found across all regions of the world. In Germany, more than 40% of the population speaks a regional dialect (Adler and Hansen, 2022). However, despite cultural importance, individuals speaking dialects often face negative societal stereotypes. We examine whether such stereotypes are mirrored by large language models (LLMs). We draw on the sociolinguistic literature on dialect perception to analyze traits commonly associated with dialect speakers. Based on these traits, we assess the dialect naming bias and dialect usage bias expressed by LLMs in two tasks: an association task and a decision task. To assess a model's dialect usage bias, we construct a novel evaluation corpus that pairs sentences from seven regional German dialects (e.g., Alemannic and Bavarian) with their standard German counterparts. We find that: (1) in the association task, all evaluated LLMs exhibit significant dialect naming and dialect usage bias against German dialect speakers, reflected in negative adjective associations; (2) all models reproduce these dialect naming and dialect usage biases in their decision making; and (3) contrary to prior work showing minimal bias with explicit demographic mentions, we find that explicitly labeling linguistic demographics--German dialect speakers--amplifies bias more than implicit cues like dialect usage.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2509.13835

Country:

Europe > Germany (1.00)
North America > United States > Minnesota (0.28)
Asia > Middle East > UAE (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)
Overview (0.92)

Industry:

Government (1.00)
Education (1.00)
Health & Medicine > Therapeutic Area (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Discrimination by LLMs: Cross-lingual Bias Assessment and Mitigation in Decision-Making and Summarisation

Huijzer, Willem, Chen, Jieying

arXiv.org Artificial IntelligenceSep-15-2025

The rapid integration of Large Language Models (LLMs) into various domains raises concerns about societal inequalities and information bias. This study examines biases in LLMs related to background, gender, and age, with a focus on their impact on decision-making and summarization tasks. Additionally, the research examines the cross-lingual propagation of these biases and evaluates the effectiveness of prompt-instructed mitigation strategies. Using an adapted version of the dataset by Tamkin et al. (2023) translated into Dutch, we created 151,200 unique prompts for the decision task and 176,400 for the summarisation task. Various demographic variables, instructions, salience levels, and languages were tested on GPT-3.5 and GPT-4o. Our analysis revealed that both models were significantly biased during decision-making, favouring female gender, younger ages, and certain backgrounds such as the African-American background. In contrast, the summarisation task showed minimal evidence of bias, though significant age-related differences emerged for GPT-3.5 in English. Cross-lingual analysis showed that bias patterns were broadly similar between English and Dutch, though notable differences were observed across specific demographic categories. The newly proposed mitigation instructions, while unable to eliminate biases completely, demonstrated potential in reducing them. The most effective instruction achieved a 27\% mean reduction in the gap between the most and least favorable demographics. Notably, contrary to GPT-3.5, GPT-4o displayed reduced biases for all prompts in English, indicating the specific potential for prompt-based mitigation within newer models. This research underscores the importance of cautious adoption of LLMs and context-specific bias testing, highlighting the need for continued development of effective mitigation strategies to ensure responsible deployment of AI.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.09735

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Government (1.00)
Banking & Finance (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Explanations are a means to an end

Hullman, Jessica, Guo, Ziyang, Ustun, Berk

arXiv.org Machine LearningJul-1-2025

Modern methods for explainable machine learning are designed to describe how models map inputs to outputs--without deep consideration of how these explanations will be used in practice. This paper argues that explanations should be designed and evaluated with a specific end in mind. We describe how to formalize this end in a framework based in statistical decision theory. We show how this functionally-grounded approach can be applied across diverse use cases, such as clinical decision support, providing recourse, or debugging. We demonstrate its use to characterize the maximum "boost" in performance on a particular task that an explanation could provide an idealized decision-maker, preventing misuse due to ambiguity by forcing researchers to specify concrete use cases that can be analyzed in light of models of expected explanation use. We argue that evaluation should meld theoretical and empirical perspectives on the value of explanation, and contribute definitions that span these perspectives.

decision support system, explanation, machine learning, (15 more...)

arXiv.org Machine Learning

2506.2274

Country:

North America > United States > Iowa > Story County > Ames (0.04)
North America > United States > Minnesota (0.04)
Europe > Austria > Vienna (0.04)
Asia > Malaysia (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.93)
Banking & Finance > Real Estate (0.68)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Add feedback

Learning to Represent Individual Differences for Choice Decision Making

Chen, Yan-Ying, Weng, Yue, Filipowicz, Alexandre, Iliev, Rumen, Chen, Francine, Hakimi, Shabnam, Zhang, Yanxia, Lee, Matthew, Lyons, Kent, Wu, Charlene

arXiv.org Artificial IntelligenceMar-27-2025

Human decision making can be challenging to predict because decisions are affected by a number of complex factors. Adding to this complexity, decision-making processes can differ considerably between individuals, and methods aimed at predicting human decisions need to take individual differences into account. Behavioral science offers methods by which to measure individual differences (e.g., questionnaires, behavioral models), but these are often narrowed down to low dimensions and not tailored to specific prediction tasks. This paper investigates the use of representation learning to measure individual differences from behavioral experiment data. Representation learning offers a flexible approach to create individual embeddings from data that are both structured (e.g., demographic information) and unstructured (e.g., free text), where the flexibility provides more options for individual difference measures for personalization, e.g., free text responses may allow for open-ended questions that are less privacy-sensitive. In the current paper we use representation learning to characterize individual differences in human performance on an economic decision-making task. We demonstrate that models using representation learning to capture individual differences consistently improve decision predictions over models without representation learning, and even outperform well-known theory-based behavioral models used in these environments. Our results propose that representation learning offers a useful and flexible tool to capture individual differences.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.21704

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Los Altos (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Small Language Models Also Work With Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas

Bunzeck, Bastian, Duran, Daniel, Schade, Leonie, Zarrieß, Sina

arXiv.org Artificial IntelligenceJan-3-2025

Recent work investigates whether LMs learn human-like linguistic generalizations and representations from developmentally plausible amounts of data. Yet, the basic linguistic units processed in these LMs are determined by subword-based tokenization, which limits their validity as models of learning at and below the word level. In this paper, we explore the potential of tokenization-free, phoneme- and grapheme-based language models. We demonstrate that small models based on the Llama architecture can achieve strong linguistic performance on standard syntactic and novel lexical/phonetic benchmarks when trained with character-level vocabularies. We further show that phoneme-based models almost match grapheme-based models in standard tasks and novel evaluations. Our findings suggest a promising direction for creating more linguistically plausible language models that are better suited for computational studies of language acquisition and processing.

computational linguistic, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2410.01487

Country:

North America > United States (0.93)
Europe (0.68)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Robust agents learn causal world models

Richens, Jonathan, Everitt, Tom

arXiv.org Artificial IntelligenceApr-15-2024

It has long been hypothesised that causal reasoning plays a fundamental role in robust and general intelligence. However, it is not known if agents must learn causal models in order to generalise to new domains, or if other inductive biases are sufficient. We answer this question, showing that any agent capable of satisfying a regret bound under a large set of distributional shifts must have learned an approximate causal model of the data generating process, which converges to the true causal model for optimal agents. We discuss the implications of this result for several research areas including transfer learning and causal inference.

agent, causal model, intervention, (15 more...)

arXiv.org Artificial Intelligence

2402.10877

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Linguistic Calibration of Language Models

Band, Neil, Li, Xuechen, Ma, Tengyu, Hashimoto, Tatsunori

arXiv.org Machine LearningMar-30-2024

Language models (LMs) may lead their users to make suboptimal downstream decisions when they confidently hallucinate. This issue can be mitigated by having the LM verbally convey the probability that its claims are correct, but existing models cannot produce text with calibrated confidence statements. Through the lens of decision-making, we formalize linguistic calibration for long-form generations: an LM is linguistically calibrated if its generations enable its users to make calibrated probabilistic predictions. This definition enables a training framework where a supervised finetuning step bootstraps an LM to emit long-form generations with confidence statements such as "I estimate a 30% chance of..." or "I am certain that...", followed by a reinforcement learning step which rewards generations that enable a user to provide calibrated answers to related questions. We linguistically calibrate Llama 2 7B and find in automated and human evaluations of long-form generations that it is significantly more calibrated than strong finetuned factuality baselines with comparable accuracy. These findings generalize under distribution shift on question-answering and under a significant task shift to person biography generation. Our results demonstrate that long-form generations may be calibrated end-to-end by constructing an objective in the space of the predictions that users make in downstream decision-making.

calibration, forecaster, long-form generation, (15 more...)

arXiv.org Machine Learning

2404.00474

Country:

South America > Colombia (0.04)
Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)
(12 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Health & Medicine (0.93)
Education (0.67)
Leisure & Entertainment > Sports > Motorsports (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)

Add feedback

A Statistical Framework for Measuring AI Reliance

Guo, Ziyang, Wu, Yifan, Hartline, Jason, Hullman, Jessica

arXiv.org Artificial IntelligenceJan-27-2024

Humans frequently make decisions with the aid of artificially intelligent (AI) systems. A common pattern is for the AI to recommend an action to the human who retains control over the final decision. Researchers have identified ensuring that a human has appropriate reliance on an AI as a critical component of achieving complementary performance. We argue that the current definition of appropriate reliance used in such research lacks formal statistical grounding and can lead to contradictions. We propose a formal definition of reliance, based on statistical decision theory, which separates the concepts of reliance as the probability the decision-maker follows the AI's prediction from challenges a human may face in differentiating the signals and forming accurate beliefs about the situation. Our definition gives rise to a framework that can be used to guide the design and interpretation of studies on human-AI complementarity and reliance. Using recent AI-advised decision making studies from literature, we demonstrate how our framework can be used to separate the loss due to mis-reliance from the loss due to not accurately differentiating the signals. We evaluate these losses by comparing to a baseline and a benchmark for complementary performance defined by the expected payoff achieved by a rational agent facing the same decision task as the behavioral agents.

ai prediction, benchmark, prediction, (15 more...)

arXiv.org Artificial Intelligence

2401.15356

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Malaysia (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

Add feedback

Decision Theoretic Foundations for Experiments Evaluating Human Decisions

Hullman, Jessica, Kale, Alex, Hartline, Jason

arXiv.org Artificial IntelligenceJan-25-2024

Decision-making with information displays is a key focus of research in areas like explainable AI, human-AI teaming, and data visualization. However, what constitutes a decision problem, and what is required for an experiment to be capable of concluding that human decisions are flawed in some way, remain open to speculation. We present a widely applicable definition of a decision problem synthesized from statistical decision theory and information economics. We argue that to attribute loss in human performance to forms of bias, an experiment must provide participants with the information that a rational agent would need to identify the normative decision. We evaluate the extent to which recent evaluations of decision-making from the literature on AI-assisted decisions achieve this criteria. We find that only 6 (17\%) of 35 studies that claim to identify biased behavior present participants with sufficient information to characterize their behavior as deviating from good decision-making. We motivate the value of studying well-defined decision problems by describing a characterization of performance losses they allow us to conceive. In contrast, the ambiguities of a poorly communicated decision problem preclude normative interpretation. We conclude with recommendations for practice.

decision problem, information, participant, (14 more...)

arXiv.org Artificial Intelligence

2401.15106

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Illinois > Cook County > Evanston (0.04)
Asia > Malaysia (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.46)
Government > Voting & Elections (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(4 more...)

Add feedback

Exploring Conversational Agents as an Effective Tool for Measuring Cognitive Biases in Decision-Making

Pilli, Stephen

arXiv.org Artificial IntelligenceJan-8-2024

Heuristics and cognitive biases are an integral part of human decision-making. Automatically detecting a particular cognitive bias could enable intelligent tools to provide better decision-support. Detecting the presence of a cognitive bias currently requires a hand-crafted experiment and human interpretation. Our research aims to explore conversational agents as an effective tool to measure various cognitive biases in different domains. Our proposed conversational agent incorporates a bias measurement mechanism that is informed by the existing experimental designs and various experimental tasks identified in the literature. Our initial experiments to measure framing and loss-aversion biases indicate that the conversational agents can be effectively used to measure the biases.

cognitive bias, decision task, decision-making, (12 more...)

arXiv.org Artificial Intelligence

2401.06686

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Hawaii (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education > Assessment & Standards > Measuring Intelligence (1.00)
Consumer Products & Services > Travel (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (1.00)

Add feedback