Goto

Collaborating Authors

 delegation


Ask not what AI can do, but what AI should do: Towards a framework of task delegability

Neural Information Processing Systems

While artificial intelligence (AI) holds promise for addressing societal challenges, issues of exactly which tasks to automate and to what extent to do so remain understudied. We approach this problem of task delegability from a human-centered perspective by developing a framework on human perception of task delegation to AI. We consider four high-level factors that can contribute to a delegation decision: motivation, difficulty, risk, and trust. To obtain an empirical understanding of human preferences in different tasks, we build a dataset of 100 tasks from academic papers, popular media portrayal of AI, and everyday life, and administer a survey based on our proposed framework. We find little preference for full AI control and a strong preference for machine-in-the-loop designs, in which humans play the leading role. Among the four factors, trust is the most correlated with human preferences of optimal human-machine delegation. This framework represents a first step towards characterizing human preferences of AI automation across tasks. We hope this work encourages future efforts towards understanding such individual attitudes; our goal is to inform the public and the AI research community rather than dictating any direction in technology development.


Delegated Classification

Neural Information Processing Systems

When machine learning is outsourced to a rational agent, conflicts of interest might arise and severely impact predictive performance. In this work, we propose a theoretical framework for incentive-aware delegation of machine learning tasks. We model delegation as a principal-agent game, in which accurate learning can be incentivized by the principal using performance-based contracts. Adapting the economic theory of contract design to this setting, we define budget-optimal contracts and prove they take a simple threshold form under reasonable assumptions. In the binary-action case, the optimality of such contracts is shown to be equivalent to the classic Neyman-Pearson lemma, establishing a formal connection between contract design and statistical hypothesis testing. Empirically, we demonstrate that budget-optimal contracts can be constructed using small-scale data, leveraging recent advances in the study of learning curves and scaling laws. Performance and economic outcomes are evaluated using synthetic and real-world classification tasks.





OpenID Connect for Agents (OIDC-A) 1.0: A Standard Extension for LLM-Based Agent Identity and Authorization

Nagabhushanaradhya, Subramanya

arXiv.org Artificial Intelligence

OpenID Connect for Agents (OIDC-A) 1.0 is an extension to OpenID Connect Core 1.0 that provides a comprehensive framework for representing, authenticating, and authorizing LLM-based agents within the OAuth 2.0 ecosystem. As autonomous AI agents become increasingly prevalent in digital systems, there is a critical need for standardized protocols to establish agent identity, verify agent attestation, represent delegation chains, and enable fine-grained authorization based on agent attributes. This specification defines standard claims, endpoints, and protocols that address these requirements while maintaining compatibility with existing OAuth 2.0 and OpenID Connect infrastructure. The proposed framework introduces mechanisms for agent identity representation, delegation chain validation, attestation verification, and capability-based authorization, providing a foundation for secure and trustworthy agent-to-service interactions in modern distributed systems.


AVEC: Bootstrapping Privacy for Local LLMs

Gaikwad, Madhava

arXiv.org Artificial Intelligence

This position paper presents A VEC (Adaptive Verifiable Edge Control), a framework for bootstrapping privacy for local language models by enforcing privacy at the edge with explicit verifiability for delegated queries. A VEC introduces an adaptive budgeting algorithm that allocates per-query differential privacy parameters based on sensitivity, local confidence, and historical usage, and uses verifiable transformation with on-device integrity checks. We formalize guarantees using R enyi differential privacy with odometer-based accounting, and establish utility ceilings, delegation-leakage bounds, and impossibility results for deterministic gating and hash-only certification. Our evaluation is simulation-based by design to study mechanism behavior and accounting; we do not claim deployment readiness or task-level utility with live LLMs. The contribution is a conceptual architecture and theoretical foundation that chart a pathway for empirical follow-up on privately bootstrapping local LLMs.


Ask not what AI can do, but what AI should do: Towards a framework of task delegability

Brian Lubars, Chenhao Tan

Neural Information Processing Systems

While artificial intelligence (AI) holds promise for addressing societal challenges, issues of exactly which tasks to automate and to what extent to do so remain understudied. We approach this problem of task delegability from a human-centered perspective by developing a framework on human perception of task delegation to AI. We consider four high-level factors that can contribute to a delegation decision: motivation, difficulty, risk, and trust. To obtain an empirical understanding of human preferences in different tasks, we build a dataset of 100 tasks from academic papers, popular media portrayal of AI, and everyday life, and administer a survey based on our proposed framework. We find little preference for full AI control and a strong preference for machine-in-the-loop designs, in which humans play the leading role. Among the four factors, trust is the most correlated with human preferences of optimal human-machine delegation. This framework represents a first step towards characterizing human preferences of AI automation across tasks. We hope this work encourages future efforts towards understanding such individual attitudes; our goal is to inform the public and the AI research community rather than dictating any direction in technology development.


Not someone, but something: Rethinking trust in the age of medical AI

Beger, Jan

arXiv.org Artificial Intelligence

As artificial intelligence (AI) becomes embedded in healthcare, trust in medical decision - making is changing fast. Nowhere is this shift more visible than in radiology, where AI tools are increasingly embedded across the imaging workflow -- from scheduling an d acquisition to interpretation, reporting, and communication with referrers and patients. This opinion paper argues that trust in AI isn't a simple transfer from humans to machines -- it's a dynamic, evolving relationship that must be built and maintained. R ather than debating whether AI belongs in medicine, it asks: what kind of trust must AI earn, and how? Drawing from philosophy, bioethics, and system design, it explores the key differences between human trust and machine reliability -- emphasizing transparen cy, accountability, and alignment with the values of good care. It argues that trust in AI shouldn't be built on mimicking empathy or intuition, but on thoughtful design, responsible deployment, and clear moral responsibility. The goal is a balanced view -- o ne that avoids blind optimism and reflexive fear. Trust in AI must be treated not as a given, but as something to be earned over time.


Bot Appétit! Exploring how Robot Morphology Shapes Perceived Affordances via a Mise en Place Scenario in a VR Kitchen

Ringe, Rachel, Thiele, Leandra, Pomarlan, Mihai, Zargham, Nima, Nolte, Robin, Hurrelbrink, Lars, Malaka, Rainer

arXiv.org Artificial Intelligence

This study explores which factors of the visual design of a robot may influence how humans would place it in a collaborative cooking scenario and how these features may influence task delegation. Human participants were placed in a Virtual Reality (VR) environment and asked to set up a kitchen for cooking alongside a robot companion while considering the robot's morphology. We collected multimodal data for the arrangements created by the participants, transcripts of their think-aloud as they were performing the task, and transcripts of their answers to structured post-task questionnaires. Based on analyzing this data, we formulate several hypotheses: humans prefer to collaborate with biomorphic robots; human beliefs about the sensory capabilities of robots are less influenced by the morphology of the robot than beliefs about action capabilities; and humans will implement fewer avoidance strategies when sharing space with gracile robots. We intend to verify these hypotheses in follow-up studies.