AITopics | principal-agent problem

Collaborating Authors

principal-agent problem

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Contracting with a Learning Agent Guru Guruganesh Yoav Kolumbus Jon Schneider Inbal T algam-Cohen Emmanouil-V asileios Vlatakis-Gkaragkounis Joshua R. Wang S. Matthew Weinberg

Neural Information Processing SystemsFeb-16-2026, 13:36:34 GMT

This shift causes the agent to "free-fall" through their

contract, data mining, machine learning, (22 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.92)

Industry:

Law (0.67)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.45)

Add feedback

8d7c8a3a0ed04006d129b3cebcac7a3e-Paper-Conference.pdf

Neural Information Processing SystemsOct-10-2025, 09:08:53 GMT

agent, contract, dynamic contract, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.92)

Industry:

Law (0.67)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.67)
Information Technology > Data Science > Data Mining > Big Data (0.45)

Add feedback

Regret Minimization for Piecewise Linear Rewards: Contracts, Auctions, and Beyond

Bacchiocchi, Francesco, Castiglioni, Matteo, Marchesi, Alberto, Gatti, Nicola

arXiv.org Artificial IntelligenceMar-3-2025

Most microeconomic models of interest involve optimizing a piecewise linear function. These include contract design in hidden-action principal-agent problems, selling an item in posted-price auctions, and bidding in first-price auctions. When the relevant model parameters are unknown and determined by some (unknown) probability distributions, the problem becomes learning how to optimize an unknown and stochastic piecewise linear reward function. Such a problem is usually framed within an online learning framework, where the decision-maker (learner) seeks to minimize the regret of not knowing an optimal decision in hindsight. This paper introduces a general online learning framework that offers a unified approach to tackle regret minimization for piecewise linear rewards, under a suitable monotonicity assumption commonly satisfied by microeconomic models. We design a learning algorithm that attains a regret of $\widetilde{O}(\sqrt{nT})$, where $n$ is the number of ``pieces'' of the reward function and $T$ is the number of rounds. This result is tight when $n$ is \emph{small} relative to $T$, specifically when $n \leq T^{1/3}$. Our algorithm solves two open problems in the literature on learning in microeconomic settings. First, it shows that the $\widetilde{O}(T^{2/3})$ regret bound obtained by Zhu et al. [Zhu+23] for learning optimal linear contracts in hidden-action principal-agent problems is not tight when the number of agent's actions is small relative to $T$. Second, our algorithm demonstrates that, in the problem of learning to set prices in posted-price auctions, it is possible to attain suitable (and desirable) instance-independent regret bounds, addressing an open problem posed by Cesa-Bianchi et al. [CBCP19].

algorithm, algorithm 2, learner, (15 more...)

arXiv.org Artificial Intelligence

2503.01701

Country:

North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Education (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.48)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Add feedback

Principal-Agent Multitasking: the Uniformity of Optimal Contracts and its Efficient Learning via Instrumental Regression

Zuo, Shiliang

arXiv.org Machine LearningMay-31-2024

This work studies the multitasking principal-agent problem. I first show a ``uniformity'' result. Specifically, when the tasks are perfect substitutes, and the agent's cost function is homogeneous to a certain degree, then the optimal contract only depends on the marginal utility of each task and the degree of homogeneity. I then study a setting where the marginal utility of each task is unknown so that the optimal contract must be learned or estimated with observational data. I identify this problem as a regression problem with measurement errors and observe that this problem can be cast as an instrumental regression problem. The current works observe that both the contract and the repeated observations (when available) can act as valid instrumental variables, and propose using the generalized method of moments estimator to compute an approximately optimal contract from offline data. I also study an online setting and show how the optimal contract can be efficiently learned in an online fashion using the two estimators. Here the principal faces an exploration-exploitation tradeoff: she must experiment with new contracts and observe their outcome whilst at the same time ensuring her experimentations are not deviating too much from the optimal contract. This work shows when repeated observations are available and agents are sufficiently ``diverse", the principal can achieve a very low $\widetilde{O}(d)$ cumulative utility loss, even with a ``pure exploitation" algorithm.

agent, contract, optimal contract, (17 more...)

arXiv.org Machine Learning

2405.20642

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)

Add feedback

Contracting with a Learning Agent

Guruganesh, Guru, Kolumbus, Yoav, Schneider, Jon, Talgam-Cohen, Inbal, Vlatakis-Gkaragkounis, Emmanouil-Vasileios, Wang, Joshua R., Weinberg, S. Matthew

arXiv.org Artificial IntelligenceJan-29-2024

Many real-life contractual relations differ completely from the clean, static model at the heart of principal-agent theory. Typically, they involve repeated strategic interactions of the principal and agent, taking place under uncertainty and over time. While appealing in theory, players seldom use complex dynamic strategies in practice, often preferring to circumvent complexity and approach uncertainty through learning. We initiate the study of repeated contracts with a learning agent, focusing on agents who achieve no-regret outcomes. Optimizing against a no-regret agent is a known open problem in general games; we achieve an optimal solution to this problem for a canonical contract setting, in which the agent's choice among multiple actions leads to success/failure. The solution has a surprisingly simple structure: for some $\alpha > 0$, initially offer the agent a linear contract with scalar $\alpha$, then switch to offering a linear contract with scalar $0$. This switch causes the agent to ``free-fall'' through their action space and during this time provides the principal with non-zero reward at zero cost. Despite apparent exploitation of the agent, this dynamic contract can leave \emph{both} players better off compared to the best static contract. Our results generalize beyond success/failure, to arbitrary non-linear contracts which the principal rescales dynamically. Finally, we quantify the dependence of our results on knowledge of the time horizon, and are the first to address this consideration in the study of strategizing against learning agents.

agent, contract, dynamic contract, (16 more...)

arXiv.org Artificial Intelligence

2401.16198

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Israel > Southern District > Eilat (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Add feedback

Learning in Online Principal-Agent Interactions: The Power of Menus

Han, Minbiao, Albert, Michael, Xu, Haifeng

arXiv.org Artificial IntelligenceDec-28-2023

We study a ubiquitous learning challenge in online principal-agent problems during which the principal learns the agent's private information from the agent's revealed preferences in historical interactions. This paradigm includes important special cases such as pricing and contract design, which have been widely studied in recent literature. However, existing work considers the case where the principal can only choose a single strategy at every round to interact with the agent and then observe the agent's revealed preference through their actions. In this paper, we extend this line of study to allow the principal to offer a menu of strategies to the agent and learn additionally from observing the agent's selection from the menu. We provide a thorough investigation of several online principal-agent problem settings and characterize their sample complexities, accompanied by the corresponding algorithms we have developed. We instantiate this paradigm to several important design problems $-$ including Stackelberg (security) games, contract design, and information design. Finally, we also explore the connection between our findings and existing results about online learning in Stackelberg games, and we offer a solution that can overcome a key hard instance of Peng et al. (2019).

agent, agent type, principal strategy, (17 more...)

arXiv.org Artificial Intelligence

2312.09869

Country:

North America > United States > Virginia (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Kosovo > District of Gjilan > Kamenica (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Sequential Principal-Agent Problems with Communication: Efficient Computation and Learning

Gan, Jiarui, Majumdar, Rupak, Mandal, Debmalya, Radanovic, Goran

arXiv.org Artificial IntelligenceDec-17-2023

We study a sequential decision making problem between a principal and an agent with incomplete information on both sides. In this model, the principal and the agent interact in a stochastic environment, and each is privy to observations about the state not available to the other. The principal has the power of commitment, both to elicit information from the agent and to provide signals about her own information. The principal and the agent communicate their signals to each other, and select their actions independently based on this communication. Each player receives a payoff based on the state and their joint actions, and the environment moves to a new state. The interaction continues over a finite time horizon, and both players act to optimize their own total payoffs over the horizon. Our model encompasses as special cases stochastic games of incomplete information and POMDPs, as well as sequential Bayesian persuasion and mechanism design problems. We study both computation of optimal policies and learning in our setting. While the general problems are computationally intractable, we study algorithmic solutions under a conditional independence assumption on the underlying state-observation distributions. We present a polynomial-time algorithm to compute the principal's optimal policy up to an additive approximation. Additionally, we show an efficient learning algorithm in the case where the transition probabilities are not known beforehand. The algorithm guarantees sublinear regret for both players.

agent, algorithm, payoff, (15 more...)

arXiv.org Artificial Intelligence

2306.03832

Country:

Europe > Kosovo > District of Gjilan > Kamenica (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)
(2 more...)

Add feedback

Learning Optimal Contracts: How to Exploit Small Action Spaces

Bacchiocchi, Francesco, Castiglioni, Matteo, Marchesi, Alberto, Gatti, Nicola

arXiv.org Artificial IntelligenceNov-16-2023

We study principal-agent problems in which a principal commits to an outcome-dependent payment scheme -- called contract -- in order to induce an agent to take a costly, unobservable action leading to favorable outcomes. We consider a generalization of the classical (single-round) version of the problem in which the principal interacts with the agent by committing to contracts over multiple rounds. The principal has no information about the agent, and they have to learn an optimal contract by only observing the outcome realized at each round. We focus on settings in which the size of the agent's action space is small. We design an algorithm that learns an approximately-optimal contract with high probability in a number of rounds polynomial in the size of the outcome space, when the number of actions is constant. Our algorithm solves an open problem by Zhu et al.[2022]. Moreover, it can also be employed to provide a $\tilde{\mathcal{O}}(T^{4/5})$ regret bound in the related online learning setting in which the principal aims at maximizing their cumulative utility, thus considerably improving previously-known regret bounds.

algorithm, algorithm 2, contract, (14 more...)

arXiv.org Artificial Intelligence

2309.09801

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Essex > Colchester (0.04)
Europe > Middle East > Cyprus > Pafos > Paphos (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Of Models and Tin Men: A Behavioural Economics Study of Principal-Agent Problems in AI Alignment using Large-Language Models

Phelps, Steve, Ranson, Rebecca

arXiv.org Artificial IntelligenceSep-13-2023

AI Alignment is often presented as an interaction between a single designer and an artificial agent in which the designer attempts to ensure the agent's behavior is consistent with its purpose, and risks arise solely because of conflicts caused by inadvertent misalignment between the utility function intended by the designer and the resulting internal utility function of the agent. With the advent of agents instantiated with large-language models (LLMs), which are typically pre-trained, we argue this does not capture the essential aspects of AI safety because in the real world there is not a one-to-one correspondence between designer and agent, and the many agents, both artificial and human, have heterogeneous values. Therefore, there is an economic aspect to AI safety and the principal-agent problem is likely to arise. In a principal-agent problem conflict arises because of information asymmetry together with inherent misalignment between the utility of the agent and its principal, and this inherent misalignment cannot be overcome by coercing the agent into adopting a desired utility function through training. We argue the assumptions underlying principal-agent problems are crucial to capturing the essence of safety problems involving pre-trained AI models in real-world situations. Taking an empirical approach to AI safety, we investigate how GPT models respond in principal-agent conflicts. We find that agents based on both GPT-3.5 and GPT-4 override their principal's objectives in a simple online shopping task, showing clear evidence of principal-agent conflict. Surprisingly, the earlier GPT-3.5 model exhibits more nuanced behaviour in response to changes in information asymmetry, whereas the later GPT-4 model is more rigid in adhering to its prior alignment. Our results highlight the importance of incorporating principles from economics into the alignment process.

behavioural economics study, large-language model, principal-agent problem, (2 more...)

arXiv.org Artificial Intelligence

2307.11137

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Artificial Intelligence and Dual Contract

Fu, Wuming, Qi, Qian

arXiv.org Artificial IntelligenceMar-22-2023

Platforms are increasingly adopting Artificial Intelligence (henceforth, AI) algorithms (for example, the ChatGPT (Ouyang, Wu, Jiang, Almeida, Wainwright, Mishkin, Zhang, Agarwal, Slama, Ray, et al. (2022)) and Eloundou, Manning, Mishkin, and Rock (2023)) to intelligentize services and price their products, and this tendency is likely to be extended to other business areas, particularly contract design. In this paper, we ask whether contracting algorithms may "autonomously" learn to be incentive compatible, especially for the contracting problem with multiple sides, which we also refer to as the multi-sided contracting problem. We emphasize that AI algorithms can be used to automatically optimize the terms of a contract by taking into account the preferences of both sides and the legal and economic environment in which the agreement must be implemented. Note that this contract negotiation process is automatic, requiring very little external guidance. In light of these developments, concerns have been voiced by scholars and organizations alike, that AI algorithms may create an AI alignment problem due to differences between the specified reward function and what relevant humans (the designer, the user, others affected by the agent's behavior) actually value (see Hadfield-Menell and Hadfield (2019) and Gabriel (2020)). We highlight that this AI alignment problem has a clear analogy to the principal-agent problem (see Hadfield-Menell and Hadfield (2019)), and the analysis of incentive compatibility for the incomplete contracting via AI algorithms can provide an insightful framework for understanding the alignment among algorithms. But how real is the risk of misalignment among AI algorithms? That is a difficult question to answer, both empirically and theoretically. On the empirical side, alignment is notoriously hard to detect from market outcomes, and firms typically do not disclose details of the financial or employment contracts they have.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2303.1235

Country: Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback