Goto

Collaborating Authors

 Jordan, Michael I.


The Sample Complexity of Online Contract Design

arXiv.org Artificial Intelligence

We study the hidden-action principal-agent problem in an online setting. In each round, the principal posts a contract that specifies the payment to the agent based on each outcome. The agent then makes a strategic choice of action that maximizes her own utility, but the action is not directly observable by the principal. The principal observes the outcome and receives utility from the agent's choice of action. Based on past observations, the principal dynamically adjusts the contracts with the goal of maximizing her utility. We introduce an online learning algorithm and provide an upper bound on its Stackelberg regret. We show that when the contract space is $[0,1]^m$, the Stackelberg regret is upper bounded by $\widetilde O(\sqrt{m} \cdot T^{1-1/(2m+1)})$, and lower bounded by $\Omega(T^{1-1/(m+2)})$, where $\widetilde O$ omits logarithmic factors. This result shows that exponential-in-$m$ samples are sufficient and necessary to learn a near-optimal contract, resolving an open problem on the hardness of online contract design. Moreover, when contracts are restricted to some subset $\mathcal{F} \subset [0,1]^m$, we define an intrinsic dimension of $\mathcal{F}$ that depends on the covering number of the spherical code in the space and bound the regret in terms of this intrinsic dimension. When $\mathcal{F}$ is the family of linear contracts, we show that the Stackelberg regret grows exactly as $\Theta(T^{2/3})$. The contract design problem is challenging because the utility function is discontinuous. Bounding the discretization error in this setting has been an open problem. In this paper, we identify a limited set of directions in which the utility function is continuous, allowing us to design a new discretization method and bound its error. This approach enables the first upper bound with no restrictions on the contract and action space.


Online Learning in a Creator Economy

arXiv.org Artificial Intelligence

The creator economy refers to a rapidly growing online-platform-facilitated economy that brings together content creators and users, allowing creators to earn revenue from their creations [Banks and Humphreys, 2008, Bhargava, 2022, El Sanyoura and Anderson, 2022, Radionova and Trots, 2021, Schram, 2020]. These platforms monetize the content created by content creators through various means, including paid audience partnerships, ad revenue, tipping platforms, and product sales provided by the users. The creator economy can be viewed as a three-party game linking users, platform, and content creators. On the one hand, we can model the interactions between the platform and the content creator as a principal-agent relationship, focusing on the need to incentivize the production of high-quality content by the content creator. The platform would like to collect better content to enhance the desirability of the platform. The content creator wishes to gain profit on the platform from their content. The two sides develop agreements in the form of a contract, which specifies how much the platform would pay under the different possible kinds of content. By learning the intent and interest of the content creator, the platform is able to identify a better way to incentivize participation and share the profit with the content creator. The overall framework is contract theory, which is a branch of the theory of incentives [Bolton and Dewatripont, 2004, Faure-Grimaud et al., 2001, Grossman and Hart, 1992, Salaniรฉ, 2005].


Online Learning in Stackelberg Games with an Omniscient Follower

arXiv.org Artificial Intelligence

We study the problem of online learning in a two-player decentralized cooperative Stackelberg game. In each round, the leader first takes an action, followed by the follower who takes their action after observing the leader's move. The goal of the leader is to learn to minimize the cumulative regret based on the history of interactions. Differing from the traditional formulation of repeated Stackelberg games, we assume the follower is omniscient, with full knowledge of the true reward, and that they always best-respond to the leader's actions. We analyze the sample complexity of regret minimization in this repeated Stackelberg game. We show that depending on the reward structure, the existence of the omniscient follower may change the sample complexity drastically, from constant to exponential, even for linear cooperative Stackelberg games. This poses unique challenges for the learning process of the leader and the subsequent regret analysis.


A Primal-dual Approach for Solving Variational Inequalities with General-form Constraints

arXiv.org Artificial Intelligence

Yang et al. (2023) recently addressed the open problem of solving Variational Inequalities (VIs) with equality and inequality constraints through a first-order gradient method. However, the proposed primal-dual method called ACVI is applicable when we can compute analytic solutions of its subproblems; thus, the general case remains an open problem. In this paper, we adopt a warm-starting technique where we solve the subproblems approximately at each iteration and initialize the variables with the approximate solution found at the previous iteration. We prove its convergence and show that the gap function of the last iterate of this inexact-ACVI method decreases at a rate of $\mathcal{O}(\frac{1}{\sqrt{K}})$ when the operator is $L$-Lipschitz and monotone, provided that the errors decrease at appropriate rates. Interestingly, we show that often in numerical experiments, this technique converges faster than its exact counterpart. Furthermore, for the cases when the inequality constraints are simple, we propose a variant of ACVI named P-ACVI and prove its convergence for the same setting. We further demonstrate the efficacy of the proposed methods through numerous experiments. We also relax the assumptions in Yang et al., yielding, to our knowledge, the first convergence result that does not rely on the assumption that the operator is $L$-Lipschitz. Our source code is provided at $\texttt{https://github.com/mpagli/Revisiting-ACVI}$.


Valid Inference after Causal Discovery

arXiv.org Artificial Intelligence

Causal discovery and causal estimation are fundamental tasks in causal reasoning and decision-making. Causal discovery aims to identify the underlying structure of the causal problem, often in the form of a graphical representation which makes explicit which variables causally influence which other variables, while causal estimation aims to quantify the magnitude of the effect of one variable on another. These two goals frequently go hand in hand: quantifying causal effects requires adjustments that rely on either assuming or discovering the underlying graphical structure. Methodologies for causal discovery and causal estimation have mostly been developed separately, and the statistical challenges that arise when solving these problems jointly have largely been overlooked. Indeed, a naive black-box combination of causal discovery algorithms and standard inference methods for causal effects suffers from "double dipping." That is, classical confidence intervals, such as those used for linear regression coefficients, need no longer cover the target estimand if the causal structure is not fixed a priori but is estimated on the same data used to compute the intervals.


Byzantine-Robust Federated Learning with Optimal Statistical Rates and Privacy Guarantees

arXiv.org Artificial Intelligence

We propose Byzantine-robust federated learning protocols with nearly optimal statistical rates. In contrast to prior work, our proposed protocols improve the dimension dependence and achieve a tight statistical rate in terms of all the parameters for strongly convex losses. We benchmark against competing protocols and show the empirical superiority of the proposed protocols. Finally, we remark that our protocols with bucketing can be naturally combined with privacy-guaranteeing procedures to introduce security against a semi-honest server. The code for evaluation is provided in https://github.com/wanglun1996/secure-robust-federated-learning.


Solving Constrained Variational Inequalities via a First-order Interior Point-based Method

arXiv.org Artificial Intelligence

We develop an interior-point approach to solve constrained variational inequality (cVI) problems. Inspired by the efficacy of the alternating direction method of multipliers (ADMM) method in the single-objective context, we generalize ADMM to derive a first-order method for cVIs, that we refer to as ADMM-based interior-point method for constrained VIs (ACVI). We provide convergence guarantees for ACVI in two general classes of problems: (i) when the operator is $\xi$-monotone, and (ii) when it is monotone, some constraints are active and the game is not purely rotational. When the operator is, in addition, L-Lipschitz for the latter case, we match known lower bounds on rates for the gap function of $\mathcal{O}(1/\sqrt{K})$ and $\mathcal{O}(1/K)$ for the last and average iterate, respectively. To the best of our knowledge, this is the first presentation of a first-order interior-point method for the general cVI problem that has a global convergence guarantee. Moreover, unlike previous work in this setting, ACVI provides a means to solve cVIs when the constraints are nontrivial. Empirical analyses demonstrate clear advantages of ACVI over common first-order methods. In particular, (i) cyclical behavior is notably reduced as our methods approach the solution from the analytic center, and (ii) unlike projection-based methods that zigzag when near a constraint, ACVI efficiently handles the constraints.


On-Demand Sampling: Learning Optimally from Multiple Distributions

arXiv.org Artificial Intelligence

Social and real-world considerations such as robustness, fairness, social welfare and multi-agent tradeoffs have given rise to multi-distribution learning paradigms, such as collaborative, group distributionally robust, and fair federated learning. In each of these settings, a learner seeks to minimize its worst-case loss over a set of $n$ predefined distributions, while using as few samples as possible. In this paper, we establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity. Importantly, our sample complexity bounds exceed that of the sample complexity of learning a single distribution only by an additive factor of $n \log(n) / \epsilon^2$. These improve upon the best known sample complexity of agnostic federated learning by Mohri et al. by a multiplicative factor of $n$, the sample complexity of collaborative learning by Nguyen and Zakynthinou by a multiplicative factor $\log n / \epsilon^3$, and give the first sample complexity bounds for the group DRO objective of Sagawa et al. To achieve optimal sample complexity, our algorithms learn to sample and learn from distributions on demand. Our algorithm design and analysis is enabled by our extensions of stochastic optimization techniques for solving stochastic zero-sum games. In particular, we contribute variants of Stochastic Mirror Descent that can trade off between players' access to cheap one-off samples or more expensive reusable ones.


Finding Regularized Competitive Equilibria of Heterogeneous Agent Macroeconomic Models with Reinforcement Learning

arXiv.org Artificial Intelligence

We study a heterogeneous agent macroeconomic model with an infinite number of households and firms competing in a labor market. Each household earns income and engages in consumption at each time step while aiming to maximize a concave utility subject to the underlying market conditions. The households aim to find the optimal saving strategy that maximizes their discounted cumulative utility given the market condition, while the firms determine the market conditions through maximizing corporate profit based on the household population behavior. The model captures a wide range of applications in macroeconomic studies, and we propose a data-driven reinforcement learning framework that finds the regularized competitive equilibrium of the model. The proposed algorithm enjoys theoretical guarantees in converging to the equilibrium of the market at a sub-linear rate.


Deterministic Nonsmooth Nonconvex Optimization

arXiv.org Artificial Intelligence

We study the complexity of optimizing nonsmooth nonconvex Lipschitz functions by producing $(\delta,\epsilon)$-stationary points. Several recent works have presented randomized algorithms that produce such points using $\tilde O(\delta^{-1}\epsilon^{-3})$ first-order oracle calls, independent of the dimension $d$. It has been an open problem as to whether a similar result can be obtained via a deterministic algorithm. We resolve this open problem, showing that randomization is necessary to obtain a dimension-free rate. In particular, we prove a lower bound of $\Omega(d)$ for any deterministic algorithm. Moreover, we show that unlike smooth or convex optimization, access to function values is required for any deterministic algorithm to halt within any finite time. On the other hand, we prove that if the function is even slightly smooth, then the dimension-free rate of $\tilde O(\delta^{-1}\epsilon^{-3})$ can be obtained by a deterministic algorithm with merely a logarithmic dependence on the smoothness parameter. Motivated by these findings, we turn to study the complexity of deterministically smoothing Lipschitz functions. Though there are efficient black-box randomized smoothings, we start by showing that no such deterministic procedure can smooth functions in a meaningful manner, resolving an open question. We then bypass this impossibility result for the structured case of ReLU neural networks. To that end, in a practical white-box setting in which the optimizer is granted access to the network's architecture, we propose a simple, dimension-free, deterministic smoothing that provably preserves $(\delta,\epsilon)$-stationary points. Our method applies to a variety of architectures of arbitrary depth, including ResNets and ConvNets. Combined with our algorithm, this yields the first deterministic dimension-free algorithm for optimizing ReLU networks, circumventing our lower bound.