Dinitz, Michael
Binary Search with Distributional Predictions
Dinitz, Michael, Im, Sungjin, Lavastida, Thomas, Moseley, Benjamin, Niaparast, Aidin, Vassilvitskii, Sergei
Algorithms with (machine-learned) predictions is a powerful framework for combining traditional worst-case algorithms with modern machine learning. However, the vast majority of work in this space assumes that the prediction itself is non-probabilistic, even if it is generated by some stochastic process (such as a machine learning system). This is a poor fit for modern ML, particularly modern neural networks, which naturally generate a distribution. We initiate the study of algorithms with distributional predictions, where the prediction itself is a distribution. We focus on one of the simplest yet fundamental settings: binary search (or searching a sorted array). This setting has one of the simplest algorithms with a point prediction, but what happens if the prediction is a distribution? We show that this is a richer setting: there are simple distributions where using the classical prediction-based algorithm with any single prediction does poorly. Motivated by this, as our main result, we give an algorithm with query complexity $O(H(p) + \log \eta)$, where $H(p)$ is the entropy of the true distribution $p$ and $\eta$ is the earth mover's distance between $p$ and the predicted distribution $\hat p$. This also yields the first distributionally-robust algorithm for the classical problem of computing an optimal binary search tree given a distribution over target keys. We complement this with a lower bound showing that this query complexity is essentially optimal (up to constants), and experiments validating the practical usefulness of our algorithm.
Noise is All You Need: Private Second-Order Convergence of Noisy SGD
Avdiukhin, Dmitrii, Dinitz, Michael, Fan, Chenglin, Yaroslavtsev, Grigory
Private optimization is a topic of major interest in machine learning, with differentially private stochastic gradient descent (DP-SGD) playing a key role in both theory and practice. Furthermore, DP-SGD is known to be a powerful tool in contexts beyond privacy, including robustness, machine unlearning, etc. Existing analyses of DP-SGD either make relatively strong assumptions (e.g., Lipschitz continuity of the loss function, or even convexity) or prove only first-order convergence (and thus might end at a saddle point in the non-convex setting). At the same time, there has been progress in proving second-order convergence of the non-private version of ``noisy SGD'', as well as progress in designing algorithms that are more complex than DP-SGD and do guarantee second-order convergence. We revisit DP-SGD and show that ``noise is all you need'': the noise necessary for privacy already implies second-order convergence under the standard smoothness assumptions, even for non-Lipschitz loss functions. Hence, we get second-order convergence essentially for free: DP-SGD, the workhorse of modern private optimization, under minimal assumptions can be used to find a second-order stationary point.
Algorithms with Prediction Portfolios
Dinitz, Michael, Im, Sungjin, Lavastida, Thomas, Moseley, Benjamin, Vassilvitskii, Sergei
The research area of algorithms with predictions has seen recent success showing how to incorporate machine learning into algorithm design to improve performance when the predictions are correct, while retaining worst-case guarantees when they are not. Most previous work has assumed that the algorithm has access to a single predictor. However, in practice, there are many machine learning methods available, often with incomparable generalization guarantees, making it hard to pick a best method a priori. In this work we consider scenarios where multiple predictors are available to the algorithm and the question is how to best utilize them. Ideally, we would like the algorithm's performance to depend on the quality of the best predictor. However, utilizing more predictions comes with a cost, since we now have to identify which prediction is the best. We study the use of multiple predictors for a number of fundamental problems, including matching, load balancing, and non-clairvoyant scheduling, which have been well-studied in the single predictor setting. For each of these problems we introduce new algorithms that take advantage of multiple predictors, and prove bounds on the resulting performance.
Policy Regret in Repeated Games
Arora, Raman, Dinitz, Michael, Marinov, Teodor Vanislavov, Mohri, Mehryar
The notion of ``policy regret'' in online learning is supposed to capture the reactions of the adversary to the actions taken by the learner, which more traditional notions such as external regret do not take into account. We revisit this notion of policy regret, and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play which does well with respect to one must do poorly with respect to the other. We then focus on the game theoretic setting, when the adversary is a self-interested agent. In this setting we show that the external regret and policy regret are not in conflict, and in fact that a wide class of algorithms can ensure both as long as the adversary is also using such an algorithm. We also define a new notion of equilibrium which we call a ``policy equilibrium'', and show that no-policy regret algorithms will have play which converges to such an equilibrium. Relating this back to external regret, we show that coarse correlated equilibria (which no-external regret players will converge to) are a strict subset of policy equilibria. So in game-theoretic settings every sequence of play with no external regret also has no policy regret, but the converse is not true.
Policy Regret in Repeated Games
Arora, Raman, Dinitz, Michael, Marinov, Teodor Vanislavov, Mohri, Mehryar
The notion of ``policy regret'' in online learning is supposed to capture the reactions of the adversary to the actions taken by the learner, which more traditional notions such as external regret do not take into account. We revisit this notion of policy regret, and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play which does well with respect to one must do poorly with respect to the other. We then focus on the game theoretic setting, when the adversary is a self-interested agent. In this setting we show that the external regret and policy regret are not in conflict, and in fact that a wide class of algorithms can ensure both as long as the adversary is also using such an algorithm. We also define a new notion of equilibrium which we call a ``policy equilibrium'', and show that no-policy regret algorithms will have play which converges to such an equilibrium. Relating this back to external regret, we show that coarse correlated equilibria (which no-external regret players will converge to) are a strict subset of policy equilibria. So in game-theoretic settings every sequence of play with no external regret also has no policy regret, but the converse is not true.
Policy Regret in Repeated Games
Arora, Raman, Dinitz, Michael, Marinov, Teodor V., Mohri, Mehryar
The notion of \emph{policy regret} in online learning is a well defined? performance measure for the common scenario of adaptive adversaries, which more traditional quantities such as external regret do not take into account. We revisit the notion of policy regret and first show that there are online learning settings in which policy regret and external regret are incompatible: any sequence of play that achieves a favorable regret with respect to one definition must do poorly with respect to the other. We then focus on the game-theoretic setting where the adversary is a self-interested agent. In that setting, we show that external regret and policy regret are not in conflict and, in fact, that a wide class of algorithms can ensure a favorable regret with respect to both definitions, so long as the adversary is also using such an algorithm. We also show that the sequence of play of no-policy regret algorithms converges to a \emph{policy equilibrium}, a new notion of equilibrium that we introduce. Relating this back to external regret, we show that coarse correlated equilibria, which no-external regret players converge to, are a strict subset of policy equilibria. Thus, in game-theoretic settings, every sequence of play with no external regret also admits no policy regret, but the converse does not hold.