A Proofs

Neural Information Processing Systems

This appendix collects all proofs omitted from the main text due to space limitations. In this section, we prove Proposition 2.3 and the feasibility of the projected mechanisms. Thus, orbit averaging is a projection onto the equivariant function space and fixes all equivariant functions. Proof of the feasibility of projected mechanisms. A.2 Proof of Theorem 3.1. In this section, we prove Theorem 3.1.
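Since only the conclusion of the orbit-averaging argument survives here, a short worked statement of the projection property may help; the finite group $G$ and the representations $\rho_{\mathrm{in}}, \rho_{\mathrm{out}}$ below are generic notation assumed for illustration, not necessarily the paper's.

$$\bar f(x) \;=\; \frac{1}{|G|}\sum_{g\in G}\rho_{\mathrm{out}}(g)^{-1}\, f\big(\rho_{\mathrm{in}}(g)\,x\big).$$

If $f$ is already equivariant, i.e. $f(\rho_{\mathrm{in}}(g)x)=\rho_{\mathrm{out}}(g)\,f(x)$ for all $g$, then every summand equals $f(x)$ and hence $\bar f=f$; since $\bar f$ is itself equivariant, averaging twice changes nothing, so the map $f\mapsto\bar f$ is idempotent, i.e. a projection onto the equivariant functions.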


Learning Strategy-Aware Linear Classifiers

Neural Information Processing Systems

We address the question of repeatedly learning linear classifiers against agents who are strategically trying to game the deployed classifiers, and we use the Stackelberg regret to measure the performance of our algorithms. First, we show that Stackelberg and external regret for the problem of strategic classification are strongly incompatible: i.e., there exist worst-case scenarios where any sequence of actions providing sublinear external regret might result in linear Stackelberg regret, and vice versa. Second, we present a strategy-aware algorithm for minimizing the Stackelberg regret for which we prove nearly matching upper and lower regret bounds. Finally, we provide simulations to complement our theoretical analysis. Our results advance the growing literature of learning from revealed preferences, which has so far focused on "smoother" assumptions from the perspective of the learner and the agents respectively.
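To make the Stackelberg benchmark concrete, here is a minimal sketch of how an agent might best-respond to a deployed linear classifier and how the resulting strategic loss could be measured; the bounded-L2 manipulation cost, the helper names, and the 0/1 loss are illustrative assumptions rather than the paper's exact model.

```python
import numpy as np

def best_response(x, w, budget=1.0):
    """Agent moves its feature vector (within an L2 budget) across the
    decision boundary of classifier w when that is possible and beneficial."""
    score = w @ x
    if score >= 0:                 # already classified positive: no move needed
        return x
    dist = -score / np.linalg.norm(w)      # minimal L2 move onto the hyperplane
    if dist <= budget:
        return x + (dist + 1e-6) * w / np.linalg.norm(w)
    return x                       # boundary is out of reach: agent stays put

def stackelberg_loss(w, agents):
    """0/1 loss of classifier w evaluated on the agents' best responses."""
    losses = []
    for x, y in agents:            # y in {-1, +1}
        pred = 1 if w @ best_response(x, w) >= 0 else -1
        losses.append(int(pred != y))
    return np.mean(losses)

# Stackelberg regret over T rounds compares the cumulative loss of the deployed
# classifiers against the single best classifier in hindsight, where *both*
# are evaluated against best-responding agents.
```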


Two-Sided Manipulation Games in Stable Matching Markets

Hosseini, Hadi, Lisowski, Grzegorz, Pathak, Shraddha

arXiv.org Artificial Intelligence

The Deferred Acceptance (DA) algorithm is an elegant procedure for finding a stable matching in two-sided matching markets. It ensures that no pair of agents prefers each other to their matched partners. In this work, we initiate the study of two-sided manipulations in matching markets as non-cooperative games. We introduce the accomplice manipulation game, where a man misreports to help a specific woman obtain a better partner, whenever possible. We provide a polynomial time algorithm for finding a pure strategy Nash equilibrium (NE) and show that our algorithm always yields a stable matching - although not every Nash equilibrium corresponds to a stable matching. Additionally, we show how our analytical techniques for the accomplice manipulation game can be applied to other manipulation games in matching markets, such as one-for-many and the standard self-manipulation games. We complement our theoretical findings with empirical evaluations of different properties of the resulting NE, such as the welfare of the agents.
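For reference, a minimal sketch of the man-proposing Deferred Acceptance procedure on which the manipulation games are built; the dictionary-based preference format and complete preference lists are assumptions made for readability.

```python
def deferred_acceptance(men_prefs, women_prefs):
    """Man-proposing DA (Gale-Shapley).
    men_prefs[m]   : list of women, most preferred first (complete lists)
    women_prefs[w] : list of men,   most preferred first (complete lists)
    Returns a stable matching as a dict woman -> man."""
    rank = {w: {m: i for i, m in enumerate(p)} for w, p in women_prefs.items()}
    next_proposal = {m: 0 for m in men_prefs}   # index of next woman to propose to
    engaged = {}                                # woman -> man
    free_men = list(men_prefs)
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_proposal[m]]
        next_proposal[m] += 1
        if w not in engaged:
            engaged[w] = m                          # w accepts her first proposal
        elif rank[w][m] < rank[w][engaged[w]]:
            free_men.append(engaged[w])             # w trades up, frees her partner
            engaged[w] = m
        else:
            free_men.append(m)                      # w rejects m
    return engaged
```

An accomplice manipulation in this setting then amounts to rerunning the same procedure with a single man's preference list misreported so that a designated woman ends up with a better partner.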


Estimating Misreporting in the Presence of Genuine Modification: A Causal Perspective

Zapzalka, Dylan, Chang, Trenton, Warrenburg, Lindsay, Park, Sae-Hwan, Shenfeld, Daniel K., Parikh, Ravi B., Wiens, Jenna, Makar, Maggie

arXiv.org Artificial Intelligence

In settings where ML models are used to inform the allocation of resources, agents affected by the allocation decisions might have an incentive to strategically change their features to secure better outcomes. While prior work has studied strategic responses broadly, disentangling misreporting from genuine modification remains a fundamental challenge. In this paper, we propose a causally-motivated approach to identify and quantify how much an agent misreports on average by distinguishing deceptive changes in their features from genuine modification. Our key insight is that, unlike genuine modification, misreported features do not causally affect downstream variables (i.e., causal descendants). We exploit this asymmetry by comparing the causal effect of misreported features on their causal descendants as derived from manipulated datasets against those from unmanipulated datasets. We formally prove identifiability of the misreporting rate and characterize the variance of our estimator. We empirically validate our theoretical results using a semi-synthetic and real Medicare dataset with misreported data, demonstrating that our approach can be employed to identify misreporting in real-world scenarios.
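The identification idea lends itself to a small synthetic illustration: when a downstream variable is caused by the true feature, the regression effect of the reported feature on that descendant shrinks in proportion to the misreporting rate, so comparing effects across a clean and a manipulated dataset recovers the rate. The linear data-generating process and variable names below are toy assumptions; the paper's estimator and identifiability conditions are more general.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, misreport_rate = 50_000, 2.0, 0.3

# Unmanipulated world: reported feature equals the true feature, and the
# causal descendant d is generated from the true feature.
x_true = rng.normal(size=n)
d_clean = beta * x_true + rng.normal(size=n)
effect_clean = np.polyfit(x_true, d_clean, 1)[0]        # slope ~ beta

# Manipulated world: a fraction of agents misreport x, but d still depends on
# the *true* feature, so the reported feature loses part of its causal effect.
misreported = rng.random(n) < misreport_rate
x_reported = np.where(misreported, rng.normal(size=n), x_true)
d_manip = beta * x_true + rng.normal(size=n)
effect_manip = np.polyfit(x_reported, d_manip, 1)[0]    # slope ~ (1 - rate) * beta

estimated_rate = 1 - effect_manip / effect_clean
print(f"estimated misreporting rate: {estimated_rate:.2f}")   # close to 0.3
```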


Online Housing Market

Lesca, Julien

arXiv.org Artificial Intelligence

This paper studies an online variant of the celebrated housing market problem, where each agent has a single house and seeks to exchange it for another based on her preferences. In this online setting, agents may arrive and depart at any time, meaning that not all agents are present on the housing market simultaneously. I extend the well-known serial dictatorship and Gale's top trading cycle mechanisms to this online scenario, aiming to retain their desirable properties such as Pareto efficiency, individual rationality, and strategy-proofness. These extensions also seek to prevent agents from strategically delaying their arrival or advancing their departure. I demonstrate that achieving all of these properties simultaneously is impossible in the online context, and I present several variants that achieve different subsets of these properties.
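As offline background for these online extensions, here is a minimal sketch of Gale's Top Trading Cycle; it assumes every agent ranks all houses (including their own, identified with their owners) and omits the arrival/departure dynamics the paper adds.

```python
def top_trading_cycle(prefs):
    """Gale's TTC for the (offline) housing market.
    prefs[a] : list of agents whose houses agent a prefers, best first,
               covering every agent including a itself.
    Returns the allocation as a dict agent -> agent whose house they receive."""
    remaining = set(prefs)
    allocation = {}
    while remaining:
        # Each remaining agent points at the owner of its favourite remaining house.
        points_to = {a: next(h for h in prefs[a] if h in remaining)
                     for a in remaining}
        # Follow pointers from an arbitrary agent until a cycle closes.
        a, seen = next(iter(remaining)), []
        while a not in seen:
            seen.append(a)
            a = points_to[a]
        cycle = seen[seen.index(a):]
        # Everyone on the cycle receives the house they point at and leaves.
        for b in cycle:
            allocation[b] = points_to[b]
            remaining.remove(b)
    return allocation
```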


Truthful mechanisms for linear bandit games with private contexts

Hu, Yiting, Duan, Lingjie

arXiv.org Artificial Intelligence

The contextual bandit problem, where agents arrive sequentially with personal contexts and the system adapts its arm allocation decisions accordingly, has recently garnered increasing attention for enabling more personalized outcomes. However, in many healthcare and recommendation applications, agents have private profiles and may misreport their contexts to gain from the system. For example, in adaptive clinical trials, where hospitals sequentially recruit volunteers to test multiple new treatments and adjust plans based on volunteers' reported profiles such as symptoms and interim data, participants may misreport severe side effects like allergy and nausea to avoid perceived suboptimal treatments. We are the first to study this issue of private context misreporting in a stochastic contextual bandit game between the system and non-repeated agents. We show that traditional low-regret algorithms, such as UCB-family algorithms and Thompson sampling, fail to ensure truthful reporting and can result in linear regret in the worst case, while traditional truthful algorithms like explore-then-commit (ETC) and the $\epsilon$-greedy algorithm incur sublinear but high regret. We propose a mechanism that uses a linear program to ensure truthfulness while minimizing deviation from Thompson sampling, yielding an $O(\ln T)$ frequentist regret. Our numerical experiments further demonstrate strong performance in multiple contexts and across other distribution families.
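A small sketch of the "stay close to Thompson sampling subject to constraints" idea: given the arm probabilities q that Thompson sampling would use for a reported context, a linear program finds the nearest (in L1 distance) probability vector that also satisfies a set of linear incentive constraints. The constraint matrix passed in below is a placeholder; the actual truthfulness constraints are derived in the paper from the agents' possible misreports.

```python
import numpy as np
from scipy.optimize import linprog

def truthful_allocation(q, A_ic, b_ic):
    """Find arm probabilities p minimizing sum_i |p_i - q_i| subject to
    incentive constraints A_ic @ p >= b_ic and p being a distribution.
    Decision variables are stacked as [p, t] with |p_i - q_i| <= t_i."""
    k = len(q)
    c = np.concatenate([np.zeros(k), np.ones(k)])            # minimize sum of t
    A_ub = np.block([[ np.eye(k), -np.eye(k)],                #  p - q <= t
                     [-np.eye(k), -np.eye(k)],                #  q - p <= t
                     [-A_ic,       np.zeros_like(A_ic)]])     #  A_ic @ p >= b_ic
    b_ub = np.concatenate([q, -q, -b_ic])
    A_eq = np.concatenate([np.ones(k), np.zeros(k)])[None]    # probabilities sum to 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, 1)] * k + [(0, None)] * k, method="highs")
    return res.x[:k]

# Example: Thompson sampling favours arm 0, but a (placeholder) constraint
# requires every arm to be played with probability at least 0.1.
q = np.array([0.8, 0.15, 0.05])
p = truthful_allocation(q, A_ic=np.eye(3), b_ic=np.full(3, 0.1))
print(p.round(3))   # closest distribution to q with all entries >= 0.1
```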