stable matching
- Asia > Middle East > Jordan (0.04)
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Education (0.93)
- Banking & Finance (0.68)
- North America > United States > California (0.04)
- North America > United States > Texas (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Research Report > New Finding (0.93)
- Research Report > Experimental Study (0.93)
- North America > United States (0.14)
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- Europe > United Kingdom > England > West Yorkshire > Leeds (0.04)
- (2 more...)
- Transportation > Ground > Road (0.67)
- Transportation > Electric Vehicle (0.46)
- Education > Educational Setting (0.46)
- North America > Canada > Quebec > Montreal (0.04)
- North America > United States (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- (9 more...)
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- (2 more...)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (2 more...)
Match Made with Matrix Completion: Efficient Learning under Matching Interference
Tang, Zhiyuan, Chen, Wanning, Xu, Kan
Matching markets face increasing needs to learn the matching qualities between demand and supply for effective design of matching policies. In practice, the matching rewards are high-dimensional due to the growing diversity of participants. We leverage a natural low-rank matrix structure of the matching rewards in these two-sided markets, and propose to utilize matrix completion to accelerate reward learning with limited offline data. A unique property for matrix completion in this setting is that the entries of the reward matrix are observed with matching interference -- i.e., the entries are not observed independently but dependently due to matching or budget constraints. Such matching dependence renders unique technical challenges, such as sub-optimality or inapplicability of the existing analytical tools in the matrix completion literature, since they typically rely on sample independence. In this paper, we first show that standard nuclear norm regularization remains theoretically effective under matching interference. We provide a near-optimal Frobenius norm guarantee in this setting, coupled with a new analytical technique. Next, to guide certain matching decisions, we develop a novel ``double-enhanced'' estimator, based off the nuclear norm estimator, with a near-optimal entry-wise guarantee. Our double-enhancement procedure can apply to broader sampling schemes even with dependence, which may be of independent interest. Additionally, we extend our approach to online learning settings with matching constraints such as optimal matching and stable matching, and present improved regret bounds in matrix dimensions. Finally, we demonstrate the practical value of our methods using both synthetic data and real data of labor markets.
- North America > United States > Texas (0.04)
- North America > United States > Arizona (0.04)
- Asia > Middle East > Jordan (0.04)
Optimal Analysis for Bandit Learning in Matching Markets with Serial Dictatorship
The problem of two-sided matching markets is well-studied in computer science and economics, owing to its diverse applications across numerous domains. Since market participants are usually uncertain about their preferences in various online matching platforms, an emerging line of research is dedicated to the online setting where one-side participants (players) learn their unknown preferences through multiple rounds of interactions with the other side (arms). Sankararaman et al. provide an $Ω\left( \frac{N\log(T)}{Δ^2} + \frac{K\log(T)}Δ \right)$ regret lower bound for this problem under serial dictatorship assumption, where $N$ is the number of players, $K (\geq N)$ is the number of arms, $Δ$ is the minimum reward gap across players and arms, and $T$ is the time horizon. Serial dictatorship assumes arms have the same preferences, which is common in reality when one side participants have a unified evaluation standard. Recently, the work of Kong and Li proposes the ET-GS algorithm and achieves an $O\left( \frac{K\log(T)}{Δ^2} \right)$ regret upper bound, which is the best upper bound attained so far. Nonetheless, a gap between the lower and upper bounds, ranging from $N$ to $K$, persists. It remains unclear whether the lower bound or the upper bound needs to be improved. In this paper, we propose a multi-level successive selection algorithm that obtains an $O\left( \frac{N\log(T)}{Δ^2} + \frac{K\log(T)}Δ \right)$ regret bound when the market satisfies serial dictatorship. To the best of our knowledge, we are the first to propose an algorithm that matches the lower bound in the problem of matching markets with bandits.
- Asia > Middle East > Jordan (0.04)
- Europe > Kosovo > District of Gjilan > Kamenica (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Shanghai > Shanghai (0.04)