Recommendation on a Budget: Column Space Recovery from Partially Observed Entries with Random or Active Sampling
In many applications, we have data in the form of an incomplete matrix in which one dimension grows while the other stays fixed. For instance, in recommendation systems, there is a fixed set of potential products (rows of a matrix) to offer to customers who arrive over time (columns of a matrix). Three other applications are choosing machine learning models (rows) for each new customer's dataset (columns) [FSE18], choosing which survey questions (rows) to ask respondents (columns) who arrive sequentially [ZTCS19], and choosing which lab tests (rows) to order for each new patient (columns) [HL14]. In these cases, the budget is inherently asymmetric across the two dimensions: it constrains each column, not each row. We could choose any machine learning model and recommend it for each dataset, or choose any survey question and give it to every respondent, but it is very hard to run every machine learning pipeline on an arbitrary dataset, or to give every survey question to an arbitrary respondent (indeed, in [ZTCS19], respondents omitting too many answers was the precise motivation for the problem).
Feb-26-2020