value-directed compression
Value-Directed Compression of POMDPs
We examine the problem of generating state-space compressions of POMDPs in a way that minimally impacts decision quality. We analyze the impact of compres- sions on decision quality, observing that compressions that allow accurate policy evaluation (prediction of expected future reward) will not affect decision qual- ity. We derive a set of sufficient conditions that ensure accurate prediction in this respect, illustrate interesting mathematical properties these confer on lossless lin- ear compressions, and use these to derive an iterative procedure for finding good linear lossy compressions. We also elaborate on how structured representations of a POMDP can be used to find such compressions.
Value-Directed Compression of Large-Scale Assignment Problems
Lu, Tyler (University of Toronto) | Boutilier, Craig (University of Toronto)
Data-driven analytics — in areas ranging from consumer marketing to public policy — often allow behavior prediction at the level of individuals rather than population segments , offering the opportunity to improve decisions that impact large populations. Modeling such (generalized) assignment problems as linear programs, we propose a general value-directed compression technique for solving such problems at scale. We dynamically segment the population into cells using a form of column generation, constructing groups of individuals who can provably be treated identically in the optimal solution. This compression allows problems, unsolvable using standard LP techniques, to be solved effectively. Indeed, once a compressed LP is constructed, problems can solved in milliseconds. We provide a theoretical analysis of themethods, outline the distributed implementation of the requisite data processing, and show how a single compressed LP can be used to solve multiple variants of the original LP near-optimally in real-time (e.g., tosupport scenario analysis). We also show how the method can be leveraged in integer programming models. Experimental results on marketing contact optimization and political legislature problems validate the performance of our technique.