Reviews: Post: Device Placement with Cross-Entropy Minimization and Proximal Policy Optimization

Neural Information Processing Systems 

This is great work, as it tackles an important problem: graph partitioning in heterogeneous/multi-device settings. A growing number of problems could benefit from resource allocation optimization techniques such as the one described in this work. ML, and specifically RL, techniques have recently been developed to solve the device placement problem. This work addresses one of the main deficiencies of the prior work by being more sample efficient (as demonstrated by the empirical results). The novelty is in the way the placement parameters are trained: as opposed to directly training a placement policy for the best runtime, a softmax is used to model the distribution of op placements on devices (for each device among the pool of available devices).
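To make the last point concrete, here is a minimal sketch of what a softmax placement distribution could look like. It is not the authors' implementation. It assumes one independent categorical distribution per op, and it uses the standard closed-form cross-entropy-method step for categorical distributions (refit to the lowest-runtime samples), which is my assumption about how the cross-entropy part might be realized. All names (`sample_placements`, `elite_update`, `toy_runtime`) and the load-balancing runtime model are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def sample_placements(logits, num_samples, rng):
    """Draw placements from per-op softmax distributions.

    logits has shape (num_ops, num_devices). The result has shape
    (num_samples, num_ops); entry [s, i] is the device index of op i.
    """
    probs = softmax(logits)
    num_ops, num_devices = probs.shape
    columns = [
        rng.choice(num_devices, size=num_samples, p=probs[i])
        for i in range(num_ops)
    ]
    return np.stack(columns, axis=1)


def elite_update(placements, runtimes, num_devices, elite_frac=0.2, eps=1e-3):
    """Refit the logits to the fastest (elite) sampled placements.

    Assumed update rule: for a categorical distribution, minimizing the
    cross-entropy to the elite samples gives their empirical frequencies.
    eps smooths the counts so that no device gets probability zero.
    """
    k = max(1, int(len(runtimes) * elite_frac))
    elite = placements[np.argsort(runtimes)[:k]]          # (k, num_ops)
    counts = np.stack(
        [(elite == d).sum(axis=0) for d in range(num_devices)], axis=1
    )                                                     # (num_ops, num_devices)
    freqs = (counts + eps) / (k + eps * num_devices)
    return np.log(freqs)


def toy_runtime(placement, op_costs, num_devices):
    """Hypothetical stand-in for a measured runtime: the busiest device's load."""
    loads = np.bincount(placement, weights=op_costs, minlength=num_devices)
    return loads.max()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_ops, num_devices = 8, 2
    op_costs = np.ones(num_ops)
    logits = np.zeros((num_ops, num_devices))  # start from a uniform distribution
    for _ in range(10):
        samples = sample_placements(logits, num_samples=50, rng=rng)
        runtimes = np.array(
            [toy_runtime(p, op_costs, num_devices) for p in samples]
        )
        logits = elite_update(samples, runtimes, num_devices)
    print(softmax(logits).round(2))
```

In this sketch the only trainable parameters are the logits, so each sampled placement costs one runtime measurement; how many such measurements are needed is what the sample-efficiency claim is about.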