Combinatorial Optimization with Policy Adaptation using Latent Space Search

Neural Information Processing Systems 

Combinatorial Optimization (CO) has a wide range of real-world applications, from transportation (Contardo et al., 2012) and logistics (Laterre et al., 2018) to energy (Froger et al., 2016). Leading reinforcement learning (RL) methods for CO typically train a policy to construct a solution incrementally, one element at a time.
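To make the constructive setting concrete, the following minimal sketch builds a tour for a toy travelling salesman instance by querying a policy once per step. It is an illustration under stated assumptions, not the method of this paper: the function names (`construct_tour`, `nearest_neighbor_policy`, `tour_length`) and the four-city instance are hypothetical, and a hand-written nearest-neighbor rule stands in for a trained neural policy.

```python
import math

def construct_tour(coords, policy):
    """Build a tour one city at a time by repeatedly querying `policy`."""
    tour = [0]  # start from city 0 (arbitrary choice for this sketch)
    unvisited = set(range(1, len(coords)))
    while unvisited:
        nxt = policy(coords, tour, unvisited)  # choose the next element
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def nearest_neighbor_policy(coords, tour, unvisited):
    """Hypothetical stand-in for a trained policy: pick the closest unvisited city.

    Ties are broken by the smaller city index so the result is deterministic.
    """
    last = coords[tour[-1]]
    return min(unvisited, key=lambda j: (math.dist(last, coords[j]), j))

def tour_length(coords, tour):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % n]]) for i in range(n))

# Toy instance: the four corners of the unit square (assumed data).
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
tour = construct_tour(coords, nearest_neighbor_policy)
print(tour, tour_length(coords, tour))
```

In an RL approach, the hand-written rule would be replaced by a learned policy that outputs a distribution over the remaining elements at each step, with the negative tour length serving as the reward.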