Supplement: Hybrid Models for Learning to Branch

Neural Information Processing Systems 

In this section, we argue that the GNN architecture loses its advantages when multiple MILPs must be solved at the same time. In applications such as multi-objective optimization [4], where several MILPs are solved in parallel, a separate GNN must be instantiated on the GPU for each MILP because the MILPs are solved asynchronously. Not only does GPU memory limit the number of such GNNs that can fit on a single device, but hosting several GNNs on one GPU also leads to inefficient GPU utilization. One could, for instance, try to synchronize the MILPs so that a single batched forward evaluation on the GPU suffices, but, to our knowledge, this has not been done, and it would introduce frequent interruptions in the solving procedure. A much simpler alternative is to pack multiple GNNs onto a single GPU, with each GNN dedicated to solving one MILP.
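To make the memory argument concrete, the following toy sketch contrasts the two regimes. It is purely illustrative: a single dense layer stands in for the GNN forward pass, and all names and sizes are hypothetical, not taken from the paper's implementation.

```python
import numpy as np

# Toy stand-in for a GNN forward pass: one dense layer (hypothetical).
rng = np.random.default_rng(0)
n_milps, d_in, d_out = 4, 8, 3
W = rng.standard_normal((d_in, d_out))  # model parameters

# Regime A (criticized in the text): one model replica per MILP.
# Each replica holds its own parameter copy, so memory grows with n_milps.
replicas = [W.copy() for _ in range(n_milps)]
states = rng.standard_normal((n_milps, d_in))  # one branching state per MILP
out_replicated = np.stack([s @ w for s, w in zip(states, replicas)])
mem_replicated = sum(w.nbytes for w in replicas)

# Regime B (the batched scheme mentioned as untried): synchronize the MILPs
# and run a single forward evaluation over the stacked states, sharing one
# parameter copy. Same outputs, a fraction of the parameter memory.
out_batched = states @ W
mem_batched = W.nbytes

assert np.allclose(out_replicated, out_batched)
print(mem_replicated // mem_batched)  # parameter-memory ratio: n_milps
```

The outputs of the two regimes coincide; only the parameter memory differs, which is the inefficiency the replicated-GNN baseline incurs.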