Spatial Reasoning via Deep Vision Models for Robotic Sequential Manipulation