Scaling Matters in Deep Structured-Prediction Models