Are ResNets Provably Better than Linear Predictors?