Unlocking the Theory Behind Scaling 1-Bit Neural Networks