An analytic theory of generalization dynamics and transfer learning in deep linear networks