Gradient Descent Converges Linearly to Flatter Minima than Gradient Flow in Shallow Linear Networks