Gradient Descent on Two-layer Nets: Margin Maximization and Simplicity Bias