On the Sample Complexity of Two-Layer Networks: Lipschitz vs. Element-Wise Lipschitz Activation