An Information-Theoretic Perspective on Overfitting and Underfitting