Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models