Measuring and Augmenting Large Language Models for Solving Capture-the-Flag Challenges