VideoGameBench: Can Vision-Language Models complete popular video games?