Benchmarking Large Language Models for Molecule Prediction Tasks