Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding