VISTA: A Vision and Intent-Aware Social Attention Framework for Multi-Agent Trajectory Prediction