VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM