Spectrum-guided Multi-granularity Referring Video Object Segmentation