See What You Are Told: Visual Attention Sink in Large Multimodal Models