ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning