InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption