Less is More: Data-Efficient Adaptation for Controllable Text-to-Video Generation