Learning from Massive Human Videos for Universal Humanoid Pose Control