Scaling Rich Style-Prompted Text-to-Speech Datasets