Beyond URLs: Metadata Diversity and Position for Efficient LLM Pretraining