RePro: Training Language Models to Faithfully Recycle the Web for Pretraining