Data Proportion Detection for Optimized Data Management for Large Language Models