Improving AGI Evaluation: A Data Science Perspective