Dynamic Evaluation of Large Language Models by Meta Probing Agents