A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models