LLM-Pruner: On the Structural Pruning of Large Language Models Xinyin Ma Gongfan Fang Xinchao Wang National University of Singapore