Enhancing Subtask Performance of Multi-modal Large Language Model