Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models