Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models