Evaluating Open-Domain Question Answering in the Era of Large Language Models