Evaluating the Quality of Answers in Political Q&A Sessions with Large Language Models