Towards Coherent and Engaging Spoken Dialog Response Generation Using Automatic Conversation Evaluators