An Oral Exam for Measuring a Dialog System’s Capabilities