How to Evaluate Automatic Speech Recognition: Comparing Different Performance and Bias Measures