Is AI at Human Parity Yet? A Case Study on Speech Recognition
For ASR, this milestone was first claimed in a 2016 research paper by Microsoft (Xiong et al., 2016), which reported reaching human parity in word error rate (WER) on the Switchboard benchmark (5.8% WER) while also achieving 11% WER on the CallHome benchmark, which is known to be more challenging to transcribe. In addition, the reported decoding speed was only 1.38 times real time, which is within the range of usability for some commercial systems. The announcement was widely publicized, including in mainstream media outlets. A follow-up paper in 2017 claimed a further improvement to 5.1% WER on Switchboard but did not report decoding speed (Xiong et al., 2018). Also in 2017, Google announced a 4.9% WER (on an undisclosed benchmark) at its annual I/O developer conference.
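For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch, assuming the standard definitions: WER as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words, and real-time factor as decoding time divided by audio duration. The function names `word_error_rate` and `real_time_factor` are illustrative, and this is not the scoring tooling used in the cited papers; benchmark scoring typically also applies text normalization that is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """Decoding time relative to audio duration (1.0 means real time)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    # A hypothetical 100 s recording decoded in 138 s.
    print(real_time_factor(138.0, 100.0))
```

Under these definitions, a 5.8% WER corresponds to roughly 6 word errors per 100 reference words, and a real-time factor of 1.38 means that one minute of audio takes about 83 seconds to decode.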
Jan-31-2023, 15:29:48 GMT