AI Software Outperforms Humans on Reading Comprehension Test

Microsoft and Alibaba have independently developed AI models that scored higher than humans on a Stanford University reading comprehension test. The milestone was reached on the Stanford Question Answering Dataset (SQuAD), which consists of more than 100,000 question-and-answer pairs drawn from more than 500 Wikipedia articles. Alibaba's model achieved an exact-match score of 82.44, and the submission from Microsoft Research Asia edged past it with 82.65. The human baseline on the same measure is 82.304. Although the margin is slim, this is the first time any natural language processing (NLP) software has surpassed the human score on this particular benchmark.