What are the best systems? New perspectives on NLP Benchmarking

Feb-10-2022, 00:28:34 GMT–#artificialintelligence

In Machine Learning, a benchmark refers to a collection of datasets, each associated with one or more metrics, together with a way to aggregate the performance of different systems. Benchmarks are instrumental in (i) assessing the progress of new methods along different axes and (ii) selecting the best systems for practical use. This is particularly the case for NLP, where large pre-trained models (e.g. GPT, BERT) are expected to generalize well across a variety of tasks. While the community has mainly focused on developing new datasets and metrics, there has been little interest in the aggregation procedure, which is often reduced to a simple average over various performance measures. However, this procedure can be problematic when the metrics are on different scales, which may lead to spurious conclusions.
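To make the scale issue concrete, here is a minimal sketch of mean aggregation over two tasks. The system names, task names, and scores are all hypothetical and chosen only for illustration; the `scale` argument is an assumption that mimics one metric being reported in percent rather than as a fraction.

```python
# Hypothetical scores for two systems on two tasks (illustrative numbers only).
scores = {
    "system_A": {"task_1_accuracy": 0.90, "task_2_f1": 0.60},
    "system_B": {"task_1_accuracy": 0.70, "task_2_f1": 0.75},
}


def mean_score(task_scores, scale=None):
    """Average a system's scores, optionally rescaling some metrics first."""
    scale = scale or {}
    values = [value * scale.get(task, 1.0) for task, value in task_scores.items()]
    return sum(values) / len(values)


def best_by_mean(all_scores, scale=None):
    """Return the name of the system with the highest mean score."""
    return max(all_scores, key=lambda name: mean_score(all_scores[name], scale))


# Both metrics reported as fractions in [0, 1].
winner_same_scale = best_by_mean(scores)

# Same results, but task_2_f1 is now reported in percent (0 to 100).
winner_mixed_scale = best_by_mean(scores, scale={"task_2_f1": 100.0})

print(winner_same_scale, winner_mixed_scale)
```

In this toy setting the underlying results are identical in both calls, yet the selected system changes once one metric is expressed on a larger scale, because that metric then dominates the average.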




