Maximizing BERT model performance
While the tests above are only illustrative samples from the three broad categories, practitioners can use even a few qualitative tests like these to detect that a model is performing poorly, either during or after training. The two Microsoft pre-trained models are examples of this: they perform consistently poorly in all the tests. In addition to inaccurate predictions, a poorly performing model shows other signs of inadequate or improper pre-training, such as producing the same signature of noise-like terms for distinct input sentences. However, to determine whether a model is pre-trained for maximum performance, one would have to create a sufficient number of test cases across these categories and then score performance based on the top model predictions for the blanked positions.
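The scoring idea can be sketched as follows. This is a minimal illustration, not the procedure used for the tests above. The category labels, test sentences, accepted answers, the cutoff `k`, and the `min_fraction` threshold are all hypothetical, and `toy_predict` is a canned stand-in for a real masked-language-model call. The first function computes a per-category hit rate over the top-k predictions; the second flags terms that recur in the top-k for most inputs, which is one possible way to surface the noise signature described above.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Set, Tuple

# A predictor maps a sentence with one [MASK] to a ranked list of candidate terms.
Predictor = Callable[[str], List[str]]
# (category label, masked sentence, set of acceptable lowercase answers)
TestCase = Tuple[str, str, Set[str]]


def score_by_category(predict: Predictor, cases: Sequence[TestCase], k: int = 5) -> Dict[str, float]:
    """Fraction of cases per category with an accepted answer in the top-k predictions."""
    hits: Counter = Counter()
    totals: Counter = Counter()
    for category, sentence, accepted in cases:
        top_k = [term.lower() for term in predict(sentence)[:k]]
        totals[category] += 1
        if any(term in accepted for term in top_k):
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


def repeated_terms(predict: Predictor, sentences: Sequence[str], k: int = 5,
                   min_fraction: float = 0.8) -> Set[str]:
    """Terms that appear in the top-k for at least min_fraction of the input sentences."""
    counts: Counter = Counter()
    for sentence in sentences:
        counts.update({term.lower() for term in predict(sentence)[:k]})
    threshold = min_fraction * len(sentences)
    return {term for term, n in counts.items() if n >= threshold}


# Hypothetical stand-in for a model: canned ranked predictions per sentence.
def toy_predict(sentence: str) -> List[str]:
    canned = {
        "Paris is the capital of [MASK].": ["france", "europe", "italy"],
        "The [MASK] barked at the mailman.": ["the", "a", "of"],
        "He was diagnosed with [MASK] cancer.": ["the", "a", "lung"],
    }
    return canned.get(sentence, [])


# Hypothetical test cases; the category names are placeholders.
CASES: List[TestCase] = [
    ("category_a", "Paris is the capital of [MASK].", {"france"}),
    ("category_a", "The [MASK] barked at the mailman.", {"dog"}),
    ("category_b", "He was diagnosed with [MASK] cancer.", {"lung", "breast"}),
]

if __name__ == "__main__":
    print(score_by_category(toy_predict, CASES, k=3))
    print(repeated_terms(toy_predict, [s for _, s, _ in CASES], k=2, min_fraction=0.6))
```

To score a real model, `toy_predict` would be replaced by a function that queries the model for a masked position and returns its ranked terms, for example by wrapping a fill-mask pipeline. A low hit rate across every category, combined with a non-empty set of repeated terms, would match the pattern of a poorly pre-trained model described above.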
Nov-4-2020, 19:20:38 GMT