Collaborating Authors

 Aboagye, Prince


Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach

arXiv.org Artificial Intelligence

The emergence of pre-trained models has significantly impacted fields ranging from Natural Language Processing (NLP) and Computer Vision to relational datasets. Traditionally, these models are assessed through fine-tuned downstream tasks. However, this raises the question of how to evaluate these models more efficiently and effectively. In this study, we explore a novel approach in which we leverage the meta-features associated with each entity as a source of worldly knowledge and employ entity representations from the models. We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models. Our method's effectiveness is demonstrated across various domains, including models with relational datasets, large language models, and image models.

Pre-training of large models is becoming increasingly common in various machine learning applications, thanks to the growing amount of user-generated content. This is evident in areas such as NLP, with models like GPT (Generative Pretrained Transformer), and in the vision-language domain, with models like CLIP. Typically, the effectiveness of these models is evaluated using downstream tasks. However, this can be relatively costly if all tasks need to be performed.
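To make the notion of "consistency between representations and meta-features" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the procedure from the paper: it assumes a single categorical meta-feature, models each meta-feature class as a Gaussian with a shared diagonal variance in embedding space, and reports the mean log posterior probability of each entity's own class. The function name `posterior_consistency` and the parameter `var_floor` are hypothetical.

```python
import numpy as np

def posterior_consistency(embeddings, labels, var_floor=1e-6):
    """Mean log posterior of each entity's own meta-feature class.

    Assumption (not from the paper): one Gaussian per class with a
    shared diagonal variance. Scores are <= 0; values closer to 0 mean
    the embeddings agree more strongly with the meta-feature.
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    classes = np.unique(y)                 # sorted unique class values
    idx = np.searchsorted(classes, y)      # class index of every entity
    n = X.shape[0]

    # Per-class means and empirical class priors.
    means = np.stack([X[idx == k].mean(axis=0) for k in range(len(classes))])
    priors = np.bincount(idx, minlength=len(classes)) / n

    # Pooled within-class variance per dimension (diagonal covariance).
    resid = X - means[idx]
    var = (resid ** 2).mean(axis=0) + var_floor

    # Log joint up to a constant; the constant cancels in the posterior
    # because the variance is shared across classes.
    diff = X[:, None, :] - means[None, :, :]
    log_joint = -0.5 * np.sum(diff ** 2 / var, axis=2) + np.log(priors)

    # Numerically stable log-softmax over classes.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    log_post = log_joint - np.log(np.exp(log_joint).sum(axis=1, keepdims=True))
    return float(log_post[np.arange(n), idx].mean())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.repeat([0, 1], 100)
    # Hypothetical "model A": embeddings shift with the meta-feature.
    emb_a = rng.normal(size=(200, 4)) + 5.0 * labels[:, None]
    # Hypothetical "model B": embeddings ignore the meta-feature.
    emb_b = rng.normal(size=(200, 4))
    print("model A:", posterior_consistency(emb_a, labels))
    print("model B:", posterior_consistency(emb_b, labels))
```

Under these assumptions, a model whose embeddings separate entities along the meta-feature scores closer to zero than one whose embeddings ignore it, and computing the score requires no fine-tuning on a downstream task.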