Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach
Prince Aboagye, Yan Zheng, Junpeng Wang, Uday Singh Saini, Xin Dai, Michael Yeh, Yujie Fan, Zhongfang Zhuang, Shubham Jain, Liang Wang, Wei Zhang
The emergence of pre-trained models has significantly impacted Natural Language Processing (NLP), Computer Vision, and relational datasets. Traditionally, these models are assessed through fine-tuned downstream tasks, which raises the question of how to evaluate them more efficiently and effectively. In this study, we explore a novel approach in which we leverage the meta-features associated with each entity as a source of worldly knowledge and employ the entity representations produced by the models. We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models. Our method's effectiveness is demonstrated across various domains, including models trained on relational datasets, large language models, and image models.

Pre-training large models is becoming increasingly common across machine learning applications, thanks to the growing amount of user-generated content. This is evident in areas such as Natural Language Processing (NLP), with models like GPT (Generative Pre-trained Transformer), and in the vision-language domain, with models like CLIP. Typically, the effectiveness of these models is evaluated using downstream tasks; however, such evaluations can be relatively costly if every task needs to be performed.
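The abstract only sketches the core idea: score a pre-trained model by how consistently its entity representations agree with the entities' meta-features. The snippet below is a minimal illustration of that general notion using a clustering-based agreement score (normalized mutual information); it is not the paper's multi-head posterior method, and the function and variable names are hypothetical.

    # Illustrative sketch: rank models by how well their entity embeddings
    # align with a categorical meta-feature (assumption: meta-features are
    # categorical labels; the paper's actual estimator differs).
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import normalized_mutual_info_score

    def metafeature_consistency(embeddings: np.ndarray, meta_labels: np.ndarray) -> float:
        """Cluster entity embeddings and measure agreement with meta-feature labels.

        embeddings: (n_entities, d) entity representations from a pre-trained model.
        meta_labels: (n_entities,) categorical meta-feature values.
        Returns a score in [0, 1]; higher means the embeddings reflect the meta-feature better.
        """
        n_clusters = len(np.unique(meta_labels))
        cluster_ids = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embeddings)
        return normalized_mutual_info_score(meta_labels, cluster_ids)

    if __name__ == "__main__":
        # Synthetic comparison of two hypothetical models on the same entities.
        rng = np.random.default_rng(0)
        labels = rng.integers(0, 5, size=1000)                 # meta-feature categories
        model_a = rng.normal(size=(1000, 64)) + labels[:, None]  # embeddings aligned with labels
        model_b = rng.normal(size=(1000, 64))                    # embeddings ignoring labels
        print("model A:", round(metafeature_consistency(model_a, labels), 3))
        print("model B:", round(metafeature_consistency(model_b, labels), 3))

Under this toy setup, a model whose embedding space separates entities along the meta-feature scores near 1, while one that ignores it scores near 0, which is the kind of downstream-task-free comparison the paper argues for.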
arXiv.org Artificial Intelligence
Jan-15-2024