A Appendix

Neural Information Processing Systems 

However, one might argue that this analysis does not differentiate sufficiently between tasks. To address this concern, we expanded our evaluation to the entire MMLU benchmark, which enabled an assessment of task similarity comparable to our earlier experiments.