Insights Into the Inner Workings of Transformer Models for Protein Function Prediction