Exploring Large Protein Language Models in Constrained Evaluation Scenarios within the FLIP Benchmark