A Theory for Compressibility of Graph Transformers for Transductive Learning