Piano Transcription by Hierarchical Language Modeling with Pretrained Roll-based Encoders