Frozen Transformers in Language Models Are Effective Visual Encoder Layers