Large Language Models Implicitly Learn to See and Hear Just By Reading