Understanding Shared Speech-Text Representations