Introducing Semantics into Speech Encoders