Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment