On Limitation of Transformer for Learning HMMs