Multi-Head State Space Model for Speech Recognition