Temporal Information Reconstruction and Non-Aligned Residual in Spiking Neural Networks for Speech Classification