Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding