Time and Tokens: Benchmarking End-to-End Speech Dysfluency Detection