Helix Parallelism: Rethinking Sharding Strategies for Interactive Multi-Million-Token LLM Decoding