MoE-Gen: High-Throughput MoE Inference on a Single GPU with Module-Based Batching