Discrete Compositional Generation via General Soft Operators and Robust Reinforcement Learning