Neural Combinatorial Optimization with Heavy Decoder: Toward Large Scale Generalization