A Reinforcement-Learning-Based Multiple-Column Selection Strategy for Column Generation