A Dataless Reinforcement Learning Approach to Rounding Hyperplane Optimization for Max-Cut