Optimizing Language Model's Reasoning Abilities with Weak Supervision