Reward and Guidance through Rubrics: Promoting Exploration to Improve Multi-Domain Reasoning