Mars-PO: Multi-Agent Reasoning System Preference Optimization
