Training Language Models to Critique With Multi-agent Feedback