Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation