Review for NeurIPS paper: Gradient Surgery for Multi-Task Learning

Jan-23-2025, 17:58:55 GMT–Neural Information Processing Systems

Although the toy example in Figure 1 is intuitive and a proof of convergence is provided for the convex case, the paper misses an important part: how severe is the conflicting-gradient problem in practical tasks? I suggest that the authors plot the cosine similarity between the gradients of the different losses. I also went through the appendix but failed to find such plots. This is my main concern. Though I understand it's a workaround for practical use, it makes the generalization of Theorems 1 and 2 to the case n > 2 (where n is the number of task loss functions) non-trivial.
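As an illustration of the diagnostic requested above, here is a minimal sketch of how pairwise cosine similarities between per-task gradients could be computed at a single training step. It is not taken from the paper: the function name `pairwise_gradient_cosine`, the flattened-gradient input layout, and the example values are all assumptions made for illustration.

```python
import numpy as np


def pairwise_gradient_cosine(grads):
    """Return an (n, n) matrix of cosine similarities between task gradients.

    grads: array-like of shape (n_tasks, n_params), holding one flattened
    gradient per task loss (hypothetical layout, assumed for this sketch).
    """
    g = np.asarray(grads, dtype=float)
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    unit = g / np.maximum(norms, 1e-12)  # guard against all-zero gradients
    return unit @ unit.T


# Made-up gradients for three task losses; not data from the paper.
example_grads = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, 0.0],
]
cos = pairwise_gradient_cosine(example_grads)

# Fraction of distinct task pairs whose gradients conflict (cosine < 0).
upper = np.triu_indices(len(example_grads), k=1)
conflict_fraction = float(np.mean(cos[upper] < 0))
```

Logging `cos` (or `conflict_fraction`) over the course of training on a real benchmark would give the kind of plot asked for here, showing how often and how strongly task gradients point in opposing directions.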


