AITopics | cagrad

The goal of multi-task learning is to enable more efficient learning than single task learning by sharing model structures for a diverse set of tasks.

artificial intelligence, arxivpreprintarxiv, machine learning, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


On the Convergence of Stochastic Multi-Objective Gradient Manipulation and Beyond

Zhou, Shiji

Neural Information Processing Systems | Aug-19-2025, 21:02:51 GMT

Severe gradient conflicts will lead to significantly degraded model performance.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


9d27fdf2477ffbff837d73ef7ae23db9-Supplemental.pdf

Neural Information Processing Systems | Aug-16-2025, 09:40:21 GMT

artificial intelligence, cagrad, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


9d27fdf2477ffbff837d73ef7ae23db9-Paper.pdf

Neural Information Processing Systems | Aug-16-2025, 09:40:17 GMT

artificial intelligence, machine learning, optimization problem, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)


Injecting Imbalance Sensitivity for Multi-Task Learning

Zhou, Zhipeng, Liu, Liu, Zhao, Peilin, Gong, Wei

arXiv.org Artificial Intelligence | Mar-10-2025

Multi-task learning (MTL) has emerged as a promising approach for deploying deep learning models in real-life applications. Recent studies have proposed optimization-based learning paradigms to establish task-shared representations in MTL. However, our paper empirically argues that these studies, specifically gradient-based ones, primarily emphasize the conflict issue while neglecting the potentially more significant impact of imbalance/dominance in MTL. In line with this perspective, we enhance the existing baseline method by injecting imbalance-sensitivity through the imposition of constraints on the projected norms. To demonstrate the effectiveness of our proposed IMbalance-sensitive Gradient (IMGrad) descent method, we evaluate it on multiple mainstream MTL benchmarks, encompassing supervised learning tasks as well as reinforcement learning. The experimental results consistently demonstrate competitive performance.

imbalance issue, learning, objective, (15 more...)

arXiv.org Artificial Intelligence

2503.08006

Country:

Europe > Italy > Tuscany > Florence (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)


Fantastic Multi-Task Gradient Updates and How to Find Them In a Cone

Hassanpour, Negar, Janjua, Muhammad Kamran, Zhang, Kunlin, Lavasani, Sepehr, Zhang, Xiaowen, Zhou, Chunhua, Gao, Chao

arXiv.org Artificial Intelligence | Jan-31-2025

Balancing competing objectives remains a fundamental challenge in multi-task learning (MTL), primarily due to conflicting gradients across individual tasks. A common solution relies on computing a dynamic gradient update vector that balances competing tasks as optimization progresses. Building on this idea, we propose ConicGrad, a principled, scalable, and robust MTL approach formulated as a constrained optimization problem. Our method introduces an angular constraint to dynamically regulate gradient update directions, confining them within a cone centered on the reference gradient of the overall objective. By balancing task-specific gradients without over-constraining their direction or magnitude, ConicGrad effectively resolves inter-task gradient conflicts. Moreover, our framework ensures computational efficiency and scalability to high-dimensional parameter spaces. We conduct extensive experiments on standard supervised learning and reinforcement learning MTL benchmarks, and demonstrate that ConicGrad achieves state-of-the-art performance across diverse tasks.
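To make the angular constraint concrete, here is a minimal sketch of two checks the abstract alludes to: whether two task gradients conflict (negative cosine similarity), and whether a candidate update lies inside a cone around a reference gradient. The helper names `cosine` and `in_cone`, the threshold `c`, and the choice of the average gradient as the reference are illustrative assumptions; this is not the ConicGrad solver itself.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Cosine of the angle between u and v (both must be non-zero)."""
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    if nu == 0.0 or nv == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot(u, v) / (nu * nv)

def in_cone(d, g0, c):
    """True if update d stays within the cone around g0 with cos(angle) >= c.

    c is a hypothetical threshold in [-1, 1]; larger c means a narrower cone.
    """
    return cosine(d, g0) >= c

# Two toy task gradients that point in partly opposite directions.
g1 = [1.0, 0.0]
g2 = [-0.5, 1.0]

# Assumed reference direction: gradient of the average loss.
g0 = [(a + b) / 2 for a, b in zip(g1, g2)]

# A negative cosine is the usual signal of a gradient conflict.
conflict = cosine(g1, g2) < 0

# The reference itself always lies in its own cone; g1 alone does not
# satisfy a narrow cone (c = 0.9) in this toy example.
ref_ok = in_cone(g0, g0, 0.9)
g1_ok = in_cone(g1, g0, 0.9)
```

A constrained method of this kind would search for an update that improves the individual tasks while keeping `in_cone(d, g0, c)` true; the sketch only evaluates the constraint, it does not solve for `d`.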

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Artificial Intelligence

2502.00217

Country:

North America > Canada (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.93)


Direction-oriented Multi-objective Learning: Simple and Provable Stochastic Algorithms

Xiao, Peiyao, Ban, Hao, Ji, Kaiyi

arXiv.org Machine Learning | Nov-28-2023

Multi-objective optimization (MOO) has become an influential framework in many machine learning problems with multiple objectives, such as learning with multiple criteria and multi-task learning (MTL). In this paper, we propose a new direction-oriented multi-objective problem by regularizing the common descent direction within a neighborhood of a direction that optimizes a linear combination of objectives, such as the average loss in MTL. This formulation includes GD and MGDA as special cases, enjoys the direction-oriented benefit as in CAGrad, and facilitates the design of stochastic algorithms. To solve this problem, we propose Stochastic Direction-oriented Multi-objective Gradient descent (SDMGrad) with simple SGD-type updates, and its variant SDMGrad-OS with efficient objective sampling for the setting where the number of objectives is large. For a constant-level regularization parameter $\lambda$, we show that SDMGrad and SDMGrad-OS provably converge to a Pareto stationary point with improved complexities and milder assumptions. For an increasing $\lambda$, this convergent point reduces to a stationary point of the linear combination of objectives. We demonstrate the superior performance of the proposed methods in a series of tasks on multi-task supervised learning and reinforcement learning. Code is provided at https://github.com/ml-opt-lab/sdmgrad.
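The claim that the formulation includes GD and MGDA as special cases can be illustrated with a deterministic two-task sketch. It assumes, as one reading of the abstract rather than something the abstract states, that the direction comes from minimizing $\|w g_1 + (1-w) g_2 + \lambda g_0\|^2$ over $w \in [0, 1]$ and then rescaling by $1/(1+\lambda)$. The function name `direction_oriented` is hypothetical, and this is not the stochastic SDMGrad algorithm.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def direction_oriented(g1, g2, g0, lam):
    """Two-task sketch of a direction regularized toward g0.

    Assumed objective: min over w in [0, 1] of
        || w*g1 + (1-w)*g2 + lam*g0 ||^2,
    solved in closed form, then rescaled by 1 / (1 + lam).
    """
    a = [y + lam * z for y, z in zip(g2, g0)]   # constant part
    b = [x - y for x, y in zip(g1, g2)]         # part multiplied by w
    bb = dot(b, b)
    # Unconstrained minimizer of ||a + w*b||^2, clipped to [0, 1].
    w = 0.5 if bb == 0.0 else min(1.0, max(0.0, -dot(a, b) / bb))
    return [
        (w * x + (1.0 - w) * y + lam * z) / (1.0 + lam)
        for x, y, z in zip(g1, g2, g0)
    ]

g1 = [1.0, 0.0]
g2 = [0.0, 2.0]
g0 = [(x + y) / 2 for x, y in zip(g1, g2)]  # average-loss gradient

# lam = 0: the min-norm point in the convex hull of g1 and g2 (MGDA-like).
d_mgda = direction_oriented(g1, g2, g0, lam=0.0)

# Very large lam: the direction approaches g0 (plain GD on the average loss).
d_gd = direction_oriented(g1, g2, g0, lam=1e6)
```

With these toy gradients, `lam=0.0` yields a direction orthogonal to `g1 - g2` (the two-task min-norm condition), while a large `lam` pulls the direction toward `g0`.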

artificial intelligence, machine learning, objective, (16 more...)

arXiv.org Machine Learning

2305.18409

Genre: Research Report (0.81)

Industry: Education (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.35)


AdaTask: A Task-aware Adaptive Learning Rate Approach to Multi-task Learning

Yang, Enneng, Pan, Junwei, Wang, Ximei, Yu, Haibin, Shen, Li, Chen, Xihua, Xiao, Lei, Jiang, Jie, Guo, Guibing

arXiv.org Artificial Intelligence | May-18-2023

Multi-task learning (MTL) models have demonstrated impressive results in computer vision, natural language processing, and recommender systems. Even though many approaches have been proposed, how well these approaches balance different tasks on each parameter still remains unclear. In this paper, we propose to measure the task dominance degree of a parameter by the total updates of each task on this parameter. Specifically, we compute the total updates by the exponentially decaying Average of the squared Updates (AU) on a parameter from the corresponding task. Based on this novel metric, we observe that many parameters in existing MTL methods, especially those in the higher shared layers, are still dominated by one or several tasks. The dominance of AU is mainly due to the dominance of accumulative gradients from one or several tasks. Motivated by this, we propose a Task-wise Adaptive learning rate approach, AdaTask in short, to separate the accumulative gradients and hence the learning rate of each task for each parameter in adaptive learning rate approaches (e.g., AdaGrad, RMSProp, and Adam). Comprehensive experiments on computer vision and recommender system MTL datasets demonstrate that AdaTask significantly improves the performance of dominated tasks, resulting in SOTA average task-wise performance. Analysis on both synthetic and real-world datasets shows that AdaTask balances parameters in every shared layer well.
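The idea of separating accumulative gradients per task can be sketched on a single shared scalar parameter with an RMSProp-style rule. This is an illustrative reading of the abstract, not the authors' implementation: the function names, hyperparameter values, and the choice of RMSProp as the base optimizer are assumptions.

```python
import math

def rmsprop_step_shared(grads, v, lr=0.01, beta=0.9, eps=1e-8):
    """Standard RMSProp: one accumulator v fed by the summed task gradient.

    Returns each task's share of the update and the new accumulator.
    """
    g = sum(grads)
    v = beta * v + (1.0 - beta) * g * g
    scale = lr / (math.sqrt(v) + eps)
    return [scale * gk for gk in grads], v

def rmsprop_step_taskwise(grads, vs, lr=0.01, beta=0.9, eps=1e-8):
    """Task-wise variant: one accumulator per task, so each task's
    gradient is normalized by its own running scale."""
    new_vs = [beta * vk + (1.0 - beta) * gk * gk for gk, vk in zip(grads, vs)]
    updates = [lr * gk / (math.sqrt(vk) + eps) for gk, vk in zip(grads, new_vs)]
    return updates, new_vs

# Toy gradients on one shared parameter: task 0 is 100x larger than task 1.
grads = [10.0, 0.1]

shared_updates, _ = rmsprop_step_shared(grads, 0.0)
task_updates, _ = rmsprop_step_taskwise(grads, [0.0, 0.0])

# With a shared accumulator the 100x imbalance carries over to the update;
# with per-task accumulators both tasks move the parameter by a similar amount.
shared_ratio = shared_updates[0] / shared_updates[1]
task_ratio = task_updates[0] / task_updates[1]
```

In this toy setting the shared accumulator preserves the dominance of the large-gradient task, while the task-wise accumulators roughly equalize the two contributions, which is the balancing effect the abstract describes.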

adatask, artificial intelligence, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2211.15055

Country: Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (0.47)

Technology: