
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

Neural Information Processing Systems

Multi-task learning is an open and challenging problem in computer vision. The typical way of conducting multi-task learning with deep neural networks is either through handcrafted schemes that share all initial layers and branch out at an ad hoc point, or through separate task-specific networks with an additional feature sharing/fusion mechanism. Unlike existing methods, we propose an adaptive sharing approach, called AdaShare, that decides what to share across which tasks to achieve the best recognition accuracy, while taking resource efficiency into account. Specifically, our main idea is to learn the sharing pattern through a task-specific policy that selectively chooses which layers to execute for a given task in the multi-task network.
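The core idea above, a per-task policy that decides for each layer whether to execute it or skip it, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the names (`SelectSkipPolicy`, `forward`) and the Gumbel-Softmax relaxation shown here are illustrative assumptions about how such a select-or-skip policy is commonly realized.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax_select(logits, tau=1.0):
    """Sample a soft [skip, execute] decision for one layer.

    logits: shape (2,), unnormalized scores for [skip, execute].
    Returns a probability vector; argmax gives the hard decision.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / tau
    e = np.exp(y - y.max())
    return e / e.sum()

class SelectSkipPolicy:
    """One learnable [skip, execute] logit pair per (task, layer).

    Hypothetical container: in practice these logits would be trained
    jointly with the network weights.
    """
    def __init__(self, num_tasks, num_layers):
        self.logits = np.zeros((num_tasks, num_layers, 2))

    def sharing_pattern(self):
        # Hard decisions: 1 = execute the layer for that task, 0 = skip.
        return self.logits.argmax(axis=-1)

def forward(x, layers, policy, task):
    """Run only the layers the policy selects for `task`.

    A skipped residual block reduces to the identity, so different
    tasks execute different subsets of the same shared backbone.
    """
    keep = policy.sharing_pattern()[task]
    for layer, k in zip(layers, keep):
        if k:
            x = x + layer(x)  # residual block executed for this task
        # else: skip connection only, block not executed
    return x

# Toy usage: 2 tasks, 3 shared "residual blocks" (plain functions here).
layers = [lambda v: 0.1 * v, lambda v: 0.2 * v, lambda v: 0.3 * v]
policy = SelectSkipPolicy(num_tasks=2, num_layers=3)
out = forward(np.ones(4), layers, policy, task=0)
```

With all-zero logits, `argmax` resolves every decision to "skip", so the toy forward pass is the identity; training would push the logits of useful layers toward "execute" on a per-task basis.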


AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (Supplementary Material) Ximeng Sun 1 Rameswar Panda

Neural Information Processing Systems

Tiny Taskonomy consists of 381,840 indoor images from 35 buildings, with annotations available for 26 tasks. Our training is separated into two phases: the Policy Learning Phase and the Re-training Phase. In both phases, we use early stopping to obtain the best performance during training. We use the same parameter set for our model and the baselines. For Cross-Stitch and Sluice, we insert the linear feature fusion layers after each residual block.
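The linear feature fusion inserted for the Cross-Stitch baseline can be sketched as follows. This is a minimal NumPy illustration of a cross-stitch unit, a learnable linear mixing of per-task features, not the paper's actual code; the function name and shapes are assumptions.

```python
import numpy as np

def cross_stitch(features, alpha):
    """Linear feature fusion across tasks (a cross-stitch unit).

    features: (num_tasks, d) -- one feature vector per task at this layer.
    alpha:    (num_tasks, num_tasks) -- learnable mixing weights.
    Each task's fused feature is a weighted sum of all tasks' features.
    """
    return alpha @ features

# Two tasks, 2-d features for illustration.
feats = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

# Identity alpha leaves each task's features untouched (no sharing).
unmixed = cross_stitch(feats, np.eye(2))

# Off-diagonal weights mix in the other task's features.
alpha = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
mixed = cross_stitch(feats, alpha)
```

The mixing matrix `alpha` is trained along with the network, so the model can interpolate between fully separate branches (identity `alpha`) and heavy sharing (large off-diagonal weights) at each fusion point.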




Review for NeurIPS paper: AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

Neural Information Processing Systems

Weaknesses: Despite its effectiveness, there are certain aspects that I am concerned about, as follows: (1) The idea of layer dropping is not new. It has been explored for regularization [1] as well as structured pruning [2, 3]. In addition, methods for routing subnetworks and learning task-specific parameters for each task [4] have also been studied before. It would be more comprehensive to additionally test on a dataset with a larger number of tasks. First, I think that the method's main advantage is not memory efficiency, since it is at least as large as a standard MTL network (denoted as Multi-Task in the paper).


Review for NeurIPS paper: AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

Neural Information Processing Systems

A good paper that presents a new multi-task learning method. The reviewers agree that the paper is well written and that the results support its main claim. The reviewers have some concerns regarding the applicability of the proposed method to other domains. It would be good if the authors could address this in the revised version of the paper.


AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

Neural Information Processing Systems

Multi-task learning is an open and challenging problem in computer vision. The typical way of conducting multi-task learning with deep neural networks is either through handcrafted schemes that share all initial layers and branch out at an ad hoc point, or through separate task-specific networks with an additional feature sharing/fusion mechanism. Unlike existing methods, we propose an adaptive sharing approach, called AdaShare, that decides what to share across which tasks to achieve the best recognition accuracy, while taking resource efficiency into account. Specifically, our main idea is to learn the sharing pattern through a task-specific policy that selectively chooses which layers to execute for a given task in the multi-task network. Experiments on several challenging and diverse benchmark datasets with a variable number of tasks clearly demonstrate the efficacy of our approach over state-of-the-art methods.