Reviews: Modular Universal Reparameterization: Deep Multi-task Learning Across Diverse Domains
–Neural Information Processing Systems
I have consequently increased my score. The paper proposes to decompose a model's parameters into L distinct parameter blocks. Each of these blocks is seen as solving a "pseudo-task": learning a linear map from inputs to outputs. The parameters of these blocks are generated by K hypermodules (small hypernetworks), each conditioned on a context vector specific to the pseudo-task. The alignment of hypermodules to pseudo-tasks is governed by a softmax function and is learned during training, similar to a mixture-of-experts model.
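To make the summary concrete, here is a minimal sketch of how I read the block-generation step. It is not the authors' implementation. The shapes, the names (`hypermodule_apply`, `generate_block`, `CONTEXT_DIM` and so on), the random initialization, and the choice to mix hypermodule outputs softly with softmax weights are all my own assumptions for illustration; the paper may well select a single hypermodule per block instead.

```python
import math
import random

random.seed(0)

# Hypothetical sizes chosen only for illustration.
L = 4            # number of parameter blocks (pseudo-tasks)
K = 3            # number of hypermodules
CONTEXT_DIM = 2  # length of each pseudo-task's context vector
ROWS, COLS = 2, 3  # each block is a ROWS x COLS linear map


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hypermodule_apply(H, z):
    """Generate a ROWS x COLS weight matrix from context vector z.

    H is a list of CONTEXT_DIM matrices; the output is sum_c z[c] * H[c],
    i.e. a hypermodule that is linear in the context (an assumption).
    """
    rows, cols = len(H[0]), len(H[0][0])
    return [[sum(z[c] * H[c][i][j] for c in range(len(z)))
             for j in range(cols)] for i in range(rows)]


def generate_block(hypermodules, z, logits):
    """Mix the K candidate weight matrices with softmax alignment weights."""
    weights = softmax(logits)
    candidates = [hypermodule_apply(H, z) for H in hypermodules]
    rows, cols = len(candidates[0]), len(candidates[0][0])
    return [[sum(w * cand[i][j] for w, cand in zip(weights, candidates))
             for j in range(cols)] for i in range(rows)]


def linear_map(W, x):
    """Apply the generated block to an input vector: the pseudo-task itself."""
    return [sum(W[i][j] * x[j] for j in range(len(x))) for i in range(len(W))]


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# In training, all three of these would be learned; here they are random.
hypermodules = [[rand_matrix(ROWS, COLS) for _ in range(CONTEXT_DIM)]
                for _ in range(K)]
contexts = [[random.gauss(0.0, 1.0) for _ in range(CONTEXT_DIM)]
            for _ in range(L)]
alignment_logits = [[random.gauss(0.0, 1.0) for _ in range(K)]
                    for _ in range(L)]

# One generated weight matrix per pseudo-task.
blocks = [generate_block(hypermodules, contexts[l], alignment_logits[l])
          for l in range(L)]

# Each block maps a COLS-dimensional input to a ROWS-dimensional output.
outputs = [linear_map(W, [1.0] * COLS) for W in blocks]
```

Under this reading, the hypermodules, context vectors, and alignment logits are the only trainable quantities, and the blocks themselves are derived from them. When one alignment logit dominates, the softmax weights become nearly one-hot and the block reduces to the output of a single hypermodule.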
Jan-25-2025, 07:12:01 GMT