 Transfer Learning


Reviews: Catastrophic Forgetting Meets Negative Transfer: Batch Spectral Shrinkage for Safe Transfer Learning

Neural Information Processing Systems

The reviews and the rebuttal have generated interesting discussions about the aspects of transfer learning and domain adaptation addressed in this paper. Although there is no clear consensus (one reviewer is oscillating between a weak reject and a weak accept), I found both the paper and the comments of the skeptical reviewer (Reviewer 3) relevant. I therefore believe this contribution is worth presenting at the conference, since it can inspire significant further developments.


Transfer Learning via Minimizing the Performance Gap Between Domains

Neural Information Processing Systems

We propose a new principle for transfer learning, based on a straightforward intuition: if two domains are similar to each other, the model trained on one domain should also perform well on the other domain, and vice versa. To formalize this intuition, we define the performance gap as a measure of the discrepancy between the source and target domains. We derive generalization bounds for the instance weighting approach to transfer learning, showing that the performance gap can be viewed as an algorithm-dependent regularizer, which controls the model complexity. Our theoretical analysis provides new insight into transfer learning and motivates a set of general, principled rules for designing new instance weighting schemes for transfer learning. These rules lead to gapBoost, a novel and principled boosting approach for transfer learning.
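The intuition in this abstract can be illustrated with a small numerical sketch. The abstract does not give the paper's formal definition of the performance gap, so the code below is only one naive reading of the stated intuition: train a model on each domain, evaluate it on the other, and measure how much worse it does than the model trained in-domain. The function names (`fit_ridge`, `mse`, `cross_domain_gap`), the choice of ridge regression, and the synthetic data are all assumptions made for illustration; this is not gapBoost.

```python
import numpy as np


def fit_ridge(X, y, weights=None, lam=1e-3):
    """Weighted ridge regression (hypothetical helper).

    `weights` are per-instance weights, the kind of quantity an
    instance weighting scheme would adjust; uniform if omitted.
    """
    n, d = X.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    Xw = X * w[:, None]
    A = X.T @ Xw + lam * np.eye(d)
    b = Xw.T @ y
    return np.linalg.solve(A, b)


def mse(theta, X, y):
    """Mean squared error of the linear model `theta` on (X, y)."""
    return float(np.mean((X @ theta - y) ** 2))


def cross_domain_gap(Xs, ys, Xt, yt):
    """Naive symmetric gap (an assumption, not the paper's definition):
    excess loss of each domain's model when evaluated on the other domain.
    """
    theta_s = fit_ridge(Xs, ys)
    theta_t = fit_ridge(Xt, yt)
    gap_s_to_t = mse(theta_s, Xt, yt) - mse(theta_t, Xt, yt)
    gap_t_to_s = mse(theta_t, Xs, ys) - mse(theta_s, Xs, ys)
    return 0.5 * (gap_s_to_t + gap_t_to_s)


# Synthetic data: one source domain, one similar target, one dissimilar target.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])


def sample_domain(theta, n=200, noise=0.1):
    X = rng.normal(size=(n, theta.size))
    y = X @ theta + noise * rng.normal(size=n)
    return X, y


Xs, ys = sample_domain(theta_true)
X_near, y_near = sample_domain(theta_true)        # same labeling function
X_far, y_far = sample_domain(theta_true + 3.0)    # shifted labeling function

gap_near = cross_domain_gap(Xs, ys, X_near, y_near)
gap_far = cross_domain_gap(Xs, ys, X_far, y_far)
print(f"gap to similar domain:    {gap_near:.4f}")
print(f"gap to dissimilar domain: {gap_far:.4f}")
```

In this toy setup the similar domain yields a much smaller gap than the dissimilar one, which matches the intuition that a model trained on one domain should perform well on a similar domain. The `weights` argument of `fit_ridge` marks where an instance weighting scheme would plug in.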


Reviews: On the Value of Target Data in Transfer Learning

Neural Information Processing Systems

Why do we care about transfer learning in the first place? Consider adding a short explanation of what transfer learning is. You could also refer to a survey paper for interested readers.


Reviews: On the Value of Target Data in Transfer Learning

Neural Information Processing Systems

This is a solid paper on the generalization theory of transfer learning. To improve readability and bring out the main messages of the paper, the authors should take time to better discuss the technical concepts they use.


Review for NeurIPS paper: Hierarchical Granularity Transfer Learning

Neural Information Processing Systems

Summary and Contributions: The paper proposes a new task named Hierarchical Granularity Transfer Learning (HGTL) and a new network architecture called Bi-granularity Semantic Preserving Network (BigSPN). HGTL has only basic category labels and semantic descriptions for hierarchical categories. The goal is to recognize sub-category levels without annotations for sub-category levels. In this paper, 2 levels (basic, subordinate) are considered. Semantic descriptions are typically attributes, keywords or text descriptions.


Review for NeurIPS paper: Hierarchical Granularity Transfer Learning

Neural Information Processing Systems

R1 and R3 comment that the paper lacks mathematical grounding and novelty. However, R2 and R4 both think that the paper proposes an interesting and useful task and could be adopted by vision researchers. I think the paper should be accepted.


Review for NeurIPS paper: Learning to Learn Variational Semantic Memory

Neural Information Processing Systems

Correctness: As mentioned above, I am a bit skeptical about the technical correctness of the variational inference framework. Specifically, I think the latent z in Eq. (2) does not properly represent the class prototypes, as z is conditioned on each individual x, not an entire class set (on the other hand, Figure 1 shows the latent z conditioned on each of the class sets, and I'm confused about which one is right). I don't understand how the approximate posterior q(z|S) can depend on S, because according to the generative process defined by Eq. (2), the true posterior p(z|x,y) does not depend on the entire class set S, only on each individual point (x,y). If it is not included, then the inference of m should be based on semi-implicit variational inference (SIVI) [2,3], as the intermediate stochastic variable m appears only in the approximate posterior. However, this is not discussed in the paper, and the ELBO expression in Eq. (13) does not seem to represent the SIVI procedure either.


Review for NeurIPS paper: On the Theory of Transfer Learning: The Importance of Task Diversity

Neural Information Processing Systems

Maybe the author can discuss what happens with moderate model misspecification. The theory does not explain why transfer learning works when training tasks are not diverse. In all three examples, the 'classifier head' hypothesis class F is linear. I wonder what task-diversity constants (Definition 3) can be derived for a more complex family F, such as a multi-layer neural network. How about logistic loss, or classification? 5. Question: Can more refined bounds than [1] be applied to deep neural networks?


Review for NeurIPS paper: On the Theory of Transfer Learning: The Importance of Task Diversity

Neural Information Processing Systems

The reviewers reached a consensus that the paper can be accepted to NeurIPS. One additional comment from the meta-reviewer is that because the paper doesn't have any experimental component, it's unclear whether the message in the title, "The Importance of Task Diversity", is sufficiently justified. It's true that the theory needs the assumption of diverse tasks, but it's unclear why that is the most important one among the many assumptions.