Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts
–Neural Information Processing Systems
In this paper, we tackle the problem of domain shift. Most existing methods perform training on multiple source domains using a single model, and the same trained model is used on all unseen target domains. Such solutions are sub-optimal as each target domain exhibits its own specialty, which is not adapted. Furthermore, expecting single-model training to learn extensive knowledge from multiple source domains is counterintuitive. The model is more biased toward learning only domain-invariant features and may result in negative knowledge transfer.
Neural Information Processing Systems
Dec-24-2025, 18:12:10 GMT
- Technology: