Meta-DMoE: Adapting to Domain Shift by Meta-Distillation from Mixture-of-Experts

Dec-24-2025, 18:12:10 GMT–Neural Information Processing Systems

In this paper, we tackle the problem of domain shift. Most existing methods train a single model on multiple source domains and then apply the same trained model to all unseen target domains. Such solutions are sub-optimal because each target domain has its own characteristics, to which the model is never adapted. Furthermore, expecting a single model to learn extensive knowledge from multiple source domains is counterintuitive: such a model is biased toward learning only domain-invariant features, which may result in negative knowledge transfer.

meta-distillation, meta-dmoe, mixture-of-expert

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (0.83)