On Linear Mode Connectivity of Mixture-of-Experts Architectures
