MiMuon: Mixed Muon Optimizer with Improved Generalization for Large Models
