Muon in Associative Memory Learning: Training Dynamics and Scaling Laws
