MomentUm Orthogonalized by Newton-Schulz
Train faster and better with fewer GPUs
An introduction to the Muon optimizer: its concept, design, and algorithm
This is my introduction to the Muon (MomentUm Orthogonalized by Newton-Schulz) optimizer. You can find my slides and the recorded video below.