SGD vs GD: Rank Deficiency in Linear Networks
