Parallel SGD: When does averaging help?
