A Relationship with Evolution Strategies (ES)
In the main paper, we restrict the gradient to the random base
–Neural Information Processing Systems
Formally, this constraint also applies to special cases of Natural Evolution Strategies [37, 3]. Similar estimators can be obtained for other symmetric distributions with finite second moment. Moreover, the additional hyperparameter σ that determines the magnitude of the perturbation needs to be carefully chosen [33].

Figure B.7: Validation accuracy after 100 epochs and mean gradient correlation with SGD plotted against increasing subspace dimensionality d on the CIFAR-10 CNN (average of three runs). As expected, the mean cosine similarity across 100 pairs of random vectors decreases with growing dimensionality.
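
The caption's remark about random vectors can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the vectors are drawn i.i.d. from a standard normal distribution and that "mean cosine similarity" refers to the mean absolute value over the pairs (the signed mean is close to zero at any dimensionality). The function names `cosine_similarity` and `mean_abs_cosine`, the seed, and the dimensionalities tried are hypothetical choices made for illustration.

```python
import math
import random


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_abs_cosine(d, n_pairs=100, seed=0):
    """Mean |cosine similarity| over n_pairs pairs of d-dimensional vectors.

    Assumption: entries are i.i.d. standard normal; the source does not
    state which distribution was used for this measurement.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        total += abs(cosine_similarity(u, v))
    return total / n_pairs


if __name__ == "__main__":
    # Illustrative dimensionalities, not the ones used in Figure B.7.
    for d in (10, 100, 1000):
        print(d, mean_abs_cosine(d))
```

Under these assumptions the printed value should shrink as d grows, roughly in proportion to 1/sqrt(d), which matches the qualitative trend the caption describes.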
Feb-9-2026, 07:56:16 GMT