Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function
–Neural Information Processing Systems
Consequently, the iteration of SGD, unlike GD, is not deterministic even when it is started at a fixed initial condition.
Neural Information Processing Systems
Oct-2-2025, 08:27:06 GMT
- Country:
- Asia
- China (0.04)
- Middle East
- Russia (0.04)
- Europe
- Russia > Central Federal District
- Moscow Oblast > Moscow (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Russia > Central Federal District
- North America > Canada (0.04)
- Asia
- Technology: