Stochastic Gradient Descent Revisited
The advent of artificial intelligence (AI) has been made possible by the spectacular growth in computing chip capacity over the last few decades, and has driven a technological revolution that has left no aspect of life untouched, including healthcare, supply chain management and social media. AI describes a set of machine learning methods that abandon any form of structural representation of data and instead seek to uncover data patterns in order to produce probabilistic relationships between input and output quantities of interest. While it has significantly improved people's standards of living, AI has nevertheless engendered many operational risks (e.g. by producing undesirable or unexpected outcomes) as well as systemic risks (e.g. the "Flash Crash", whereby a blue-chip company's share price suddenly plummeted and bounced back in the span of minutes [KL13]). To better manage, prevent and mitigate such risks, some level of mathematical insight must be brought in to shed light on the inner workings of AI, so that practitioners and regulators alike can act to increase its efficiency and curb its shortcomings. Stochastic gradient descent (SGD) is the engine of AI, making it a natural stepping stone toward mathematically explaining AI. Indeed, to capture their intricacies, machine learning problems are often modeled using wide and highly parametrized neural networks [GBC16], which are then trained using SGD or an adaptive variant thereof, such as Adagrad, Adadelta, RMSProp, Adamax or Adam [Rud17]. To approximate a stationary point of a given loss landscape (also referred to as objective or cost function [LZB22; AL24; AMA05]), SGD recursively generates a trajectory of iterates by subtracting, at each step, a stochastic gradient scaled by a positive learning rate.
Whereas the classical SGD literature provides convergence guarantees and convergence rates within a (strongly) convex framework [Duf96; BV04; RM51], machine learning models are often highly nonconvex, and new SGD frameworks are needed to better understand and parametrize them.
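The SGD recursion described above can be illustrated with a minimal sketch. This is not the method of the paper: the function names `sgd` and `noisy_quadratic_grad`, the one-dimensional strongly convex loss f(x) = 0.5 (x - 3)^2, the Gaussian gradient noise and the decreasing learning rate 1/(k+1) are all assumptions chosen for illustration, picked so that the example sits in the classical convex setting mentioned above.

```python
import random


def sgd(stochastic_grad, x0, learning_rate, n_steps, rng):
    """Iterate x_{k+1} = x_k - gamma_k * g_k and return the full trajectory.

    stochastic_grad(x, rng) returns a noisy gradient estimate g_k at x,
    learning_rate(k) returns the positive step size gamma_k.
    """
    x = x0
    trajectory = [x]
    for k in range(n_steps):
        g = stochastic_grad(x, rng)       # stochastic gradient at the current iterate
        x = x - learning_rate(k) * g      # step scaled by the learning rate
        trajectory.append(x)
    return trajectory


def noisy_quadratic_grad(x, rng):
    # Hypothetical loss f(x) = 0.5 * (x - 3)^2, whose exact gradient is x - 3.
    # Zero-mean Gaussian noise stands in for the randomness of a mini-batch.
    return (x - 3.0) + rng.gauss(0.0, 1.0)


rng = random.Random(0)                    # fixed seed for reproducibility
traj = sgd(
    noisy_quadratic_grad,
    x0=10.0,
    learning_rate=lambda k: 1.0 / (k + 1),  # decreasing, positive step sizes
    n_steps=5000,
    rng=rng,
)
print(traj[-1])                           # final iterate, close to the minimizer 3
```

With this particular schedule, the error of the final iterate equals minus the average of the injected noise, so it shrinks as the number of steps grows. No comparably simple argument is available for the highly nonconvex losses of neural networks.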
Dec-8-2024