On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen
The stochastic gradient descent (SGD) algorithm is the method of choice in many machine learning tasks thanks to its scalability and efficiency on large-scale problems. In this paper, we focus on the shuffling version of SGD, which matches the mainstream practical heuristics. We show that shuffling SGD converges to a global solution for a class of non-convex functions in over-parameterized settings.
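As a rough illustration of what "shuffling SGD" means in practice, the sketch below draws a fresh permutation of the samples each epoch and then visits every sample exactly once, instead of sampling with replacement. The problem is an assumption chosen for brevity: an over-parameterized least-squares fit (more parameters than samples), which is convex and therefore simpler than the non-convex class the abstract refers to. The function name `shuffling_sgd`, the step size, and the epoch count are all hypothetical.

```python
import numpy as np

def shuffling_sgd(A, b, epochs=500, lr=0.02, seed=0):
    """Random-reshuffling SGD on 0.5 * mean((A @ w - b) ** 2)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs):
        # Draw a new permutation each epoch; every sample is used once.
        for i in rng.permutation(n):
            residual = A[i] @ w - b[i]
            w -= lr * residual * A[i]  # gradient of the i-th squared error
    return w

# Over-parameterized toy problem (assumed): 5 samples, 20 parameters,
# so a zero-loss (interpolating) solution exists.
data_rng = np.random.default_rng(1)
A = data_rng.normal(size=(5, 20))
b = data_rng.normal(size=5)

w = shuffling_sgd(A, b)
loss = 0.5 * np.mean((A @ w - b) ** 2)
print(f"final training loss: {loss:.3e}")
```

Because an interpolating solution exists here, the per-sample gradients all vanish at the solution, which is the property that over-parameterization is usually taken to provide.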
A Generalized Alternating Method for Bilevel
Bilevel optimization has recently regained interest owing to its applications in emerging machine learning fields such as hyperparameter optimization, meta-learning, and reinforcement learning. Recent results have shown that simple alternating (implicit) gradient-based algorithms can match the convergence rate of single-level gradient descent (GD) when addressing bilevel problems with a strongly convex lower-level objective. However, it remains unclear whether this result can be generalized to bilevel problems beyond this basic setting.
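To make the "alternating (implicit) gradient-based" idea concrete, here is a minimal sketch for the basic setting the abstract mentions, a strongly convex lower-level objective. It is not the generalized method of the paper. The toy objectives are assumptions: lower level g(x, y) = 0.5 (y - x)^2, upper level f(x, y) = 0.5 (y - 1)^2 + 0.5 x^2, and the function name and step sizes are hypothetical.

```python
def alternating_bilevel(x0=0.0, y0=0.0, outer_steps=200, inner_steps=5,
                        alpha=0.1, beta=0.5):
    """Alternate lower-level GD steps on y with implicit-gradient steps on x."""
    x, y = x0, y0
    for _ in range(outer_steps):
        # Lower level: a few gradient steps on g(x, y) = 0.5 * (y - x)**2.
        for _ in range(inner_steps):
            y -= beta * (y - x)
        # Implicit function theorem: dy*/dx = -g_yx / g_yy.
        g_yy, g_yx = 1.0, -1.0
        dy_dx = -g_yx / g_yy
        # Upper level: f(x, y) = 0.5 * (y - 1)**2 + 0.5 * x**2.
        hypergrad = x + dy_dx * (y - 1.0)
        x -= alpha * hypergrad
    return x, y

x_final, y_final = alternating_bilevel()
print(x_final, y_final)
```

For this toy problem the lower-level solution is y*(x) = x, so the overall objective reduces to 0.5 (x - 1)^2 + 0.5 x^2, whose minimizer is x = 0.5; the iterates should approach (0.5, 0.5).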