AITopics | Gradient Descent

Gradient Descent

EXOTIC: An Exact, Optimistic, Tree-Based Algorithm for Min-Max Optimization

Maheshwari, Chinmay, Pimpalkhare, Chinmay, Chatterjee, Debasish

arXiv.org Artificial Intelligence, Aug-19-2025

Min-max optimization arises in many domains, such as game theory and adversarial machine learning, with gradient-based methods as the typical computational tool. Beyond convex-concave min-max optimization, the solutions found by gradient-based methods may be arbitrarily far from global optima. In this work, we present an algorithmic apparatus for computing globally optimal solutions in convex-non-concave and non-convex-concave min-max optimization. For the former, we employ a reformulation that transforms it into a non-concave-convex max-min optimization problem with suitably defined feasible sets and objective function. The new form can be viewed as a generalization of Sion's minimax theorem. Next, we introduce EXOTIC, an Exact, Optimistic, Tree-based algorithm for solving the reformulated max-min problem. EXOTIC employs an iterative convex optimization solver to (approximately) solve the inner minimization, and a hierarchical tree search for the outer maximization that optimistically selects promising regions to search based on the approximate solution returned by the convex optimization solver. We establish an upper bound on its optimality gap as a function of the number of calls to the inner solver, the solver's convergence rate, and additional problem-dependent parameters. Both our algorithmic apparatus and its accompanying theoretical analysis can also be applied to non-convex-concave min-max optimization. In addition, we propose a class of benchmark convex-non-concave min-max problems along with their analytical global solutions, providing a testbed for evaluating algorithms for min-max optimization. Empirically, EXOTIC outperforms gradient-based methods on this benchmark as well as on existing numerical benchmark problems from the literature. Finally, we demonstrate the utility of EXOTIC by computing security strategies in multi-player games with three or more players.
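The two-level structure described in the abstract (an iterative convex solver for the inner minimization, an optimistic tree search for the outer maximization) can be illustrated with a small sketch. This is not the EXOTIC algorithm itself: it is a generic deterministic optimistic search over a one-dimensional interval, and the toy objective `f`, the function names, the trisection rule, and the assumed Lipschitz constant are all hypothetical choices made for illustration.

```python
import heapq
import math


def f(x, y):
    # Hypothetical toy objective: convex in x, non-concave in y.
    return (x - math.sin(3.0 * y)) ** 2 + math.cos(5.0 * y)


def grad_x(x, y):
    # Partial derivative of f with respect to x.
    return 2.0 * (x - math.sin(3.0 * y))


def inner_min(f, grad_x, y, x0=0.0, lr=0.25, steps=60):
    """Approximately solve min_x f(x, y) with plain gradient descent on x."""
    x = x0
    for _ in range(steps):
        x -= lr * grad_x(x, y)
    return x, f(x, y)


def optimistic_max_min(f, grad_x, lo, hi, lipschitz, budget=300):
    """Search for max_y min_x f(x, y) over y in [lo, hi].

    Each tree node is an interval of y. Its optimistic score is the inner
    value at the interval centre plus lipschitz * half-width, which upper
    bounds the inner value on the interval if y -> min_x f(x, y) is
    Lipschitz with that constant (an assumption of this sketch).
    """
    def make_node(a, b):
        y = 0.5 * (a + b)
        x, val = inner_min(f, grad_x, y)
        bound = val + lipschitz * 0.5 * (b - a)
        # heapq is a min-heap, so store the negated bound.
        return (-bound, a, b, y, x, val)

    heap = [make_node(lo, hi)]
    best = heap[0]
    calls = 1  # number of inner-solver calls used so far
    while heap and calls + 3 <= budget:
        node = heapq.heappop(heap)  # most promising interval
        a, b = node[1], node[2]
        third = (b - a) / 3.0
        for k in range(3):  # split it into three children
            child = make_node(a + k * third, a + (k + 1) * third)
            calls += 1
            if child[5] > best[5]:
                best = child
            heapq.heappush(heap, child)
    return best[3], best[4], best[5]


y_best, x_best, value = optimistic_max_min(f, grad_x, -1.0, 2.0, lipschitz=5.0)
```

For this toy objective the inner minimum is `cos(5y)`, so the returned value should be close to 1. The budget counts calls to the inner solver, which loosely mirrors the quantity the abstract's optimality-gap bound is stated in.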

artificial intelligence, machine learning, optimization solver opt, (15 more...)

arXiv.org Artificial Intelligence

2508.12479

Country:

North America > United States (0.46)
Asia (0.28)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.74)

Main

Mahdavinia, Pouria

Neural Information Processing Systems, Aug-18-2025, 22:18:39 GMT

We also conduct experiments supporting our theoretical results.

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > Pennsylvania (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.30)

From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent

Neural Information Processing Systems, Aug-18-2025, 21:02:26 GMT

Stochastic Gradient Descent (SGD) has been the method of choice for learning large-scale non-convex models.

artificial intelligence, convergence, machine learning, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Russia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Learning a Single Neuron with Bias Using Gradient Descent

Gal Vardi

Neural Information Processing Systems, Aug-18-2025, 18:55:38 GMT

In this work, we study the common setting of learning a single neuron with respect to the squared loss, using gradient descent.
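As a concrete picture of this setting, the sketch below runs full-batch gradient descent on the squared loss of one neuron with a bias term. The summary above does not name the activation function or the data distribution, so the ReLU activation, the one-dimensional inputs, the target parameters, the initialization, and the step size are all assumptions chosen for illustration.

```python
def relu(z):
    return z if z > 0.0 else 0.0


def loss_and_grads(w, b, xs, ys):
    """Mean squared loss 0.5 * (relu(w*x + b) - y)^2 and its gradients."""
    n = len(xs)
    loss = gw = gb = 0.0
    for x, y in zip(xs, ys):
        z = w * x + b
        r = relu(z) - y  # residual on this sample
        loss += 0.5 * r * r / n
        if z > 0.0:  # ReLU passes gradient only where it is active
            gw += r * x / n
            gb += r / n
    return loss, gw, gb


def train_single_neuron(xs, ys, w=0.5, b=0.1, lr=0.05, steps=2000):
    """Full-batch gradient descent on (w, b); returns the loss history."""
    history = []
    for _ in range(steps):
        loss, gw, gb = loss_and_grads(w, b, xs, ys)
        history.append(loss)
        w -= lr * gw
        b -= lr * gb
    history.append(loss_and_grads(w, b, xs, ys)[0])
    return w, b, history


# Hypothetical realizable data: targets come from a neuron with w=1.5, b=-0.5.
xs = [-2.0 + 4.0 * i / 39 for i in range(40)]
ys = [relu(1.5 * x - 0.5) for x in xs]
w, b, history = train_single_neuron(xs, ys)
```

Comparing `history[0]` with `history[-1]` shows how far the loss fell. Starting from a point where the neuron is inactive on every sample would give zero gradient, which is one reason the bias term makes the analysis of this problem more delicate.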

artificial intelligence, assumption, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.76)

A Experiment Details

Neural Information Processing Systems, Aug-18-2025, 18:07:01 GMT

Given the differences between the training procedures of the model presented in Section 6.2, and those [...] All models in Section 6.3 were trained with stochastic gradient descent on batches of size [...] All models presented in this paper make use of the same 3-layer MLP for parameterizing the encoders and decoders. This is then divided into 18 capsules, each of 18 dimensions. The decoder layers then have output sizes (450, 675, 4096). For all topographic models (TVAE and BubbleVAE) in Section 6.3, the global topographic organization afforded by [...] These values were chosen to be sufficiently large to achieve notably lower equivariance error than the VAE baseline, and thus demonstrate the impact of topographic organization without temporal coherence. The results of all models are shown in Section B below.

artificial intelligence, machine learning, transformation, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Implicit_Regularization_ARXIV_2

Jiangyuan Li

Neural Information Processing Systems, Aug-18-2025, 16:01:22 GMT

In this paper, we study the implicit bias of gradient descent for sparse regression.

artificial intelligence, gradient descent, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States > Texas > Brazos County > College Station (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Label Noise SGD Provably Prefers Flat Global Minimizers

Neural Information Processing Systems, Aug-18-2025, 07:47:47 GMT

In overparametrized models, the noise in stochastic gradient descent (SGD) implicitly regularizes the optimization trajectory and determines which local minimum SGD converges to.

artificial intelligence, arxiv preprint arxiv, machine learning, (13 more...)

Neural Information Processing Systems

Country: