AITopics | sco problem

Collaborating Authors

sco problem

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Gradient-Free Methods for Nonconvex Nonsmooth Stochastic Compositional Optimization

Neural Information Processing SystemsMar-20-2026, 13:25:50 GMT

The stochastic compositional optimization (SCO) is popular in many real-world applications, including risk management, reinforcement learning, and meta-learning. However, most of the previous methods for SCO require the smoothness assumption on both the outer and inner functions, which limits their applications to a wider range of problems. In this paper, we study the SCO problem in that both the outer and inner functions are Lipschitz continuous but possibly nonconvex and nonsmooth. In particular, we propose gradient-free stochastic methods for finding the $(\delta, \epsilon)$-Goldstein stationary points of such problems with non-asymptotic convergence rates. Our results also lead to an improved convergence rate for the convex nonsmooth SCO problem. Furthermore, we conduct numerical experiments to demonstrate the effectiveness of the proposed methods.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Gradient-FreeMethodsforNonconvexNonsmooth StochasticCompositionalOptimization

Neural Information Processing SystemsFeb-13-2026, 01:55:07 GMT

Most of the existing work [2, 5, 6, 7, 8] for nonconvex SCO problem is based on the assumption that both functionsf()andg()are smooth.

artificial intelligence, machine learning, optimization, (17 more...)

Neural Information Processing Systems

Country: Asia > Singapore (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

e64c9ec33f19c7de745bd6b6d1a7a86e-Paper.pdf

Neural Information Processing SystemsFeb-11-2026, 16:02:40 GMT

algorithm, sco problem, sgd, (15 more...)

Neural Information Processing Systems

Country: Europe > Czechia > Prague (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs

Neural Information Processing SystemsDec-25-2025, 03:11:47 GMT

Multi-epoch, small-batch, Stochastic Gradient Descent (SGD) has been the method of choice for learning with large over-parameterized models. A popular theory for explaining why SGD works well in practice is that the algorithm has an implicit regularization that biases its output towards a good solution. Perhaps the theoretically most well understood learning setting for SGD is that of Stochastic Convex Optimization (SCO), where it is well known that SGD learns at a rate of $O(1/\sqrt{n})$, where $n$ is the number of samples. In this paper, we consider the problem of SCO and explore the role of implicit regularization, batch size and multiple epochs for SGD. Our main contributions are threefold: * We show that for any regularizer, there is an SCO problem for which Regularized Empirical Risk Minimzation fails to learn.

batch-size and multiple-epoch, implicit regularization, sgd, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.60)

Add feedback

SGD: The Role of Implicit Regularization, Batch-size and Multiple Epochs

Neural Information Processing SystemsAug-18-2025, 07:45:43 GMT

Our main contributions are threefold: 1. We show that for any regularizer, there is an SCO problem for which Regularized Empirical Risk Minimzation fails to learn.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe > Czechia > Prague (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Gradient-Free Methods for Nonconvex Nonsmooth Stochastic Compositional Optimization

Neural Information Processing SystemsMay-27-2025, 01:08:18 GMT

The stochastic compositional optimization (SCO) is popular in many real-world applications, including risk management, reinforcement learning, and meta-learning. However, most of the previous methods for SCO require the smoothness assumption on both the outer and inner functions, which limits their applications to a wider range of problems. In this paper, we study the SCO problem in that both the outer and inner functions are Lipschitz continuous but possibly nonconvex and nonsmooth. In particular, we propose gradient-free stochastic methods for finding the (\delta, \epsilon) -Goldstein stationary points of such problems with non-asymptotic convergence rates. Our results also lead to an improved convergence rate for the convex nonsmooth SCO problem. Furthermore, we conduct numerical experiments to demonstrate the effectiveness of the proposed methods.

machine learning, nonconvex nonsmooth stochastic compositional optimization, reinforcement learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

Add feedback

SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs

Neural Information Processing SystemsJan-19-2025, 11:18:49 GMT

Multi-epoch, small-batch, Stochastic Gradient Descent (SGD) has been the method of choice for learning with large over-parameterized models. A popular theory for explaining why SGD works well in practice is that the algorithm has an implicit regularization that biases its output towards a good solution. Perhaps the theoretically most well understood learning setting for SGD is that of Stochastic Convex Optimization (SCO), where it is well known that SGD learns at a rate of O(1/\sqrt{n}), where n is the number of samples. In this paper, we consider the problem of SCO and explore the role of implicit regularization, batch size and multiple epochs for SGD. Our main contributions are threefold: * We show that for any regularizer, there is an SCO problem for which Regularized Empirical Risk Minimzation fails to learn.

batch-size and multiple-epoch, implicit regularization, sgd, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.63)

Add feedback

Speeding up Policy Simulation in Supply Chain RL

Farias, Vivek, Gijsbrechts, Joren, Khojandi, Aryan, Peng, Tianyi, Zheng, Andrew

arXiv.org Artificial IntelligenceJun-3-2024

Simulating a single trajectory of a dynamical system under some state-dependent policy is a core bottleneck in policy optimization algorithms. The many inherently serial policy evaluations that must be performed in a single simulation constitute the bulk of this bottleneck. To wit, in applying policy optimization to supply chain optimization (SCO) problems, simulating a single month of a supply chain can take several hours. We present an iterative algorithm for policy simulation, which we dub Picard Iteration. This scheme carefully assigns policy evaluation tasks to independent processes. Within an iteration, a single process evaluates the policy only on its assigned tasks while assuming a certain 'cached' evaluation for other tasks; the cache is updated at the end of the iteration. Implemented on GPUs, this scheme admits batched evaluation of the policy on a single trajectory. We prove that the structure afforded by many SCO problems allows convergence in a small number of iterations, independent of the horizon. We demonstrate practical speedups of 400x on large-scale SCO problems even with a single GPU, and also demonstrate practical efficacy in other RL environments.

iteration, picard iteration, speedup, (15 more...)

arXiv.org Artificial Intelligence

2406.01939

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Singapore (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Retail > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Add feedback

skscope: Fast Sparsity-Constrained Optimization in Python

Wang, Zezhi, Zhu, Jin, Chen, Peng, Peng, Huiyang, Zhang, Xiaoke, Wang, Anran, Zheng, Yu, Zhu, Junxian, Wang, Xueqin

arXiv.org Machine LearningMar-27-2024

Applying iterative solvers on sparsity-constrained optimization (SCO) requires tedious mathematical deduction and careful programming/debugging that hinders these solvers' broad impact. In the paper, the library skscope is introduced to overcome such an obstacle. With skscope, users can solve the SCO by just programming the objective function. The convenience of skscope is demonstrated through two examples in the paper, where sparse linear regression and trend filtering are addressed with just four lines of code. More importantly, skscope's efficient implementation allows state-of-the-art solvers to quickly attain the sparse solution regardless of the high dimensionality of parameter space. Numerical experiments reveal the available solvers in skscope can achieve up to 80x speedup on the competing relaxation solutions obtained via the benchmarked convex solver.

skscope, solver, wang, (11 more...)

arXiv.org Machine Learning

2403.1854

Country: