AITopics | switching cost

In many sequential decision-making applications, a change of decision incurs an additional cost, such as the wear-and-tear cost associated with changing server status. To control the switching cost, we introduce the problem of online convex optimization with continuous switching constraint, where the goal is to achieve a small regret given a budget on the overall switching cost. We first investigate the hardness of the problem, and provide a lower bound of order Ω(√T) when the switching cost budget S = Ω(√T), and Ω(min{T/S, T}) when S = O(√T), where T is the time horizon. The essential idea is to carefully design an adaptive adversary, who can adjust the loss function according to the cumulative switching cost the player has incurred so far, based on the orthogonal technique. We then develop a simple gradient-based algorithm which enjoys the minimax optimal regret bound.
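The budgeted setting in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it is plain projected online gradient descent on a Euclidean ball, with a step size capped so that the total movement of the iterates cannot exceed the budget S. The function names, the ball domain, and the assumed gradient-norm bound `grad_bound` are all hypothetical choices made for illustration.

```python
import numpy as np


def project_to_ball(x, radius):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)


def ogd_with_switching_budget(grad_fn, dim, horizon, budget,
                              radius=1.0, grad_bound=1.0):
    """Projected OGD whose total movement stays within `budget`.

    Assumption: every gradient returned by grad_fn has norm <= grad_bound.
    Each step then moves at most eta * grad_bound (projection onto a convex
    set is non-expansive), so horizon - 1 steps move at most `budget` in total.
    """
    diameter = 2.0 * radius
    eta = min(diameter / (grad_bound * np.sqrt(horizon)),  # usual OGD step
              budget / (grad_bound * horizon))             # budget-limited step
    x = np.zeros(dim)
    iterates = [x.copy()]
    switching_cost = 0.0
    for t in range(horizon - 1):
        g = grad_fn(t, x)
        x_next = project_to_ball(x - eta * g, radius)
        switching_cost += np.linalg.norm(x_next - x)
        x = x_next
        iterates.append(x.copy())
    return np.array(iterates), switching_cost


# Usage: linear losses f_t(x) = <g_t, x> with random unit-norm g_t.
rng = np.random.default_rng(0)
T, d, S = 1000, 5, 10.0
directions = rng.normal(size=(T, d))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
iterates, cost = ogd_with_switching_budget(lambda t, x: directions[t], d, T, S)
```

With the smaller step size η = S/(GT), the textbook OGD regret bound D²/(2η) + ηG²T/2 becomes O(T/S), which is the same order as the lower bound quoted above for small budgets; this is a consistency check on the sketch, not a claim about the paper's analysis.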

artificial intelligence, constraint, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

f08b7ac8aa30a2a9ab34394e200e1a71-Paper.pdf

Neural Information Processing Systems, Feb-11-2026, 20:32:48 GMT

constraint, convex optimization, optimization, (15 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > Iowa > Johnson County > Iowa City (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.46)

9bcd1fa0c05e5f25ba7a1261f1852e82-Paper-Conference.pdf

Neural Information Processing Systems, Feb-11-2026, 00:18:05 GMT

algorithm, log 2, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

ad71c82b22f4f65b9398f76d8be4c615-Paper.pdf

Neural Information Processing Systems, Feb-9-2026, 19:45:48 GMT

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
North America > Canada (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints

Neural Information Processing Systems, Feb-9-2026, 07:55:23 GMT

Real-world reinforcement learning (RL) applications often come with possibly infinite state and action spaces, and in such situations classical RL algorithms developed in the tabular setting are no longer applicable. A popular approach to overcoming this issue is to apply function approximation techniques to the underlying structures of the Markov decision processes (MDPs).

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)

Provably Efficient Reinforcement Learning with Linear Function Approximation under Adaptivity Constraints

Neural Information Processing Systems, Feb-9-2026, 07:55:19 GMT

Real-world reinforcement learning (RL) applications often come with possibly infinite state and action spaces, and in such situations classical RL algorithms developed in the tabular setting are no longer applicable. A popular approach to overcoming this issue is to apply function approximation techniques to the underlying structures of the Markov decision processes (MDPs).

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)

Optimal Comparator Adaptive Online Learning with Switching Cost

Neural Information Processing Systems, Dec-24-2025, 20:32:14 GMT

Practical online learning tasks are often naturally defined on unconstrained domains, where optimal algorithms for general convex losses are characterized by the notion of comparator adaptivity. In this paper, we design such algorithms in the presence of switching cost, which penalizes the typical optimism in adaptive algorithms and leads to a delicate design trade-off. Based on a novel dual space scaling strategy discovered by a continuous-time analysis, we propose a simple algorithm that improves the existing comparator adaptive regret bound [ZCP22a] to the optimal rate. The obtained benefits are further extended to the expert setting, and the practicality of the proposed algorithm is demonstrated through a sequential investment task.

algorithm, name change, optimal comparator adaptive online learning, (3 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.81)

Better Best of Both Worlds Bounds for Bandits with Switching Costs

Neural Information Processing Systems, Dec-24-2025, 08:43:33 GMT

We study best-of-both-worlds algorithms for bandits with switching cost, recently addressed by Rouyer et al. (2021). We introduce a surprisingly simple and effective algorithm that simultaneously achieves a minimax optimal regret bound (up to logarithmic factors) of $\mathcal{O}(T^{2/3})$ in the oblivious adversarial setting and a bound of $\mathcal{O}(\min\{\log (T)/\Delta^2,T^{2/3}\})$ in the stochastically-constrained regime, both with (unit) switching costs, where $\Delta$ is the gap between the arms. In the stochastically-constrained case, our bound improves over the previous result of Rouyer et al. (2021), which achieved regret of $\mathcal{O}(T^{1/3}/\Delta)$. We accompany our results with a lower bound showing that, in general, $\tilde{\Omega}(\min\{1/\Delta^2,T^{2/3}\})$ switching cost regret is unavoidable in the stochastically-constrained case for algorithms with $\mathcal{O}(T^{2/3})$ worst-case switching cost regret.
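The quantity being bounded can be made concrete with a short sketch. It assumes the usual definition of switching cost regret with unit costs: the loss of the arms played, plus one for every round in which the arm changes, minus the loss of the best fixed arm in hindsight. The function name and the data layout are hypothetical, and this only measures regret for a given play sequence; it is not the algorithm from the abstract.

```python
def switching_cost_regret(losses, actions):
    """Regret with unit switching costs for a fixed sequence of plays.

    losses:  list of T rows, each a list of K per-arm losses for that round.
    actions: list of T arm indices, the arms actually played.
    """
    played = sum(losses[t][a] for t, a in enumerate(actions))
    # One unit of cost each time the arm differs from the previous round.
    switches = sum(1 for prev, cur in zip(actions, actions[1:]) if prev != cur)
    num_arms = len(losses[0])
    best_fixed = min(sum(row[k] for row in losses) for k in range(num_arms))
    return played + switches - best_fixed


# Usage: chasing the per-round best arm pays nothing in loss but two switches.
losses = [[0, 1], [1, 0], [0, 1]]
chasing = switching_cost_regret(losses, [0, 1, 0])
staying = switching_cost_regret(losses, [0, 0, 0])
```

In this toy example the player that always picks the per-round best arm ends up with higher switching cost regret than the player that stays on one arm, which is the tension that bounds such as $\mathcal{O}(T^{2/3})$ quantify.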

mathcal, name change, world bound, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)


f08b7ac8aa30a2a9ab34394e200e1a71-Supplemental.pdf

Provably Efficient Q-Learning with Low Switching Cost

Online Convex Optimization with Continuous Switching Constraint
