AITopics | neural-q-whittle

5c7c66dfc9f93f0c738947f3b1c13832-Paper-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 08:47:42 GMT

data mining, machine learning, reinforcement learning, (20 more...)

Neural Information Processing Systems

Country:

Oceania > New Zealand (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)

Industry:

Health & Medicine (0.67)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation

Neural Information Processing SystemsDec-25-2025, 12:27:58 GMT

Whittle index policy is a heuristic to the intractable restless multi-armed bandits (RMAB) problem. Although it is provably asymptotically optimal, finding Whittle indices remains difficult. In this paper, we present Neural-Q-Whittle, a Whittle index based Q-learning algorithm for RMAB with neural network function approximation, which is an example of nonlinear two-timescale stochastic approximation with Q-function values updated on a faster timescale and Whittle indices on a slower timescale. Despite the empirical success of deep Q-learning, the non-asymptotic convergence rate of Neural-Q-Whittle, which couples neural networks with two-timescale Q-learning largely remains unclear. This paper provides a finite-time analysis of Neural-Q-Whittle, where data are generated from a Markov chain, and Q-function is approximated by a ReLU neural network. Our analysis leverages a Lyapunov drift approach to capture the evolution of two coupled parameters, and the nonlinearity in value function approximation further requires us to characterize the approximation error. Combing these provide Neural-Q-Whittle with $\mathcal{O}(1/k^{2/3})$ convergence rate, where $k$ is the number of iterations.

finite-time analysis, neural-q-whittle, restless multi-armed bandit, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.77)

Add feedback

5c7c66dfc9f93f0c738947f3b1c13832-Paper-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 18:28:59 GMT

data mining, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Country:

Oceania > New Zealand (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)

Industry:

Health & Medicine (0.67)
Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation

Neural Information Processing SystemsJan-18-2025, 15:28:43 GMT

Whittle index policy is a heuristic to the intractable restless multi-armed bandits (RMAB) problem. Although it is provably asymptotically optimal, finding Whittle indices remains difficult. In this paper, we present Neural-Q-Whittle, a Whittle index based Q-learning algorithm for RMAB with neural network function approximation, which is an example of nonlinear two-timescale stochastic approximation with Q-function values updated on a faster timescale and Whittle indices on a slower timescale. Despite the empirical success of deep Q-learning, the non-asymptotic convergence rate of Neural-Q-Whittle, which couples neural networks with two-timescale Q-learning largely remains unclear. This paper provides a finite-time analysis of Neural-Q-Whittle, where data are generated from a Markov chain, and Q-function is approximated by a ReLU neural network.

neural network function approximation, neural-q-whittle, restless multi-armed bandit, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.70)

Add feedback

Finite-Time Analysis of Whittle Index based Q-Learning for Restless Multi-Armed Bandits with Neural Network Function Approximation

Xiong, Guojun, Li, Jian

arXiv.org Artificial IntelligenceOct-3-2023

Whittle index policy is a heuristic to the intractable restless multi-armed bandits (RMAB) problem. Although it is provably asymptotically optimal, finding Whittle indices remains difficult. In this paper, we present Neural-Q-Whittle, a Whittle index based Q-learning algorithm for RMAB with neural network function approximation, which is an example of nonlinear two-timescale stochastic approximation with Q-function values updated on a faster timescale and Whittle indices on a slower timescale. Despite the empirical success of deep Q-learning, the non-asymptotic convergence rate of Neural-Q-Whittle, which couples neural networks with two-timescale Q-learning largely remains unclear. This paper provides a finite-time analysis of Neural-Q-Whittle, where data are generated from a Markov chain, and Q-function is approximated by a ReLU neural network. Our analysis leverages a Lyapunov drift approach to capture the evolution of two coupled parameters, and the nonlinearity in value function approximation further requires us to characterize the approximation error. Combing these provide Neural-Q-Whittle with $\mathcal{O}(1/k^{2/3})$ convergence rate, where $k$ is the number of iterations.

approximation, neural-q-whittle, whittle index, (15 more...)

arXiv.org Artificial Intelligence

2310.02147

Country: