SARSA algorithm
Finite-Sample Analysis for SARSA with Linear Function Approximation
Zou, Shaofeng, Xu, Tengyu, Liang, Yingbin
SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement learning. We investigate the SARSA algorithm with linear function approximation under the non-i.i.d. setting, where a single sample trajectory is available. With a Lipschitz continuous policy improvement operator that is smooth enough, SARSA has been shown to converge asymptotically. However, its non-asymptotic analysis is challenging and remains unsolved due to the non-i.i.d. samples and the dynamically changing behavior policy. In this paper, we develop a novel technique to explicitly characterize the stochastic bias of a type of stochastic approximation procedures with time-varying Markov transition kernels.
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > New York > Erie County > Buffalo (0.04)
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.63)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.52)
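For concreteness, the procedure this paper analyzes is one-step SARSA with a linear estimate Q(s, a) ≈ theta @ phi(s, a), trained along a single trajectory. Here is a minimal sketch, assuming a generic `env` with `reset`/`step` methods and a feature map `phi` (both placeholders, not from the paper); epsilon-greedy stands in for the paper's Lipschitz policy improvement operator, for which a softmax would be the smoother, closer match.

```python
import numpy as np

def sarsa_linear(env, phi, n_features, n_actions,
                 alpha=0.05, gamma=0.99, epsilon=0.1, n_steps=10_000):
    """One-step SARSA with linear function approximation: Q(s, a) = theta @ phi(s, a)."""
    theta = np.zeros(n_features)

    def q(s, a):
        return theta @ phi(s, a)

    def policy(s):
        # Placeholder behavior policy; the paper assumes a smooth
        # (Lipschitz) improvement operator such as a softmax over q(s, .).
        if np.random.rand() < epsilon:
            return np.random.randint(n_actions)
        return int(np.argmax([q(s, b) for b in range(n_actions)]))

    s = env.reset()
    a = policy(s)
    for _ in range(n_steps):
        s_next, r, done = env.step(a)      # assumed step signature
        a_next = policy(s_next)
        # On-policy target: bootstrap from the action actually taken next.
        target = r + (0.0 if done else gamma * q(s_next, a_next))
        theta += alpha * (target - q(s, a)) * phi(s, a)
        if done:
            s = env.reset()
            a = policy(s)
        else:
            s, a = s_next, a_next
    return theta
```

Note that the policy generating the data changes whenever theta does, which is exactly the time-varying Markov transition kernel the paper's bias analysis has to handle.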
The State-Action-Reward-State-Action Algorithm in Spatial Prisoner's Dilemma Game
Yang, Lanyu, Jiang, Dongchun, Guo, Fuqiang, Fu, Mingjian
Cooperative behavior is prevalent in both human society and nature. Understanding the emergence and maintenance of cooperation among self-interested individuals remains a significant challenge in evolutionary biology and social sciences. Reinforcement learning (RL) provides a suitable framework for studying evolutionary game theory as it can adapt to environmental changes and maximize expected benefits. In this study, we employ the State-Action-Reward-State-Action (SARSA) algorithm as the decision-making mechanism for individuals in evolutionary game theory. Initially, we apply SARSA to imitation learning, where agents select neighbors to imitate based on rewards. This approach allows us to observe behavioral changes in agents without independent decision-making abilities. Subsequently, SARSA is utilized for primary agents to independently choose cooperation or betrayal with their neighbors. We evaluate the impact of SARSA on cooperation rates by analyzing variations in rewards and the distribution of cooperators and defectors within the network.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Fujian Province > Fuzhou (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
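A minimal sketch of the second setup described in the abstract, where every lattice agent runs its own SARSA over the two actions cooperate/defect. The payoff values, the von Neumann neighborhood, and the choice of state (number of cooperating neighbors last round) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# 0 = cooperate, 1 = defect. Payoff to the row player; the values
# (R=1.0, S=-0.2, T=1.2, P=0.0) are illustrative assumptions.
PAYOFF = {(0, 0): 1.0, (0, 1): -0.2, (1, 0): 1.2, (1, 1): 0.0}
L, ROUNDS = 20, 500
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.05
rng = np.random.default_rng(0)

def neighbors(i, j):
    # von Neumann neighborhood on a torus
    return [((i-1) % L, j), ((i+1) % L, j), (i, (j-1) % L), (i, (j+1) % L)]

def lattice_states(actions):
    # Each agent's state: how many of its four neighbors cooperated.
    return np.array([[sum(1 - actions[n] for n in neighbors(i, j))
                      for j in range(L)] for i in range(L)])

def choose(Q, states):
    # Independent epsilon-greedy choice for every agent.
    acts = np.empty((L, L), dtype=int)
    for i in range(L):
        for j in range(L):
            acts[i, j] = (rng.integers(2) if rng.random() < EPS
                          else Q[i, j, states[i, j]].argmax())
    return acts

Q = np.zeros((L, L, 5, 2))                 # one small Q-table per agent
states = lattice_states(rng.integers(0, 2, size=(L, L)))
actions = choose(Q, states)

for _ in range(ROUNDS):
    rewards = np.array([[sum(PAYOFF[actions[i, j], actions[n]]
                             for n in neighbors(i, j))
                         for j in range(L)] for i in range(L)])
    next_states = lattice_states(actions)
    next_actions = choose(Q, next_states)  # committed now: on-policy target
    for i in range(L):
        for j in range(L):
            s, a = states[i, j], actions[i, j]
            td = (rewards[i, j]
                  + GAMMA * Q[i, j, next_states[i, j], next_actions[i, j]]
                  - Q[i, j, s, a])
            Q[i, j, s, a] += ALPHA * td
    states, actions = next_states, next_actions

print("final cooperation rate:", 1.0 - actions.mean())
```

Tracking the final cooperation rate and the spatial pattern of cooperators is then a direct way to measure the effect the abstract describes.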
Obtain Employee Turnover Rate and Optimal Reduction Strategy Based On Neural Network and Reinforcement Learning
Nowadays, human resources are an important part of an enterprise's assets. For enterprises, loyal, high-quality employees are often the core of their competitiveness. It is therefore of great practical significance to predict which employees will leave and to reduce the employee turnover rate. This paper first establishes a multi-layer perceptron model that predicts employee turnover. A model based on SARSA, a reinforcement learning algorithm, is then proposed to automatically generate a set of strategies to reduce the employee turnover rate. These strategies achieve the largest reduction in turnover at the lowest cost from the enterprise's perspective, and can serve as a reference plan for optimizing the employee system. Experimental results show that the algorithm improves both the efficiency and the accuracy of strategy generation.
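The second stage lends itself to a small sketch: SARSA searches over HR interventions to drive down the turnover probability predicted by the stage-one model. Everything concrete here is an assumption for illustration: `predict_turnover` stands in for the trained multi-layer perceptron, and the intervention set, costs, and state buckets are invented.

```python
import numpy as np

# Hypothetical intervention set with per-step costs (invented for illustration).
ACTIONS = [("raise_salary", 0.30), ("cut_overtime", 0.15),
           ("add_training", 0.10), ("no_change", 0.00)]
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
EPISODES, HORIZON = 2000, 5

def apply_action(profile, name):
    profile = dict(profile)
    if name == "raise_salary":
        profile["salary"] *= 1.05
    elif name == "cut_overtime":
        profile["overtime_hours"] = max(0, profile["overtime_hours"] - 4)
    elif name == "add_training":
        profile["training_hours"] += 8
    return profile

def discretize(profile):
    # Coarse buckets so a tabular Q-table suffices; edges are arbitrary.
    return (int(profile["salary"] // 1000),
            int(profile["overtime_hours"] // 4),
            int(profile["training_hours"] // 8))

def sarsa_strategy(profile0, predict_turnover, cost_weight=0.5):
    """Learn a low-cost intervention sequence that the predictor scores well."""
    Q = {}
    def q(s):
        return Q.setdefault(s, np.zeros(len(ACTIONS)))
    def pick(s):
        if np.random.rand() < EPS:
            return np.random.randint(len(ACTIONS))
        return int(q(s).argmax())

    for _ in range(EPISODES):
        profile = profile0
        s = discretize(profile)
        a = pick(s)
        for _ in range(HORIZON):
            name, cost = ACTIONS[a]
            profile = apply_action(profile, name)
            # Low predicted turnover is good; interventions cost something.
            r = -predict_turnover(profile) - cost_weight * cost
            s2 = discretize(profile)
            a2 = pick(s2)
            q(s)[a] += ALPHA * (r + GAMMA * q(s2)[a2] - q(s)[a])
            s, a = s2, a2
    return Q
```

The reward trades off the predicted turnover probability against intervention cost, so the greedy policy read off the learned Q-table is a cheap sequence of changes that the predictor scores as retention-improving.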
Finite-Sample Analysis for SARSA and Q-Learning with Linear Function Approximation
Zou, Shaofeng, Xu, Tengyu, Liang, Yingbin
Though the convergence of major reinforcement learning algorithms has been extensively studied, the finite-sample analysis to further characterize the convergence rate in terms of the sample complexity for problems with continuous state space is still very limited. Such a type of analysis is especially challenging for algorithms with dynamically changing learning policies and under non-i.i.d. sampled data. In this paper, we present the first finite-sample analysis for the SARSA algorithm and its minimax variant (for zero-sum Markov games), with a single sample path and linear function approximation. To establish our results, we develop a novel technique to bound the gradient bias for dynamically changing learning policies, which can be of independent interest. We further provide finite-sample bounds for Q-learning and its minimax variant. Comparison of our result with the existing finite-sample bound indicates that linear function approximation achieves order-level lower sample complexity than the nearest neighbor approach.
- North America > United States > Ohio (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
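To make the SARSA/Q-learning contrast concrete, here are the two one-step updates under the linear parameterization Q(s, a) ≈ theta @ phi(s, a); a sketch with placeholder step sizes rather than the paper's schedules.

```python
import numpy as np

def sarsa_update(theta, phi, s, a, r, s2, a2, alpha=0.05, gamma=0.99):
    """On-policy: bootstrap from the action a2 the behavior policy actually took."""
    td = r + gamma * (theta @ phi(s2, a2)) - theta @ phi(s, a)
    return theta + alpha * td * phi(s, a)

def q_update(theta, phi, s, a, r, s2, n_actions, alpha=0.05, gamma=0.99):
    """Off-policy: bootstrap from the greedy action at s2, whatever is played next."""
    best = max(theta @ phi(s2, b) for b in range(n_actions))
    td = r + gamma * best - theta @ phi(s, a)
    return theta + alpha * td * phi(s, a)
```

The only difference is the bootstrap term, and that difference is what makes the SARSA analysis harder: the policy generating a2 keeps changing as theta does, which is the dynamically changing learning policy the paper's gradient-bias technique addresses.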
Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding
On large problems, reinforcement learning systems must use parameterized function approximators such as neural networks in order to generalize between similar situations and actions. In these cases there are no strong theoretical results on the accuracy of convergence, and computational results have been mixed. In particular, Boyan and Moore reported at last year's meeting a series of negative results in attempting to apply dynamic programming together with function approximation to simple control problems with continuous state spaces. In this paper, we present positive results for all the control tasks they attempted, and for one that is significantly larger. The most important differences are that we used sparse-coarse-coded function approximators (CMACs) whereas they used mostly global function approximators, and that we learned online whereas they learned offline. Boyan and Moore and others have suggested that the problems they encountered could be solved by using actual outcomes ("rollouts"), as in classical Monte Carlo methods, and as in the TD(λ) algorithm when λ = 1.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.15)
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- (3 more...)
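A minimal sketch of the sparse coarse coding (CMAC-style tile coding) the abstract credits, for a one-dimensional state; real CMACs tile multi-dimensional inputs and usually hash tile indices, both omitted here.

```python
import numpy as np

def tile_features(x, n_tilings=8, tiles_per_dim=10, low=0.0, high=1.0):
    """Binary features from n_tilings offset grids over [low, high].

    Each tiling contributes exactly one active tile, so the vector is
    sparse: n_tilings ones among n_tilings * (tiles_per_dim + 1) entries.
    """
    x = (x - low) / (high - low)               # normalize to [0, 1]
    phi = np.zeros(n_tilings * (tiles_per_dim + 1))
    for t in range(n_tilings):
        offset = t / n_tilings                 # stagger each grid slightly
        idx = int(x * tiles_per_dim + offset)  # the one active tile here
        phi[t * (tiles_per_dim + 1) + idx] = 1.0
    return phi

# A linear value estimate then costs only n_tilings reads per query:
theta = np.zeros(8 * 11)
value = theta @ tile_features(0.37)
```

Because each query activates exactly n_tilings features, a linear learner such as online SARSA touches only n_tilings weights per update, while the overlapping offset grids provide the generalization between nearby states that the paper exploits.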