[D] How to deal with non-Markovian decision processes with large/infinite horizon using MCTS? • r/MachineLearning

@machinelearnbot 

A quick Google search suggests that MCTS is applicable to RL tasks with large or infinite horizons, but there seems to be no empirical confirmation that it works as well there as it does on Go. Assume that no rollouts are used, just as in AlphaZero. Go's state space is larger than that of other games, but its horizon is short (not much more than 100 timesteps). The state space of many real-world problems grows exponentially w.r.t. the timestep in the following sense.
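As a side note on the no-rollout setting mentioned above, here is a minimal Python sketch of AlphaZero-style search in which a leaf is scored by a value estimate instead of a random playout. It is only an illustration under assumptions: `Node`, `puct_select`, `simulate`, `toy_evaluate`, the constant `c_puct=1.5`, and the `max_depth` cutoff are hypothetical names and choices, the setting is single-agent (no sign flip between players), and `toy_evaluate` stands in for a learned policy/value network. Tree nodes are reached by the action history rather than by a Markov state, which is one way to apply the search when the process is non-Markovian.

```python
import math


class Node:
    """Search-tree node, identified by the action history that leads to it."""

    def __init__(self, prior):
        self.prior = prior        # P: prior probability of the action into this node
        self.visits = 0           # N: number of simulations through this node
        self.value_sum = 0.0      # W: sum of backed-up value estimates
        self.children = {}        # action -> Node

    def q(self):
        # Mean backed-up value; 0.0 for an unvisited node.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_select(node, c_puct=1.5):
    """Return the (action, child) pair that maximizes Q + U (PUCT rule)."""
    total = sum(child.visits for child in node.children.values())

    def score(item):
        _, child = item
        u = c_puct * child.prior * math.sqrt(total) / (1 + child.visits)
        return child.q() + u

    return max(node.children.items(), key=score)


def simulate(root, evaluate, max_depth):
    """One simulation: descend by PUCT, expand the leaf, back up its value."""
    node, path, history = root, [root], ()
    while node.children and len(history) < max_depth:
        action, node = puct_select(node)
        history += (action,)
        path.append(node)
    # No rollout: the leaf is scored directly by the evaluator.
    priors, value = evaluate(history)
    if len(history) < max_depth:
        for action, prior in priors.items():
            node.children[action] = Node(prior)
    for visited in path:
        visited.visits += 1
        visited.value_sum += value


def toy_evaluate(history):
    """Hypothetical stand-in for a network: uniform priors over two actions,
    value equal to the fraction of 1-actions taken so far."""
    value = sum(history) / len(history) if history else 0.0
    return {0: 0.5, 1: 0.5}, value


root = Node(prior=1.0)
for _ in range(50):
    simulate(root, toy_evaluate, max_depth=5)
```

The `max_depth` cutoff is where the horizon question shows up: with a long or infinite horizon the search only ever looks a bounded number of steps ahead, so everything beyond that depth rests on the quality of the value estimate.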
