Appendix A Different Quality Suggester Results
–Neural Information Processing Systems
This section presents results on RockSample (8, 4, 10, 1) when the suggester is not always all-knowing. In our approach, we formulated the belief update under the assumption that the suggester had observed the environment. These results demonstrate that our approach extends beyond an all-knowing suggester and can incorporate information from suggestions derived from differing beliefs about the state. Table 3 contains the mean rewards, and Table 4 contains the mean number of suggestions considered by the agent. The details of the agents are provided in Section 4.2.
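To make the idea of folding a suggestion into the belief concrete, the following is a minimal sketch and not the formulation used in the paper. It assumes a hypothetical suggester model in which the suggested action matches the best action for the true state with probability `tau` and is otherwise uniform over the remaining actions. The names `update_belief_with_suggestion`, `best_action`, and `tau` are illustrative assumptions.

```python
from typing import Dict, Hashable


def update_belief_with_suggestion(
    belief: Dict[Hashable, float],
    best_action: Dict[Hashable, Hashable],
    suggestion: Hashable,
    n_actions: int,
    tau: float = 0.9,
) -> Dict[Hashable, float]:
    """Bayes update that treats an action suggestion as an observation.

    Assumed (hypothetical) suggester model: with probability `tau` the
    suggestion is the best action for the true state; otherwise it is drawn
    uniformly from the other `n_actions - 1` actions.
    """
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    if n_actions < 2:
        raise ValueError("need at least two actions")

    posterior = {}
    for state, prior in belief.items():
        if best_action[state] == suggestion:
            likelihood = tau
        else:
            likelihood = (1.0 - tau) / (n_actions - 1)
        posterior[state] = prior * likelihood

    total = sum(posterior.values())
    if total == 0.0:
        # The suggestion is impossible under the assumed model; keep the prior.
        return dict(belief)
    return {state: p / total for state, p in posterior.items()}


# Toy example: one rock that is either good or bad, three available actions.
prior = {"good": 0.5, "bad": 0.5}
best = {"good": "sample", "bad": "leave"}
posterior = update_belief_with_suggestion(prior, best, "sample", n_actions=3, tau=0.8)
```

Under this assumed model, lowering `tau` weakens the influence of each suggestion, and `tau = 1 / n_actions` makes the suggestion uninformative so the posterior equals the prior. This is one simple way to represent a suggester that is not all-knowing.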
Aug-19-2025, 07:50:40 GMT