Reviewer #1: Thanks for the comments!

Oct-3-2025, 03:19:13 GMT–Neural Information Processing Systems

We should clarify that the theoretical results already consider out-of-sample generalization. There are connections between this work and entropy-regularized RL, but there are also distinctions. This allows us to prove new generalization bounds in the form of Theorem 8. We are also able to [...] We also have the same suite of results prepared for MNIST, and standard deviations for Table 1.

"It is unclear to me if the reward estimation algorithm is actually evaluated in the experiments." Yes, Section 3.6 used [...]

"Can you comment on the increased variance demonstrated by Composite on Table 2?" To produce Table 2, [...]

"I find curious that [...] all the experiments consists of classification tasks "reworked" [...]." The Criteo dataset is a benchmark in this area, extracted from a real online advertising challenge.
