Review for NeurIPS paper: Provably Efficient Neural GTD for Off-Policy Learning


Weaknesses: The strategy of establishing convergence guarantees for neural networks as a function of the number of neurons alone is questionable, because width is a very coarse description of a network: comparable guarantees can already be obtained for nonparametric estimators (e.g., Cho and Saul, "Kernel Methods for Deep Learning," NIPS 2009, and numerous follow-up works). If neural-network analysis is to refine that line of work, it must also account for the *inter-layer* relationships and broader architectural choices in order to be useful to practitioners. As it stands, I do not see how the width m of Lemma 4.1 can inform the choice of a neural architecture in any sharper way than, e.g., a single-layer RBF network. Moreover, reformulating the Bellman equations as saddle-point problems has been studied previously: Shapiro, A. (2011).
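For context on the last point: the primal-dual structure that a saddle-point reformulation of the Bellman equation makes explicit is already visible in classical linear GTD2, where the value weights play the primal role and an auxiliary vector tracking the projected TD error plays the dual role. A minimal illustrative sketch on a toy two-state chain (the features, rewards, transition matrix, and step sizes below are my own illustrative choices, not the paper's):

```python
import numpy as np

# Minimal sketch of linear GTD2 on a toy 2-state Markov chain.
# theta is the primal (value-function) variable; omega is the dual
# variable tracking the projected TD error. All quantities here
# (features, rewards, step sizes) are illustrative.

rng = np.random.default_rng(0)

gamma = 0.9
phi = np.eye(2)              # one indicator feature per state
P = np.array([[0.5, 0.5],
              [0.5, 0.5]])   # target-policy transition matrix
r = np.array([1.0, 0.0])     # expected rewards per state

theta = np.zeros(2)          # primal: value-function weights
omega = np.zeros(2)          # dual: tracks the projected TD error
alpha, beta = 0.05, 0.1      # two-timescale step sizes (beta > alpha)

for t in range(20000):
    s = rng.integers(2)                  # uniform state sampling
    s_next = rng.choice(2, p=P[s])
    delta = r[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
    # Dual update on omega (fast timescale).
    omega += beta * (delta - phi[s] @ omega) * phi[s]
    # Primal update on theta (slow timescale), GTD2 direction.
    theta += alpha * (phi[s] - gamma * phi[s_next]) * (phi[s] @ omega)

# TD fixed point for comparison: solve A theta = b with
# A = Phi^T D (Phi - gamma P Phi), b = Phi^T D r, D = diag(state dist.).
D = np.diag([0.5, 0.5])
A = phi.T @ D @ (phi - gamma * P @ phi)
b = phi.T @ D @ r
theta_star = np.linalg.solve(A, b)
print(theta, theta_star)
```

The two step sizes are the usual two-timescale requirement: omega must equilibrate faster than theta, which is exactly the inner maximization / outer minimization of the saddle-point view.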