AITopics | theory paper

theory paper

dd45045f8c68db9f54e70c67048d32e8-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-10-2026, 18:28:30 GMT

assumption, limitation, section 4, (11 more...)

Technology: Information Technology > Artificial Intelligence (0.49)

We are very grateful to the reviewers for their helpful feedback and suggestions, and are pleased to have received a

Neural Information Processing Systems, Aug-16-2025, 20:38:51 GMT

Our responses to the main concerns are given as follows. [...] Section 4.4 for a related discussion and generalizations to non-unit norms. We would be happy to move some of the less central corollaries (e.g., Sections 4.2 and 4.5) to the [...] We will also correct the typo in Line 202.

assumption, helpful feedback and suggestion, reviewer, (12 more...)

Technology: Information Technology > Artificial Intelligence (0.49)

Reviews: On the Convergence Rate of Training Recurrent Neural Networks

Neural Information Processing Systems, Feb-11-2025, 21:03:48 GMT

This paper shows that GD/SGD can minimize the training loss of RNNs at a linear convergence rate, provided the hidden layer is sufficiently wide (polynomial in the data size and the time horizon length). To prove this, the authors show that, within a small region around the initialization, the squared norm of the gradient can be lower bounded by the function value (Theorem 3). They further show that the loss function is somewhat smooth (Theorem 4), which guarantees that moving in the negative gradient direction can decrease the function value. The paper builds new techniques for analyzing multi-layer ReLU networks and shows that, with appropriate initialization, ReLU activations avoid both exponential explosion and exponential vanishing.
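The two ingredients named in this review (a gradient lower bound in terms of the function value, plus smoothness) are what typically yield a linear rate. The following is a minimal sketch on a toy problem, not the paper's RNN setting: it assumes the quadratic loss f(w) = 0.5 * sum_i a_i * w_i^2, for which the squared gradient norm is at least 2 * mu * f(w) with mu = min(a) and the smoothness constant is L = max(a). The names `loss`, `grad`, and `gradient_descent` are hypothetical.

```python
def loss(w, a):
    """Toy quadratic loss f(w) = 0.5 * sum_i a_i * w_i^2 (an assumption, not an RNN loss)."""
    return 0.5 * sum(ai * wi * wi for ai, wi in zip(a, w))


def grad(w, a):
    """Gradient of the toy loss: component i is a_i * w_i."""
    return [ai * wi for ai, wi in zip(a, w)]


def gradient_descent(w0, a, steps):
    """Run gradient descent with step size 1/L and record the loss at every iterate."""
    smoothness = max(a)   # L: largest curvature of the quadratic
    mu = min(a)           # constant in the bound ||grad f||^2 >= 2 * mu * f
    w = list(w0)
    history = [loss(w, a)]
    for _ in range(steps):
        g = grad(w, a)
        w = [wi - gi / smoothness for wi, gi in zip(w, g)]
        history.append(loss(w, a))
    return history, mu, smoothness


history, mu, smoothness = gradient_descent([1.0, -2.0, 0.5], [0.5, 1.0, 2.0], 20)
rate = 1.0 - mu / smoothness  # per-step contraction factor predicted by the two bounds

# Gradient lower bound at the starting point: ||grad f||^2 >= 2 * mu * f.
g0 = grad([1.0, -2.0, 0.5], [0.5, 1.0, 2.0])
grad_sq_norm = sum(gi * gi for gi in g0)
```

Under these assumptions, each step shrinks the loss by at least the factor `rate`, so the loss decays geometrically in the number of steps. The review's "somewhat smooth" condition is weaker than the standard smoothness used here, so this only conveys the shape of the argument.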

initialization, step size, training recurrent neural network, (10 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.40)

Reviews: Covariate-Powered Empirical Bayes Estimation

Neural Information Processing Systems, Jan-23-2025, 11:24:04 GMT

This theory paper provides a number of novel results, including a theoretical analysis of minimax bounds and an empirical analysis, for combinations of relatively simple statistical estimators and machine learning models of covariate information. The paper shows that these combinations improve on both the simple estimator alone and the machine learning model alone. The main concern raised by the reviewers is that the paper provides limited empirical validation. I disagree with this assessment, as the paper should be seen as a machine learning theory paper. As the proposed framework includes a number of advanced machine learning models, including XGBoost, it should be very relevant for the NeurIPS community.
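To make the claim about combinations concrete, here is a minimal simulation sketch. It is an illustration under stated assumptions, not the paper's estimator: each true effect equals a covariate-based prediction plus Gaussian deviation, the direct observation adds Gaussian noise, and the two are combined with the textbook normal-means shrinkage weight using known variances (an empirical Bayes method would estimate these from data). The function `simulate`, the linear stand-in `2.0 * x` for a fitted model, and all parameter values are hypothetical.

```python
import random


def simulate(n=2000, prior_var=1.0, noise_var=1.0, seed=0):
    """Compare mean squared errors of three estimators on synthetic data."""
    rng = random.Random(seed)
    # Shrinkage weight on the direct observation (oracle variances assumed known).
    weight = prior_var / (prior_var + noise_var)
    se_obs = se_model = se_comb = 0.0
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        model_pred = 2.0 * x  # hypothetical stand-in for an ML prediction m(x)
        mu = model_pred + rng.gauss(0.0, prior_var ** 0.5)  # true effect
        z = mu + rng.gauss(0.0, noise_var ** 0.5)           # noisy direct observation
        # Shrink the observation toward the covariate-based prediction.
        combined = model_pred + weight * (z - model_pred)
        se_obs += (z - mu) ** 2
        se_model += (model_pred - mu) ** 2
        se_comb += (combined - mu) ** 2
    return se_obs / n, se_model / n, se_comb / n


mse_obs, mse_model, mse_comb = simulate()
```

In this setup the combined estimator's expected squared error is prior_var * noise_var / (prior_var + noise_var), which is below both the observation-only error (noise_var) and the model-only error (prior_var).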

covariate-powered empirical bayes estimation, empirical validation, theory paper

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.69)
