Can SGD Learn Recurrent Neural Networks with Provable Generalization?

Mar-19-2020, 00:48:20 GMT – Neural Information Processing Systems

Recurrent Neural Networks (RNNs) are among the most popular models in sequential data analysis. Yet, in the foundational PAC learning language, what concept class can they learn? Moreover, how can the same recurrent unit simultaneously learn functions from different input tokens to different output tokens, without these functions affecting each other? In this paper, we show that, using vanilla stochastic gradient descent (SGD), RNNs can actually learn some notable concept class efficiently, meaning that both time and sample complexity scale polynomially in the input length (or almost polynomially, depending on the concept). This concept class at least includes functions where each output token is generated from inputs of earlier tokens using a smooth two-layer neural network. Papers published at the Neural Information Processing Systems Conference.

provable generalization, recurrent neural network, sgd learn recurrent neural network, (5 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Neural Networks > Deep Learning (0.66)
  - Statistical Learning > Gradient Descent (0.66)