Improving Deep Knowledge Tracing via Gated Architectures and Adaptive Optimization