Reviews: On Multiplicative Integration with Recurrent Neural Networks
–Neural Information Processing Systems
My biggest concern about this work is its lack of novelty. Despite the claimed differences, the proposed method is a special case of what was proposed in [10]. I doubt that the slightly different parameterization (removing one factor-hidden matrix and introducing more bias terms) makes much difference. I strongly suspect that the improved performance is due to better optimization (HF has proven to be very brittle). I also found the argument that gating makes gradients flow better to be weak, because there is no guarantee that this will happen.
Jan-20-2025, 22:45:58 GMT