Towards Binary-Valued Gates for Robust LSTM Training