[D] Backpropagating to LSTM inputs!
Hi, I'm trying an architecture that is a sort of autoencoder, where the encoded representation is a string. To sidestep differentiability issues, I don't actually encode it as a string, but as the softmax of the encoder LSTM's output. This tensor is then fed into the decoder LSTM. However, I am noticing a huge difference (on the order of 10^3 or 10^4) between the gradients computed at the decoder LSTM's outputs and those at its inputs during backpropagation. That is, the LSTM seems to barely propagate gradients back to the input sequence.
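One way to quantify this is to compare gradient norms at the decoder's output and input directly. Below is a minimal sketch assuming PyTorch; the shapes, the toy loss, and the variable names (`soft_codes`, `decoder`) are all made up for illustration and stand in for the encoder's softmax output and the decoder LSTM from the post.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch, seq_len, vocab, hidden = 4, 16, 32, 64

# Stand-in decoder LSTM (hypothetical sizes)
decoder = nn.LSTM(vocab, hidden, batch_first=True)

# Stand-in for the encoder's softmax output (the "encoded string")
soft_codes = torch.softmax(torch.randn(batch, seq_len, vocab), dim=-1)
soft_codes.requires_grad_(True)

out, _ = decoder(soft_codes)
out.retain_grad()              # keep the grad on this intermediate tensor
loss = out.pow(2).mean()       # toy loss, just to get gradients flowing
loss.backward()

# A very large ratio here is the symptom described above: gradients
# shrink drastically between the decoder's output and its input.
ratio = out.grad.norm() / soft_codes.grad.norm()
print(f"output-grad / input-grad norm ratio: {ratio:.2e}")
```

Plotting this ratio per timestep (rather than over the whole tensor) can also reveal whether the attenuation is uniform or concentrated at early timesteps, which would point to vanishing gradients through the recurrence.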
Interpreting LSTM Prediction on Solar Flare Eruption with Time-series Clustering
Hu Sun, Ward Manchester, Zhenbang Jiao, Xiantong Wang, Yang Chen
We conduct a post hoc analysis of solar flare predictions made by a Long Short-Term Memory (LSTM) model employing data in the form of Space-weather HMI Active Region Patches (SHARP) parameters. These data are distinguished in that the parameters are calculated from data in proximity to the magnetic polarity inversion line where the flares originate. We train the LSTM model for binary classification to provide a prediction score for the probability of M/X-class flares occurring in the next hour. We then develop a dimension-reduction technique to reduce the dimensions of the SHARP parameters (the LSTM inputs) and demonstrate the different patterns of SHARP parameters corresponding to the transition from low to high prediction score. Our work shows that a subset of SHARP parameters contains the key signals that strong solar flare eruptions are imminent. The dynamics of these parameters follow a highly uniform trajectory for many events whose LSTM prediction scores for M/X-class flares transition from very low to very high. The results suggest that there exist a few threshold values of a subset of SHARP parameters which, when surpassed, could indicate a high probability of a strong flare eruption. Our method distills the knowledge of solar flare eruption learned by the deep learning model and provides a more interpretable approximation from which more physics-related insights can be derived.
r/MachineLearning - [D] Question Regarding LSTM Input
I'm trying to train an LSTM to generate song lyrics. For my input data, I downloaded a bunch of song lyrics (a 1-D list where each entry is one line of a song) and used the Keras tokenizer with one-hot encoding. This is where I'm having trouble setting up the structure. Once I have converted the lyric lines to one-hot vectors, what should the input to the LSTM look like? And when I fit the model, what should I use for my target?
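The usual setup for next-token lyric generation is to turn each tokenized line into (prefix, next-token) pairs, pad the prefixes to a common length, and feed the LSTM a tensor of shape (samples, timesteps, vocab_size) with the one-hot of the following token as the target. A minimal NumPy sketch, assuming a made-up token line and vocabulary size (the real shapes depend on your tokenizer's output):

```python
import numpy as np

vocab_size = 10
line = [3, 7, 2, 9, 5]          # one tokenized lyric line (hypothetical ids)

# Build (input sequence, next-token target) pairs: for each prefix of
# the line, the target is the token that follows it.
pairs = [(line[:i], line[i]) for i in range(1, len(line))]

max_len = max(len(seq) for seq, _ in pairs)

def one_hot(idx, depth):
    v = np.zeros(depth, dtype=np.float32)
    v[idx] = 1.0
    return v

# X has shape (samples, timesteps, vocab_size): one-hot per timestep,
# left-padded with zero vectors so every sample has the same length.
# y has shape (samples, vocab_size): the one-hot of the next token.
X = np.zeros((len(pairs), max_len, vocab_size), dtype=np.float32)
y = np.zeros((len(pairs), vocab_size), dtype=np.float32)
for n, (seq, target) in enumerate(pairs):
    for t, tok in enumerate(seq):
        X[n, max_len - len(seq) + t] = one_hot(tok, vocab_size)
    y[n] = one_hot(target, vocab_size)

print(X.shape, y.shape)   # (4, 4, 10) (4, 10)
```

With this layout the model is fit as `model.fit(X, y)`, where the LSTM layer takes `input_shape=(max_len, vocab_size)` and the final layer is a softmax over `vocab_size` with categorical cross-entropy. In practice an `Embedding` layer over integer token ids is more memory-efficient than explicit one-hot input, but the target stays the same: the next token.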