[D] seq2seq why use cross entropy loss? • r/MachineLearning
