[D] seq2seq why use cross entropy loss? • r/MachineLearning

@machinelearnbot 

If we use word embeddings in our seq2seq model, why don't we just use the distance between the predicted vector and the target word's embedding as the loss function, instead of softmax cross-entropy?
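To make the two options concrete, here is a minimal sketch of both losses for a single decoder step. Everything in it is an assumption for illustration: the three-word vocabulary, the 2-d embedding values, and the helper names (`cross_entropy_loss`, `embedding_distance_loss`, `nearest_word`) are made up, and squared Euclidean distance is just one possible choice of "distance".

```python
import math

# Hypothetical toy vocabulary with made-up 2-d embeddings (illustration only).
EMBEDDINGS = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
}
VOCAB = list(EMBEDDINGS)


def softmax(logits):
    """Turn one logit per vocabulary word into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_loss(logits, target_word):
    """Negative log-probability of the target word under softmax(logits)."""
    probs = softmax(logits)
    return -math.log(probs[VOCAB.index(target_word)])


def embedding_distance_loss(predicted_vector, target_word):
    """Squared Euclidean distance from the predicted vector to the target embedding."""
    target = EMBEDDINGS[target_word]
    return sum((p - t) ** 2 for p, t in zip(predicted_vector, target))


def nearest_word(predicted_vector):
    """Decode a predicted vector by picking the closest embedding in the vocabulary."""
    return min(VOCAB, key=lambda w: embedding_distance_loss(predicted_vector, w))


# Option A: the decoder outputs one logit per vocabulary word.
print(cross_entropy_loss([2.0, 1.0, 0.1], "cat"))

# Option B: the decoder outputs a vector in embedding space.
print(embedding_distance_loss([0.1, 0.8], "car"))
print(nearest_word([0.1, 0.8]))
```

Note the difference in what the decoder has to output: with cross-entropy it produces a score for every word in the vocabulary, while with the distance loss it produces a single vector and you need a nearest-neighbour lookup such as `nearest_word` to turn that vector back into a word.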
