Conversing with chatbots: DialoGPT
In a previous module, we examined language models and explored n-gram and neural approaches. We found that n-gram models generally improve as N grows, although the choice of N may be constrained by available compute resources. There was also the concern that n-grams absent from the training corpus have no representation. Recent neural approaches, on the other hand, apply subword tokenization methods such as Byte Pair Encoding and WordPiece; they resolve these issues with n-gram language models and show impressive results. We also traced the development of neural language models from feedforward networks, which rely on word embeddings and a fixed input length, to recurrent neural networks, which allow variable-length input but struggle to capture long-term dependencies.
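The concern about n-grams missing from the training corpus can be illustrated with a minimal sketch. It assumes a toy corpus and an unsmoothed maximum-likelihood bigram model; the function names `train_bigram_counts` and `bigram_prob` are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def train_bigram_counts(tokens):
    """Count bigrams and the contexts (first words) they start from."""
    contexts = Counter(tokens[:-1])             # every token that has a successor
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent token pairs
    return contexts, bigrams


def bigram_prob(contexts, bigrams, prev, word):
    """Unsmoothed estimate P(word | prev) = count(prev, word) / count(prev)."""
    if contexts[prev] == 0:
        return 0.0  # the context itself was never observed
    return bigrams[(prev, word)] / contexts[prev]


# Toy corpus, assumed purely for illustration.
corpus = "the cat sat on the mat".split()
contexts, bigrams = train_bigram_counts(corpus)

seen = bigram_prob(contexts, bigrams, "the", "cat")    # bigram occurs in the corpus
unseen = bigram_prob(contexts, bigrams, "the", "dog")  # bigram never occurs
print(seen, unseen)
```

Without smoothing, the unseen bigram receives probability zero, so any sentence containing it is scored as impossible. Subword tokenization sidesteps part of this problem by breaking rare or unknown words into smaller units that were seen during training.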
Jul-23-2021, 17:40:12 GMT