Blender Bot -- Part 3: The Many Architectures
We have been looking into Facebook's open-sourced conversational offering, Blender Bot. In Part-1 we went over in detail about the DataSets used in the pre-training and fine-tuning of it and the failure cases as well as limitations of Blender. And in Part-2 we studied the more generic problem setting of "Multi-Sentence Scoring", the Transformer architectures used for such a task and learnt about the Poly-Encoders in particular -- which will be used to provide the encoder representations in Blender. In this 3rd and final part, we return from our respite with Poly-Encoders, back to Blender. We shall go over the different Model Architectures, their respective training objectives, the Evaluation methods and performance of Blender in comparison to Meena.
Jul-3-2020, 01:45:45 GMT
- Technology: