Rolling the AI dice: Novel replication and stochastic parrots -- Navigate AI
Things get dicey if we ask NLP models to fend entirely for themselves in unattended dialogue with humans. At its recent I/O developer conference, Google showcased its Language Model for Dialogue Applications (LaMDA), a (very) large language model trained on dialogue and optimised to reproduce the "sensibleness", "specificity" and "interestingness" of human conversation. The demo is super-impressive – we've moved a long way on from GPT-2, the leading language model of 2019 (only two years ago!), which frequently produced nonsensical content because it didn't have an "understanding" of the world it was describing (I remember fires breaking out underwater in one of its generated stories). Models like LaMDA are, nonetheless, essentially still probabilistic models that generate new text word by word, enslaved to the patterns, no matter how nuanced, seen in their training data – what I call "novel replication", but what others have less charitably described as "stochastic parrots". Although top-down statistical analyses and careful cleansing of the training data can minimise biases and instances of hate speech, these models can still get things monumentally wrong.
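To make "generating new text word by word" concrete, here is a deliberately tiny sketch. It is not how LaMDA or GPT-2 work internally (those are large neural networks operating on tokens); it is a toy bigram model, and the names `train_bigrams` and `generate` and the one-sentence corpus are made up for illustration. What it shares with the big models is the basic loop: pick the next word at random, weighted by what followed the previous word in the training data.

```python
import random
from collections import defaultdict


def train_bigrams(text):
    """Record, for each word, every word that followed it in the training text."""
    words = text.split()
    followers = defaultdict(list)
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, max_words=10, seed=0):
    """Build a sentence one word at a time by sampling from observed followers."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    output = [start]
    for _ in range(max_words - 1):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        # Words that followed more often appear more often in the list,
        # so a uniform choice here is a frequency-weighted choice.
        output.append(rng.choice(candidates))
    return " ".join(output)


# Hypothetical one-sentence "training set", purely for illustration.
corpus = "the fire broke out in the kitchen and the fire spread to the roof"
model = train_bigrams(corpus)
print(generate(model, "the"))
```

Every pair of neighbouring words in the output was seen somewhere in the corpus, yet the sentence as a whole may be new, and nothing stops it from being nonsense. That is the "novel replication" idea in miniature.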
Oct-9-2021, 21:05:35 GMT