Researchers are working toward more transparent language models


The most sophisticated AI language models, like OpenAI's GPT-3, can perform tasks ranging from generating code to drafting marketing copy. But many of their underlying mechanisms remain opaque, making these models prone to unpredictable -- and sometimes toxic -- behavior. As recent research has shown, even careful calibration can't always prevent language models from making sexist associations or endorsing conspiracies. Newly proposed explainability techniques promise to make language models more transparent. While they aren't silver bullets, they could be the building blocks for less problematic models -- or at the very least, models that can explain their reasoning.