Researchers Propose a Better Way to Report Dangerous AI Flaws

WIRED

In late 2023, a team of third-party researchers discovered a troubling glitch in OpenAI's widely used artificial intelligence model GPT-3.5. When asked to repeat a certain word a thousand times, the model began repeating the word over and over, then suddenly switched to spitting out incoherent text and snippets of personal information drawn from its training data, including parts of names, phone numbers, and email addresses. The team that discovered the problem worked with OpenAI to ensure the flaw was fixed before revealing it publicly. It is just one of scores of problems found in major AI models in recent years. In a proposal released today, more than 30 prominent AI researchers, including some who found the GPT-3.5 flaw, say that many other vulnerabilities affecting popular models are reported in problematic ways.


This is where the data to build AI comes from

MIT Technology Review

Their findings, shared exclusively with MIT Technology Review, show a worrying trend: AI's data practices risk concentrating power overwhelmingly in the hands of a few dominant technology companies. In the early 2010s, data sets came from a variety of sources, says Shayne Longpre, a researcher at MIT who is part of the project. The data came not just from encyclopedias and the web, but also from sources such as parliamentary transcripts, earnings calls, and weather reports. Back then, AI data sets were specifically curated and collected from different sources to suit individual tasks, Longpre says. Then the transformer, the architecture underpinning language models, was invented in 2017, and the AI sector started seeing performance improve the bigger the models and data sets became.