When Tay started its short digital life on March 23, it just wanted to gab and make some new friends on the net. The chatbot, which was created by Microsoft's Research department, greeted the day with an excited tweet that could have come from any teen: "hellooooooo w rld!!!" Within a few hours, though, Tay's optimistic, positive tone had changed. "Hitler was right I hate the jews," it declared in a stream of racist tweets bashing feminism and promoting genocide. Concerned about their bot's rapid radicalization, Tay's creators shut it down after less than 24 hours of existence. Microsoft had unwittingly lowered its burgeoning artificial intelligence into -- to use the parlance of the very people who corrupted it -- a virtual dumpster fire.
As data scientists, we are aware that bias exists in the world. We read up on stories about how cognitive biases can affect decision-making. We know that, for instance, a resume with a white-sounding name will receive a different response than the same resume with a black-sounding name, and that writers of performance reviews use different language to describe contributions by women and men in the workplace. We read stories in the news about ageism in healthcare and racism in mortgage lending. Data scientists are problem solvers at heart, and we love our data and our algorithms that sometimes seem to work like magic, so we may be inclined to try to solve these problems stemming from human bias by turning the decisions over to machines.
CaptionBot did, however, identify other Nazi leaders like Josef Mengele and Joseph Goebbels. Microsoft (MSFT) released CaptionBot a few weeks after its disastrous social experiment with Tay, an automated chat program designed to talk like a teen. In addition to ignoring pictures of Hitler, CaptionBot also seemed to refuse to identify people like Osama bin Laden. Generally speaking, bots are software programs designed to hold conversations with people about data-driven tasks, such as managing schedules or retrieving data and information.