It's possible to reverse-engineer AI chatbots to spout nonsense, smut or sensitive information


Machine-learning chatbot systems can be manipulated into saying whatever a user wants, according to boffins from Michigan State University and TAL AI Lab.

"There exists a dark side of these models – due to the vulnerability of neural networks, a neural dialogue model can be manipulated by users to say what they want, which brings in concerns about the security of practical chatbot services," the researchers wrote in a paper (PDF) published on arXiv.

They crafted a "Reverse Dialogue Generator" (RDG) that, given a particular output, spits out a range of inputs likely to produce it. Text-based models normally work the other way around: they are handed an input and generate an output. For example, given the sentence "Hi, how are you?", a model learns to respond with something like "Fine, thank you" because that is one of the most common replies to that question in its training data.
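To make the forward-versus-reverse distinction concrete, here is a toy sketch in Python. It is not the researchers' RDG, which is a neural model; it swaps in a plain frequency table. The training pairs and the function names `train_forward` and `reverse_lookup` are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: (input, reply) pairs, not from the paper.
TRAINING_PAIRS = [
    ("Hi, how are you?", "Fine, thank you"),
    ("Hi, how are you?", "Fine, thank you"),
    ("Hi, how are you?", "Not bad"),
    ("How's it going?", "Fine, thank you"),
    ("What's your name?", "I'm a chatbot"),
]


def train_forward(pairs):
    """Forward direction: map each input to its most common reply."""
    counts = defaultdict(Counter)
    for prompt, reply in pairs:
        counts[prompt][reply] += 1
    return {prompt: c.most_common(1)[0][0] for prompt, c in counts.items()}


def reverse_lookup(forward_model, target_reply):
    """Reverse direction: list the inputs that make the model emit target_reply."""
    return sorted(p for p, r in forward_model.items() if r == target_reply)


forward_model = train_forward(TRAINING_PAIRS)
print(forward_model["Hi, how are you?"])
print(reverse_lookup(forward_model, "Fine, thank you"))
```

The forward call answers "what will the bot say to this?", while the reverse call answers "what do I need to type to get the bot to say that?". A real attack has to search a far larger input space than a lookup table, which is why the researchers trained a generator to do it.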