The Dark Side of Human Feedback: Poisoning Large Language Models via User Inputs
