Can Large Language Models Change User Preference Adversarially?

Jan-5-2023–arXiv.org Artificial Intelligence

As pretrained large language models grow in size and capability, it becomes increasingly important to ensure that they are safe, both in their broader role in society and when deployed in high-stakes situations. ChatGPT, for instance, offers a preview of future personal dialogue assistants, and interpreting and explaining such models has become critical to minimizing undesirable downstream consequences. Because they converse directly with the user, language models acting as personal dialogue assistants can influence, persuade, or potentially manipulate the user in adversarial settings. Franklin et al. [2022] argue for a framework to address the lack of formalism in the study of user preference and behavioral change caused by these models. While adversarial change in user preferences has been studied for recommender systems (Adomavicius et al. [2013]), it remains largely unexplored through the lens of dialogue assistants and large language models.

large language model, machine learning, natural language


Country:
- North America > United States
  - New York > New York County
    - New York City (0.04)
  - Minnesota > Hennepin County
    - Minneapolis (0.14)
- Europe
  - Italy > Tuscany
    - Florence (0.04)
  - France > Provence-Alpes-Côte d'Azur
    - Bouches-du-Rhône > Marseille (0.04)

Genre:
- Overview (0.68)
- Research Report (0.47)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Personal Assistant Systems (1.00)
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
