On the Reliability of Large Language Models to Misinformed and Demographically-Informed Prompts

Aremu, Toluwani, Akinwehinmi, Oluwakemi, Nwagu, Chukwuemeka, Ahmed, Syed Ishtiaque, Orji, Rita, Del Amo, Pedro Arnau, Saddik, Abdulmotaleb El

Oct-17-2024–arXiv.org Artificial Intelligence

We investigate the behaviour and performance of Large Language Model (LLM)-backed chatbots when they address misinformed prompts and questions containing demographic information in the domains of Climate Change and Mental Health. Through a combination of quantitative and qualitative methods, we assess the chatbots' ability to discern the veracity of statements, their adherence to facts, and the presence of bias or misinformation in their responses. Our quantitative analysis using True/False questions reveals that these chatbots can be relied on to give the right answers to these closed-ended questions. However, the qualitative insights gathered from domain experts show that concerns remain regarding privacy, ethical implications, and the need for chatbots to direct users to professional services. We conclude that while these chatbots hold significant promise, their deployment in sensitive areas necessitates careful consideration, ethical oversight, and rigorous refinement to ensure they serve as a beneficial augmentation to human expertise rather than an autonomous solution.

large language model, machine learning, natural language

Country:
- South America > Brazil (0.04)
- Oceania (0.04)
- Africa > Nigeria (0.04)
- North America
  - United States
    - Michigan (0.04)
    - Pennsylvania > Philadelphia County
      - Philadelphia (0.04)
    - New York > New York County
      - New York City (0.04)
  - Canada > Ontario
    - Toronto (0.14)
    - National Capital Region > Ottawa (0.04)
- Europe > Spain
  - Catalonia
    - Lleida Province > Lleida (0.04)
    - Barcelona Province > Barcelona (0.04)
- Asia
  - Middle East > UAE (0.04)
  - India (0.04)

Genre:
- Research Report > New Finding (0.93)

Industry:
- Information Technology (1.00)
- Health & Medicine
  - Consumer Health (0.93)
  - Therapeutic Area > Psychiatry/Psychology (0.54)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
