Assessing the Impact of Prompting, Persona, and Chain of Thought Methods on ChatGPT's Arithmetic Capabilities

Chen, Yuhao; Wong, Chloe; Yang, Hanwen; Aguenza, Juan; Bhujangari, Sai; Vu, Benthan; Lei, Xun; Prasad, Amisha; Fluss, Manny; Phuong, Eric; Liu, Minghao; Davis, James

arXiv.org Artificial Intelligence 

Large language models, such as ChatGPT, represent a transformative development in the field of Machine Learning. Demonstrating remarkable proficiency in generating coherent responses, these models effectively address intricate challenges, including mathematical problem-solving. To improve accuracy, researchers and practitioners have explored various methodologies, with prompting, persona, and Chain of Thought emerging as significant strategies aimed at augmenting ChatGPT's performance. This study's primary objective was to benchmark ChatGPT's default arithmetic capabilities and compare them with its performance when the prompting, persona, and Chain of Thought methods are applied. Prompting involves providing specific instructions or questions to guide a language model's response generation. Persona refers to the creation of a fictional character with a distinct personality, whose perspective is used to generate responses. Chain of Thought involves the sequential connection of ideas or concepts to guide response generation. To assess the arithmetic capabilities of ChatGPT, we used three distinct datasets: MATH [1], GSM8K [2], and MMLU [3]. Each of these datasets presents a range of mathematical problems across multiple domains and difficulty levels.
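The three strategies described above can be sketched as prompt templates. This is a minimal illustration only; the wording of each template is an assumption for demonstration, not the exact prompts used in the study:

```python
# Illustrative sketch: three ways to frame the same arithmetic question
# before sending it to a chat model. The template wording is hypothetical.

def baseline_prompt(question: str) -> str:
    # Default behavior: ask the question directly, with no extra framing.
    return question

def persona_prompt(question: str) -> str:
    # Persona: prepend a fictional character whose perspective shapes the answer.
    return ("You are a meticulous mathematics professor who double-checks "
            f"every calculation. Solve the following problem: {question}")

def chain_of_thought_prompt(question: str) -> str:
    # Chain of Thought: instruct the model to connect reasoning steps
    # sequentially before committing to a final answer.
    return f"{question}\nLet's think step by step, then state the final answer."

question = "What is 17 * 24?"
for build in (baseline_prompt, persona_prompt, chain_of_thought_prompt):
    print(build(question))
```

Each template would be sent as the user message in a chat-completion request; only the framing around the fixed question changes between conditions.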