BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense
Baktash Ansari, Mohammadmostafa Rostamkhani, Sauleh Eetemadi
arXiv.org Artificial Intelligence
This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. The task aims to evaluate the ability of language models to think creatively. The dataset comprises multiple-choice questions that challenge models to think "outside the box". We fine-tune two models, BERT and RoBERTa Large. Next, we employ a Chain of Thought (CoT) zero-shot prompting approach with six large language models, including GPT-3.5, Mixtral, and Llama2. Finally, we utilize ReConcile, a technique that employs a "round table conference" of multiple agents for zero-shot learning, to generate consensus answers among three selected language models. Our best method achieves an overall accuracy of 85 percent on the sentence puzzles subtask.
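The consensus step in ReConcile can be illustrated as a majority vote over the answers returned by the participating agents. The sketch below is a minimal illustration under that assumption, not the authors' implementation; the agent names and answers are placeholder data:

```python
from collections import Counter


def consensus_answer(answers: dict) -> str:
    """Return the multiple-choice option most agents agree on.

    `answers` maps agent name -> chosen option; ties are broken by
    the order in which options were first seen.
    """
    counts = Counter(answers.values())
    return counts.most_common(1)[0][0]


# Hypothetical round-table outputs from three agents on one puzzle
round_table = {"GPT-3.5": "B", "Mixtral": "B", "Llama2": "C"}
print(consensus_answer(round_table))  # -> "B"
```

In the full technique, agents also see each other's explanations and can revise their answers over several rounds before the final vote; this sketch shows only the final aggregation step.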
Jun-7-2024