brainteaser
RISCORE: Enhancing In-Context Riddle Solving in Language Models through Context-Reconstructed Example Augmentation
Panagiotopoulos, Ioannis, Filandrianos, Giorgos, Lymperaiou, Maria, Stamou, Giorgos
Riddle-solving requires advanced reasoning skills, pushing LLMs to engage in abstract thinking and creative problem-solving, often revealing limitations in their cognitive abilities. In this paper, we examine the riddle-solving capabilities of LLMs using a multiple-choice format, exploring how different prompting techniques impact performance on riddles that demand diverse reasoning skills. To enhance results, we introduce RISCORE (RIddle Solving with COntext REconstruction), a novel, fully automated prompting method that generates and utilizes contextually reconstructed sentence-based puzzles in conjunction with the original examples to create few-shot exemplars. Our experiments demonstrate that RISCORE significantly improves the performance of language models in both vertical and lateral thinking tasks, surpassing traditional exemplar selection strategies across a variety of few-shot settings.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Research Report > New Finding (1.00)
- Workflow (0.93)
Only people with a high IQ can spot the number '6' in this brainteaser in 10 seconds
A new brain teaser claims that only highly intelligent people can spot the number six in a sea of nines. The goal is to find the designated number within 10 seconds or less, making it important to carefully look through the image. Spotting the number requires quick thinking, and if you can find the number in the allotted timeframe, your level of intelligence is higher than people who take longer. These types of brainteasers can tell you a lot about how you think and view the world and help you develop problem-solving and logical reasoning skills. The picture shows columns of 78 numbers in total, with just one six hidden somewhere among the nines.
- Health & Medicine > Consumer Health (0.56)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.33)
- Health & Medicine > Therapeutic Area > Neurology (0.33)
Can YOU spot the second horse? Only people with high IQs can solve the brainteaser in 10 seconds
A new brain teaser claims that only highly intelligent people can spot a second horse in the majestic animal's painted coat. The picture features a full-grown horse standing in a field and asks viewers to use creative thinking to solve the brainteaser. Spotting the horse requires quick thinking, and if you can find the second horse in 10 seconds or less, your level of intelligence is higher than people who take longer. Solving the puzzle isn't so much about looking and simply seeing it, but about finding a different way to look at it. Set the timer for 10 seconds and try to find the second horse on the brown and white side of the stallion.
- Health & Medicine > Therapeutic Area > Neurology (0.56)
- Health & Medicine > Consumer Health (0.38)
Only people with eagle eyes can solve a new Rubik's cube brainteaser in under 30 seconds
The Rubik's cube is a classic mind game for all ages, challenging players to align a single color on each side. The popular 3D puzzle has been recreated into a brainteaser that shows dozens of cubes that appear identical - but there is an odd one in the bunch. The objective is to spot the cube that does not match in under 30 seconds - but only those with eagle eyes can spot it. The new brainteaser was created by online gaming experts at MrQ who said the puzzle will 'leave even the most eagle-eyed viewers scratching their heads in anguish.' 'It takes the average person 30 seconds to find the odd Rubik's cube out and a whopping one in three admitting to giving up finding the colorful cube completely,' the company shared.
BAMO at SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense
Ansari, Baktash, Rostamkhani, Mohammadmostafa, Eetemadi, Sauleh
This paper outlines our approach to SemEval 2024 Task 9, BRAINTEASER: A Novel Task Defying Common Sense. The task aims to evaluate the ability of language models to think creatively. The dataset comprises multi-choice questions that challenge models to think "outside of the box". We fine-tune 2 models, BERT and RoBERTa Large. Next, we employ a Chain of Thought (CoT) zero-shot prompting approach with 6 large language models, such as GPT-3.5, Mixtral, and Llama2. Finally, we utilize ReConcile, a technique that employs a "round table conference" approach with multiple agents for zero-shot learning, to generate consensus answers among 3 selected language models. Our best method achieves an overall accuracy of 85 percent on the sentence puzzles subtask.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Alaska (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
SemEval-2024 Task 9: BRAINTEASER: A Novel Task Defying Common Sense
Jiang, Yifan, Ilievski, Filip, Ma, Kaixin
While vertical thinking relies on logical and commonsense reasoning, lateral thinking requires systems to defy commonsense associations and overwrite them through unconventional thinking. Lateral thinking has been shown to be challenging for current models but has received little attention. A recent benchmark, BRAINTEASER, aims to evaluate current models' lateral thinking ability in a zero-shot setting. In this paper, we split the original benchmark to also support a fine-tuning setting and present SemEval Task 9: BRAINTEASER(S), the first task at this competition designed to test the system's reasoning and lateral thinking ability. As a popular task, BRAINTEASER(S)'s two subtasks receive 483 team submissions from 182 participants during the competition. This paper provides a fine-grained system analysis of the competition results, together with a reflection on what this means for the ability of the systems to reason laterally. We hope that the BRAINTEASER(S) subtasks and findings in this paper can stimulate future work on lateral thinking and robust reasoning by computational models.
- North America > United States > California (0.14)
- North America > Mexico > Mexico City > Mexico City (0.06)
- North America > United States > Washington > King County > Bellevue (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)
Can YOU spot the robot hidden in Santa's toy factory?
- Europe > Spain (0.26)
- North America > United States (0.06)
- Transportation > Passenger (0.57)
- Transportation > Ground > Rail (0.36)