baba
Baba is LLM: Reasoning in a Game with Dynamic Rules
van Wetten, Fien, Plaat, Aske, van Duijn, Max
Large language models (LLMs) are known to perform well on language tasks, but struggle with reasoning tasks. This paper explores the ability of LLMs to play the 2D puzzle game Baba is You, in which players manipulate rules by rearranging text blocks that define object properties. Given that this rule-manipulation relies on language abilities and reasoning, it is a compelling challenge for LLMs. Six LLMs are evaluated using different prompt types, including (1) simple, (2) rule-extended and (3) action-extended prompts. In addition, two models (Mistral, OLMo) are finetuned using textual and structural data from the game. Results show that while larger models (particularly GPT-4o) perform better in reasoning and puzzle solving, smaller unadapted models struggle to recognize game mechanics or apply rule changes. Finetuning improves the ability to analyze the game levels, but does not significantly improve solution formulation. We conclude that even for state-of-the-art and finetuned LLMs, reasoning about dynamic rule changes is difficult (specifically, understanding the use-mention distinction). The results provide insights into the applicability of LLMs to complex problem-solving tasks and highlight the suitability of games with dynamically changing rules for testing reasoning and reflection by LLMs.
- North America > Montserrat (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Europe > Switzerland (0.04)
Baba Is AI: Break the Rules to Beat the Benchmark
Cloos, Nathan, Jens, Meagan, Naim, Michelangelo, Kuo, Yen-Ling, Cases, Ignacio, Barbu, Andrei, Cueva, Christopher J.
Humans solve problems by following existing rules and procedures, and also by leaps of creativity to redefine those rules and objectives. To probe these abilities, we developed a new benchmark based on the game Baba Is You where an agent manipulates both objects in the environment and rules, represented by movable tiles with words written on them, to reach a specified goal and win the game. We test three state-of-the-art multi-modal large language models (OpenAI GPT-4o, Google Gemini-1.5-Pro and Gemini-1.5-Flash) and find that they fail dramatically when generalization requires that the rules of the game must be manipulated and combined.
- Europe > Austria > Vienna (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Virginia (0.04)
- Asia > Middle East > Jordan (0.04)
A fuzzy logic-based stabilization system for a flying robot, with an embedded energy harvester and a visual decision-making system
Baba, Abdullatif, Alothman, Basel
"Smart cities" is the trendy rubric for modern urban projects that require innovative ideas to improve many aspects of daily life. In this context, a new application is presented here for inspecting and continuously maintaining public roads: a flying robot that repaints the partially erased parts of sidewalk edges, which are usually painted in two different colors, assumed here to be black and white. The first contribution of this paper is a fuzzy-logic-based stabilization system for an octocopter serving as a liquid transporter that could be equipped with a robot arm. The second contribution is the design of an embedded energy harvester for the flying robot to improve the management of available power sources. Finally, we present a complementary heuristic study clarifying the main concepts behind a computer vision-based decision-making system.
- Asia > Middle East > Kuwait (0.05)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Asia > Middle East > Republic of Türkiye > Ankara Province > Ankara (0.04)
- Transportation > Infrastructure & Services (1.00)
- Energy > Power Industry (1.00)
- Transportation > Air (0.94)
Keke AI Competition: Solving puzzle levels in a dynamically changing mechanic space
Abstract: The Keke AI Competition introduces an artificial agent competition for the game Baba is You - a Sokoban-like puzzle game where players can create rules that influence the mechanics of the game. Altering a rule can cause temporary or permanent effects for the rest of the level that could be part of the solution space. The nature of these dynamic rules and the deterministic aspect of the game creates a challenge for AI to adapt to a variety of mechanic combinations in order to solve a level. With the increasing depth and complexity of puzzle games comes the increasing need for intelligent solvers for these games. For example, in the puzzle game Sokoban, a player must push each crate to designated positions on the map in order to solve the puzzle. [...] is Hard To Build, Monument Valley, Braid, VVVVVV) with player-controlled dynamic mechanics that can temporarily [...]
- North America > United States > New York > Richmond County > New York City (0.04)
- North America > United States > New York > Queens County > New York City (0.04)
- North America > United States > New York > New York County > New York City (0.04)
5 Video Games You've Never Heard Of But You Should Be Playing
Big-ticket games like The Division 2 and Sekiro are great, but sometimes you might be in the mood for something a little different -- and maybe a little cheaper, too. Thankfully, we're living in the golden age of indie video games that you can download right to your console or PC for a pittance. Puzzle games are more fun when you get to break the rules. Baba Is You developer Hempuli Oy understands that, because Baba Is You is all about changing the rules of the game in your favor to win. The setup is simple: players take control of Baba, a cute little 2D animal, and try to reach a flag.
Sekiro, Baba Is You and the politics of video game difficulty
As a one-armed orphan – a disability that you might think would disqualify him from the opportunity to work as a lone assassin in 16th-century Japan – Sekiro is well acquainted with disadvantage. Still, a smooth sea never made a skilful mariner, as they used to say, and these physical and psychological handicaps have only served to strengthen this shinobi, who, with a variety of terrifying prosthetics, must now avenge his fallen master by taking down the Ashina clan. Up close, this is grindcore game-making, in which you are forced to watch the lolling of your victims' astonished mouths as you trace a katana across their necks. This world of blood, fire and pitter-patter footsteps across bamboo rooftops calls to mind Toshiya Fujita's Lady Snowblood or Akira Kurosawa's Sanjuro in both theme and body count. But in its moments of exquisite pause, it's also a game of refined cinematic style, the traumatised ninja silhouetted against a flaring sunset, while the reeds rustle and soothe.
- Asia > Japan (0.25)
- North America > United States > District of Columbia > Washington (0.05)
Toyota's Cue 3 robot can't slam dunk or even dribble, but it shoots a mean 3-pointer
It can't dribble, let alone slam dunk, but Toyota's basketball robot hardly ever misses a free throw or a 3-pointer. The 207-centimeter-tall (6 feet 10-inches) machine made five of eight 3-point shots in a demonstration in a Tokyo suburb Monday, a ratio its engineers say is worse than usual. Toyota Motor Corp.'s robot, called Cue 3, computes a three-dimensional image where the basket is, using sensors on its torso, and adjusts motors inside its arm and knees to give the shot the right angle and propulsion for a swish. Efforts in developing human-shaped robots underline a global shift in robotics use from pre-programmed mechanical arms in limited situations like factories to functioning in the real world with people. The 2017 version of the robot was designed to make free throws.
Stock Forecasting Using AI: This Week's Top 10 Stocks, Stocks Under $10, Aggressive Stocks. Specific Stock Forecasts Based on AI: AMZN, GOOG, AAPL, TSLA, BABA, More
The US dollar had an event-heavy week to start off June. It surged early last week as uncertainty about the Euro arose from political events in Europe and from volatility in Asian markets driven by threats of an immediate trade war between the US and China. On Thursday (May 31), the Euro rebounded as Italy's politicians seemed to have found a resolution to their struggles in forming a new government. On the same day, the Trump administration announced it was putting tariffs on steel and aluminum imports from Canada, Mexico and Europe, strengthening fears of a trade war and causing the US dollar to slump. The US labor indicators highlighted the fundamental strength of the country's economy and helped the US dollar extend gains amid Europe's geopolitical turmoil.
- North America > United States (0.59)
- North America > Mexico (0.28)
- North America > Canada (0.28)
- Banking & Finance > Economy (0.86)
- Government > Foreign Policy (0.85)
- Government > Commerce (0.85)