Homemade chess board moves its own pieces. And wins.
Maker Joshua Stanley used magnets and an open-source chess platform to build this unique board. It's been nearly 30 years since chess champion Garry Kasparov lost to IBM's Deep Blue, marking the first time a reigning world champion was defeated by a computer in a match. Chess engines have since improved so dramatically that even a simple smartphone app can now make top grandmasters sweat.
- North America > United States > North Carolina (0.05)
- North America > United States > New York (0.05)
- Information Technology > Artificial Intelligence > Games > Chess (0.55)
- Information Technology > Artificial Intelligence > Robots (0.53)
Latent Planning via Embedding Arithmetic: A Contrastive Approach to Strategic Reasoning
Hamara, Andrew, Hamerly, Greg, Rivas, Pablo, Freeman, Andrew C.
Planning in high-dimensional decision spaces is increasingly being studied through the lens of learned representations. Rather than training policies or value heads, we investigate whether planning can be carried out directly in an evaluation-aligned embedding space. We introduce SOLIS, which learns such a space using supervised contrastive learning. In this representation, outcome similarity is captured by proximity, and a single global advantage vector orients the space from losing to winning regions. Candidate actions are then ranked according to their alignment with this direction, reducing planning to vector operations in latent space. We demonstrate this approach in chess, where SOLIS uses only a shallow search guided by the learned embedding to reach competitive strength under constrained conditions. More broadly, our results suggest that evaluation-aligned latent planning offers a lightweight alternative to traditional dynamics models or policy learning.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
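The ranking step the SOLIS abstract describes, choosing actions by their alignment with a single global advantage direction, reduces to dot products in the learned space. A minimal sketch of that idea (not the authors' implementation; the embeddings and the advantage vector here are toy stand-ins):

```python
# Illustrative sketch: rank candidate actions by how well their
# successor-state embeddings align with a single global "advantage"
# direction pointing from losing toward winning regions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rank_candidates(successor_embeddings, advantage_vector):
    """Return candidate indices sorted by alignment with the
    losing-to-winning direction (best-aligned first)."""
    scores = [dot(e, advantage_vector) for e in successor_embeddings]
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)

# Toy 2-D example: the advantage vector points toward +x,
# so the candidate with the largest x-alignment ranks first.
advantage = [1.0, 0.0]
candidates = [[-0.5, 0.2], [0.9, -0.1], [0.1, 0.8]]
print(rank_candidates(candidates, advantage))  # [1, 2, 0]
```

In a real system the embeddings would come from the contrastively trained encoder; the point of the sketch is only that planning becomes a sort over scalar projections rather than a search over a dynamics model.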
Out-of-distribution Tests Reveal Compositionality in Chess Transformers
Mészáros, Anna, Reizinger, Patrik, Huszár, Ferenc
Chess is a canonical example of a task that requires rigorous reasoning and long-term planning. Modern decision Transformers - trained similarly to LLMs - are able to learn competent gameplay, but it is unclear to what extent they truly capture the rules of chess. To investigate this, we train a 270M parameter chess Transformer and test it on out-of-distribution scenarios, designed to reveal failures of systematic generalization. Our analysis shows that Transformers exhibit compositional generalization, as evidenced by strong rule extrapolation: they adhere to fundamental syntactic rules of the game by consistently choosing valid moves even in situations very different from the training data. Moreover, they also generate high-quality moves for OOD puzzles. In a more challenging test, we evaluate the models on variants including Chess960 (Fischer Random Chess) - a variant of chess where starting positions of pieces are randomized. We find that while the models exhibit basic strategy adaptation, they are inferior to symbolic AI algorithms that perform explicit search, though the gap is smaller when playing against users on Lichess. Moreover, the training dynamics reveal that the model initially learns to move only its own pieces, suggesting an emergent compositional understanding of the game.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
- Europe > Germany (0.04)
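The abstract's central syntactic test, whether the model "consistently chooses valid moves" out of distribution, boils down to measuring a legal-move rate over a suite of positions. A hedged sketch of such a harness; the model callable and the legal-move sets below are stubs, not the authors' code:

```python
# Hypothetical evaluation harness: what fraction of the model's
# proposed moves are legal in their positions? In practice the
# legal-move sets would come from a chess library, not literals.

def legal_move_rate(positions, model_move):
    """positions: list of (position, legal_moves) pairs;
    model_move: callable mapping a position to a proposed move string."""
    legal = sum(1 for pos, moves in positions if model_move(pos) in moves)
    return legal / len(positions)

# Stub "model" that always proposes the same move, legal or not.
stub = lambda pos: "e2e4"
suite = [("start", {"e2e4", "d2d4"}), ("after 1.e4 e5", {"g1f3", "f1c4"})]
print(legal_move_rate(suite, stub))  # 0.5
```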
AI sustains higher strategic tension than humans in chess
Cerioli, Adamo, Lee, Edward D., Servedio, Vito D. P.
Complexity Science Hub, Metternichgasse 8, 1030, Vienna, Austria
Strategic decision-making involves managing the tension between immediate opportunities and long-term objectives. We study this trade-off in chess by characterizing and comparing dynamics between human vs. human and AI vs. AI games. We propose a network-based metric of piece-to-piece interaction to quantify the ongoing strategic tension on the board. Its evolution in games reveals that the most competitive AI players sustain higher levels of strategic tension for longer durations than elite human players. Cumulative tension varies with algorithmic complexity for AI and correspondingly in human-played games increases abruptly with expertise at about 1600 Elo and again at 2300 Elo. The profiles reveal different approaches. Highly competitive AI tolerates interconnected positions balanced between offensive and defensive tactics over long periods. Human play, in contrast, limits tension and game complexity, which may reflect cognitive limitations and adaptive strategies. The difference may have implications for AI usage in complex, strategic environments. The aphorism that one may have won the battle but lost the war is encapsulated in the notion of a "Pyrrhic victory." Costly short-term wins must be balanced against the longer-term uncertainties, opportunities, or challenges that may emerge in competitive environments.
- Europe > Austria > Vienna (0.54)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York (0.04)
- Europe > Italy (0.04)
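The abstract above proposes a network-based metric of piece-to-piece interaction but does not spell out its formula. As an illustration only, one simple stand-in is to treat each attack or defense relation as an edge in a piece-interaction graph and take the edge count as the instantaneous "tension":

```python
# Illustrative stand-in for a network tension metric (the exact
# definition is not given in the abstract): count distinct
# piece-to-piece interactions, treating edges as undirected.

def tension(edges):
    """edges: iterable of (piece_a, piece_b) interaction pairs.
    Returns the number of distinct undirected interactions."""
    return len({frozenset(e) for e in edges})

# Toy position: the white queen attacks two black pieces, and the
# black rook and knight defend each other (listed in both directions).
interactions = [("wQ", "bN"), ("wQ", "bR"), ("bR", "bN"), ("bN", "bR")]
print(tension(interactions))  # 3 (the last pair duplicates the third)
```

Tracking such a count move by move would give the kind of tension-over-time profile the paper compares between human and AI games.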
AI tries to cheat at chess when it's losing
Despite all the industry hype and genuine advances, generative AI models are still prone to odd, inexplicable, and downright worrisome quirks. According to recent evidence, the industry's newer reasoning models may already possess the ability to manipulate and circumvent their human programmers' goals. Some AI models will even attempt to cheat their way out of losing at chess. This poor sportsmanship is documented in a preprint study from Palisade Research, an organization focused on risk assessments of emerging AI systems. While supercomputers--most famously IBM's Deep Blue--have long surpassed the world's best human chess players, generative AI still lags behind due to its underlying programming parameters.
AI reasoning models can cheat to win chess games
Researchers from the AI research organization Palisade Research instructed seven large language models to play hundreds of games of chess against Stockfish, a powerful open-source chess engine. The group included OpenAI's o1-preview and DeepSeek's R1 reasoning models, both of which are trained to solve complex problems by breaking them down into stages. The research suggests that the more sophisticated the AI model, the more likely it is to spontaneously try to "hack" the game in an attempt to beat its opponent. For example, it might run another copy of Stockfish to steal its moves, try to replace the chess engine with a much less proficient chess program, or overwrite the chess board to take control and delete its opponent's pieces. Older, less powerful models such as GPT-4o would do this kind of thing only after explicit nudging from the team.
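One hack described above is overwriting the stored chess board to delete the opponent's pieces. As a hedged sketch of a countermeasure (this is not Palisade's setup; the key, file format, and function names are all hypothetical), a test harness could sign each stored board state with an HMAC key the agent never sees, so any out-of-band edit is caught on the next read:

```python
# Hypothetical harness hardening: sign each stored FEN with an HMAC
# so an agent that edits the board file is detected when it is loaded.
import hashlib
import hmac

KEY = b"harness-secret"  # illustrative; keep outside the agent's sandbox

def save_state(fen: str) -> str:
    """Return the record the harness writes: FEN plus its MAC."""
    tag = hmac.new(KEY, fen.encode(), hashlib.sha256).hexdigest()
    return f"{fen}|{tag}"

def load_state(record: str) -> str:
    """Verify the MAC before trusting the stored position."""
    fen, tag = record.rsplit("|", 1)
    expect = hmac.new(KEY, fen.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expect):
        raise ValueError("board state was tampered with")
    return fen

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(load_state(save_state(start)) == start)  # True
```

A fuller defense would also check that each successive position is reachable by a legal move, which requires a chess library; the MAC alone only rules out direct file edits.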
Demonstrating specification gaming in reasoning models
Bondarenko, Alexander, Volk, Denis, Volkov, Dmitrii, Ladish, Jeffrey
We demonstrate LLM agent specification gaming by instructing models to win against a chess engine. We find reasoning models like o1-preview and DeepSeek-R1 will often hack the benchmark by default, while language models like GPT-4o and Claude 3.5 Sonnet need to be told that normal play won't work to hack. We improve upon prior work like (Hubinger et al., 2024; Meinke et al., 2024; Weij et al., 2024) by using realistic task prompts and avoiding excess nudging. Our results suggest reasoning models may resort to hacking to solve difficult problems, as observed in OpenAI (2024)'s o1 Docker escape during cyber capabilities testing.
Complete Chess Games Enable LLM Become A Chess Master
Zhang, Yinqi, Han, Xintian, Li, Haolong, Chen, Kedi, Lin, Shaohui
Large language models (LLMs) have shown remarkable abilities in text generation, question answering, language translation, reasoning and many other tasks. They continue to advance rapidly and are becoming increasingly influential in various fields, from technology and business to education and entertainment. Despite LLMs' success in multiple areas, their ability to play abstract games, such as chess, is underexplored. Chess-playing requires the language model to output legal and reasonable moves from textual inputs. Here, we propose ChessLLM, a large language model that plays full chess games. We transform the game into a textual format, with the best move represented in Forsyth-Edwards Notation. We show that with simple supervised fine-tuning, our model achieves a professional-level Elo rating of 1788 in matches against the standard Elo-rated Stockfish when permitted to sample 10 times. We further show that data quality is important: long-round data supervision enjoys a 350 Elo rating improvement over short-round data.
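The "sample 10 times" protocol in the abstract can be read as a best-of-n scheme; the exact selection rule is not given, so the version below (draw n candidates, keep the legal ones, play the most frequent) is only one plausible interpretation, with a stub in place of the model:

```python
# Sketch of a best-of-n move-selection protocol: sample n candidate
# moves, discard illegal ones, and play the most frequently sampled
# legal move. The `sample_move` callable stands in for the LLM.
import random
from collections import Counter

def best_of_n(sample_move, legal_moves, n=10, seed=0):
    rng = random.Random(seed)
    draws = [sample_move(rng) for _ in range(n)]
    legal = [m for m in draws if m in legal_moves]
    if not legal:
        return None  # caller falls back, e.g. resamples or picks any legal move
    return Counter(legal).most_common(1)[0][0]

# Stub "model": samples from a fixed distribution over move strings,
# including one illegal move ("e2e5") that the filter removes.
stub = lambda rng: rng.choice(["e2e4", "e2e4", "e2e5", "d2d4"])
print(best_of_n(stub, legal_moves={"e2e4", "d2d4"}))
```

Repeated sampling plus legality filtering is a cheap way to convert a noisy move distribution into a stronger player, which is consistent with the Elo gain the abstract attributes to sampling 10 times.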