Goto

Collaborating Authors

 rubik


Brothers build a robot to solve Rubik's cubes in record-setting time

Popular Science

Technology Robots Brothers build a robot to solve Rubik's cubes in record-setting time The robot completed the puzzle in just 45.3 seconds, breaking its own record of 55 seconds made just moments earlier. The Revenger set a world record. Breakthroughs, discoveries, and DIY tips sent six days a week. A pair of brothers in the U.K. have officially broken the Guinness World Record for the fastest time solving a four-by-four Rubik's Cube with a robot. Their DIY machine, which the brothers call The Revenger, completed the puzzle in only 45.3 seconds.


Learning Shortest Paths with Generative Flow Networks

Morozov, Nikita, Maksimov, Ian, Tiapkin, Daniil, Samsonov, Sergey

arXiv.org Machine Learning

In this paper, we present a novel learning framework for finding shortest paths in graphs utilizing Generative Flow Networks (GFlowNets). First, we examine theoretical properties of GFlowNets in non-acyclic environments in relation to shortest paths. We prove that, if the total flow is minimized, forward and backward policies traverse the environment graph exclusively along shortest paths between the initial and terminal states. Building on this result, we show that the pathfinding problem in an arbitrary graph can be solved by training a non-acyclic GFlowNet with flow regularization. We experimentally demonstrate the performance of our method in pathfinding in permutation environments and in solving Rubik's Cubes. For the latter problem, our approach shows competitive results with state-of-the-art machine learning approaches designed specifically for this task in terms of the solution length, while requiring smaller search budget at test-time.







Rubik's Cube: High-Order Channel Interactions with a Hierarchical Receptive Field

Neural Information Processing Systems

Image restoration techniques, spanning from the convolution to the transformer paradigm, have demonstrated robust spatial representation capabilities to deliver high-quality performance.Yet, many of these methods, such as convolution and the Feed Forward Network (FFN) structure of transformers, primarily leverage the basic first-order channel interactions and have not maximized the potential benefits of higher-order modeling. To address this limitation, our research dives into understanding relationships within the channel dimension and introduces a simple yet efficient, high-order channel-wise operator tailored for image restoration. Instead of merely mimicking high-order spatial interaction, our approach offers several added benefits: Efficiency: It adheres to the zero-FLOP and zero-parameter principle, using a spatial-shifting mechanism across channel-wise groups. Simplicity: It turns the favorable channel interaction and aggregation capabilities into element-wise multiplications and convolution units with $1 \times 1$ kernel. Our new formulation expands the first-order channel-wise interactions seen in previous works to arbitrary high orders, generating a hierarchical receptive field akin to a Rubik's cube through the combined action of shifting and interactions. Furthermore, our proposed Rubik's cube convolution is a flexible operator that can be incorporated into existing image restoration networks, serving as a drop-in replacement for the standard convolution unit with fewer parameters overhead. We conducted experiments across various low-level vision tasks, including image denoising, low-light image enhancement, guided image super-resolution, and image de-blurring. The results consistently demonstrate that our Rubik's cube operator enhances performance across all tasks.


Universality in Collective Intelligence on the Rubik's Cube

Krakauer, David, Kardeş, Gülce, Grochow, Joshua

arXiv.org Artificial Intelligence

Progress in understanding expert performance is limited by the scarcity of quantitative data on long-term knowledge acquisition and deployment. Here we use the Rubik's Cube as a cognitive model system existing at the intersection of puzzle solving, skill learning, expert knowledge, cultural transmission, and group theory. By studying competitive cube communities, we find evidence for universality in the collective learning of the Rubik's Cube in both sighted and blindfolded conditions: expert performance follows exponential progress curves whose parameters reflect the delayed acquisition of algorithms that shorten solution paths. Blindfold solves form a distinct problem class from sighted solves and are constrained not only by expert knowledge but also by the skill improvements required to overcome short-term memory bottlenecks, a constraint shared with blindfold chess. Cognitive artifacts such as the Rubik's Cube help solvers navigate an otherwise enormous mathematical state space. In doing so, they sustain collective intelligence by integrating communal knowledge stores with individual expertise and skill, illustrating how expertise can, in practice, continue to deepen over the course of a single lifetime.


Imagine in Space: Exploring the Frontier of Spatial Intelligence and Reasoning Efficiency in Vision Language Models

Lian, Xiaoxing, Yang, Aidong, Zhu, Jun, Wang, Peng, Zhang, Yue

arXiv.org Artificial Intelligence

Large language models (LLMs) and vision language models (VLMs), such as DeepSeek R1,OpenAI o3, and Gemini 2.5 Pro, have demonstrated remarkable reasoning capabilities across logical inference, problem solving, and decision making. However, spatial reasoning:a fundamental component of human cognition that includes mental rotation, navigation, and spatial relationship comprehension remains a significant challenge for current advanced VLMs. We hypothesize that imagination, the internal simulation of spatial states, is the dominant reasoning mechanism within a spatial world model. To test this hypothesis and systematically probe current VLM spatial reasoning mechanisms, we introduce SpatiaLite, a fully synthetic benchmark that jointly measures spatial reasoning accuracy and reasoning efficiency. Comprehensive experiments reveal three key findings. First, advanced VLMs predominantly rely on linguistic representations for reasoning and imagination, resulting in significant deficiencies on visual centric tasks that demand perceptual spatial relations and 3D geometry transformations such as mental rotation or projection prediction. Second, advanced VLMs exhibit severe inefficiency in their current spatial reasoning mechanisms, with token usage growing rapidly as transformation complexity increases. Third, we propose an Imagery Driven Framework (IDF) for data synthesis and training, which can implicitly construct an internal world model that is critical for spatial reasoning in VLMs. Building on SpatiaLite, this work delineates the spatial reasoning limits and patterns of advanced VLMs, identifies key shortcomings, and informs future advances