Hierarchical Reasoning Models: Perspectives and Misconceptions
Ge, Renee, Liao, Qianli, Poggio, Tomaso
Transformers have demonstrated remarkable performance in natural language processing and related domains, though they largely focus on sequential, autoregressive next-token prediction. An emerging exploration in this direction is the Hierarchical Reasoning Model (Wang et al., 2025a), which introduces a novel type of recurrent reasoning in the latent space of transformers, achieving remarkable performance on a wide range of 2D reasoning tasks. Despite the promising results, this line of models is still at an early stage and calls for in-depth investigation. In this work, we review this class of models, examine key design choices, test alternative variants, and clarify common misconceptions. The Transformer architecture (Vaswani et al., 2017) has become the foundation of modern large language models (LLMs), powering systems such as BERT (Devlin et al., 2019), PaLM (Chowdhery et al., 2022), and the GPT series (Brown et al., 2020; Achiam et al., 2023).
Less is More: Recursive Reasoning with Tiny Networks
Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats large language models (LLMs) on hard puzzle tasks such as Sudoku, Maze, and ARC-AGI while being trained with small models (27M parameters) on small data (around 1000 examples). HRM holds great promise for solving hard problems with small networks, but it is not yet well understood and may be suboptimal. We propose Tiny Recursive Model (TRM), a much simpler recursive reasoning approach that achieves significantly higher generalization than HRM, while using a single tiny network with only 2 layers. With only 7M parameters, TRM obtains 45% test accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, higher than most LLMs (e.g., DeepSeek R1, o3-mini, Gemini 2.5 Pro) with less than 0.01% of the parameters.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
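To make the recursion pattern in the TRM abstract concrete, here is a minimal PyTorch sketch: one shared tiny network alternately refines a latent state z (inner loop) and the current answer y (outer step). All names, sizes, loop counts, and the residual-update form are illustrative assumptions, not TRM's actual code.

```python
# Minimal sketch of single-network recursive refinement; names are illustrative.
import torch
import torch.nn as nn

class TinyRecursiveNet(nn.Module):
    """One small shared network reused for every refinement step."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.core = nn.Sequential(nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.out = nn.Linear(2 * dim, dim)

    def refine(self, x, y, z, n_latent_steps: int = 6):
        # Inner loop: repeatedly update the latent z from (input, answer, latent).
        for _ in range(n_latent_steps):
            z = z + self.core(torch.cat([x, y, z], dim=-1))
        # Outer step: update the current answer y from (answer, latent).
        y = y + self.out(torch.cat([y, z], dim=-1))
        return y, z

net = TinyRecursiveNet()
x = torch.randn(8, 64)                 # embedded puzzle input (batch of 8)
y, z = torch.zeros(8, 64), torch.zeros(8, 64)
for _ in range(3):                     # a few supervised improvement steps
    y, z = net.refine(x, y, z)
```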
Exploring the Hierarchical Reasoning Model for Small Natural-Image Classification Without Augmentation
This paper asks whether the Hierarchical Reasoning Model (HRM), with its two Transformer-style modules $(f_L,f_H)$, one-step (DEQ-style) training, deep supervision, Rotary Position Embeddings, and RMSNorm, can serve as a practical image classifier. It is evaluated on MNIST, CIFAR-10, and CIFAR-100 under a deliberately raw regime: no data augmentation, an identical optimizer family with one-epoch warmup then cosine-floor decay, and label smoothing. HRM optimizes stably and performs well on MNIST ($\approx 98\%$ test accuracy), but on small natural images it overfits and generalizes poorly: on CIFAR-10, HRM reaches 65.0\% after 25 epochs, whereas a two-stage Conv--BN--ReLU baseline attains 77.2\% while training $\sim 30\times$ faster per epoch; on CIFAR-100, HRM achieves only 29.7\% test accuracy despite 91.5\% train accuracy, while the same CNN reaches 45.3\% test with 50.5\% train accuracy. Loss traces and error analyses indicate healthy optimization but insufficient image-specific inductive bias for HRM in this regime. It is concluded that, for small-resolution image classification without augmentation, HRM as it currently exists is not competitive with even simple convolutional architectures, though this does not rule out that modifications to the model could improve it substantially.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States (0.04)
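For reference, a sketch of the learning-rate schedule named in the abstract above ("one-epoch warmup then cosine-floor decay"): linear warmup over the first epoch, then cosine decay toward a non-zero floor. The base rate, floor fraction, and per-step granularity are assumptions, not values from the paper.

```python
# Hedged sketch of a one-epoch-warmup, cosine-to-floor LR schedule.
import math

def lr_at(step, steps_per_epoch, total_steps, base_lr=1e-3, floor=0.1):
    """Linear warmup for one epoch, then cosine decay down to floor * base_lr."""
    if step < steps_per_epoch:                              # warmup phase
        return base_lr * (step + 1) / steps_per_epoch
    progress = (step - steps_per_epoch) / max(1, total_steps - steps_per_epoch)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))     # 1 -> 0 over training
    return base_lr * (floor + (1.0 - floor) * cosine)       # never drops below floor
```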
Hierarchical Reasoning Model
Wang, Guan, Li, Jin, Sun, Yuhao, Chen, Xing, Liu, Changling, Wu, Yue, Lu, Meng, Song, Sen, Yadkori, Yasin Abbasi
Reasoning, the process of devising and executing complex goal-oriented action sequences, remains a critical challenge in AI. Current large language models (LLMs) primarily employ Chain-of-Thought (CoT) techniques, which suffer from brittle task decomposition, extensive data requirements, and high latency. Inspired by the hierarchical and multi-timescale processing in the human brain, we propose the Hierarchical Reasoning Model (HRM), a novel recurrent architecture that attains significant computational depth while maintaining both training stability and efficiency. HRM executes sequential reasoning tasks in a single forward pass without explicit supervision of the intermediate process, through two interdependent recurrent modules: a high-level module responsible for slow, abstract planning, and a low-level module handling rapid, detailed computations. With only 27 million parameters, HRM achieves exceptional performance on complex reasoning tasks using only 1000 training samples. The model operates without pre-training or CoT data, yet achieves nearly perfect performance on challenging tasks including complex Sudoku puzzles and optimal path finding in large mazes. Furthermore, HRM outperforms much larger models with significantly longer context windows on the Abstraction and Reasoning Corpus (ARC), a key benchmark for measuring artificial general intelligence capabilities. These results underscore HRM's potential as a transformative advancement toward universal computation and general-purpose reasoning systems.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.88)
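A minimal sketch of the two-timescale recurrence this abstract describes: a fast low-level module f_L that runs every step conditioned on the high-level state, and a slow high-level module f_H that updates once per cycle from the low-level result. The GRU cells, dimensions, and loop counts are placeholders for illustration, not the paper's architecture.

```python
# Sketch of HRM-style two-timescale recurrence; module internals are placeholders.
import torch
import torch.nn as nn

class HRMSketch(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.f_L = nn.GRUCell(2 * dim, dim)   # fast, detailed computation
        self.f_H = nn.GRUCell(dim, dim)       # slow, abstract planning

    def forward(self, x, n_cycles: int = 4, steps_per_cycle: int = 8):
        zL = torch.zeros(x.size(0), self.f_H.hidden_size)
        zH = torch.zeros_like(zL)
        for _ in range(n_cycles):
            for _ in range(steps_per_cycle):
                # Low level runs at high frequency, conditioned on the plan zH.
                zL = self.f_L(torch.cat([x, zH], dim=-1), zL)
            # High level updates once per cycle from the low-level result.
            zH = self.f_H(zL, zH)
        return zH

out = HRMSketch()(torch.randn(8, 128))   # one deep forward pass, no CoT tokens
```

Note how depth comes from the nested loops (n_cycles * steps_per_cycle low-level updates) rather than from stacking layers, which is what lets a 27M-parameter model attain large effective computational depth.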
Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
Wang, Teng, Jiang, Zhangyi, He, Zhenqi, Yang, Wenhan, Zheng, Yanan, Li, Zeyu, He, Zifan, Tong, Shenyang, Gong, Hailei
Recent studies show that Large Language Models (LLMs) achieve strong reasoning capabilities through supervised fine-tuning or reinforcement learning. However, a key approach, the Process Reward Model (PRM), suffers from reward hacking, making it unreliable in identifying the best intermediate step. In this paper, we propose a novel reward model approach, the Hierarchical Reward Model (HRM), which evaluates both individual and consecutive reasoning steps at fine-grained and coarse-grained levels. HRM excels in evaluating multi-step reasoning coherence and self-reflection, especially in scenarios where a previous reasoning step is incorrect but a subsequent step successfully identifies and corrects the error. Furthermore, to address the inefficiency of autonomously annotating PRM training data via Monte Carlo Tree Search (MCTS), we introduce a lightweight and effective data augmentation strategy called Hierarchical Node Compression (HNC), based on node merging (combining two consecutive reasoning steps into one step) in the tree structure. By applying HNC to MCTS-generated reasoning trajectories, we enhance the diversity and robustness of HRM training data while introducing controlled noise with minimal computational overhead. Empirical results on the PRM800K dataset demonstrate that HRM, in conjunction with HNC, achieves superior stability and reliability in evaluation compared to PRM. Furthermore, cross-domain evaluations on the MATH500 and GSM8K datasets confirm HRM's superior generalization and robustness across diverse reasoning tasks.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
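To illustrate the node-merging operation behind HNC, a small sketch: a reasoning-tree node is fused with one of its children so that two consecutive steps become a single coarser-grained step. The tree layout and the min-value scoring rule are assumptions for illustration, not the paper's implementation.

```python
# Illustrative node merging on an MCTS-style reasoning tree; layout is assumed.
from dataclasses import dataclass, field

@dataclass
class StepNode:
    text: str                    # the reasoning step at this node
    value: float                 # e.g., an MC estimate of step correctness
    children: list = field(default_factory=list)

def merge_with_child(parent: StepNode, i: int) -> StepNode:
    """Fuse parent with its i-th child into one coarse-grained step."""
    child = parent.children[i]
    return StepNode(
        text=parent.text + " " + child.text,     # concatenate the two steps
        value=min(parent.value, child.value),    # assumed: a coarse step is only
        children=child.children,                 # as reliable as its worst part
    )

a = StepNode("Compute 2+3=5.", 0.9, [StepNode("So the answer is 5.", 0.8)])
merged = merge_with_child(a, 0)                  # one node covering both steps
```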
Hierarchies of Reward Machines
Furelos-Blanco, Daniel, Law, Mark, Jonsson, Anders, Broda, Krysia, Russo, Alessandra
Reward machines (RMs) are a recent formalism for representing the reward function of a reinforcement learning task through a finite-state machine whose edges encode subgoals of the task using high-level events. The structure of RMs enables the decomposition of a task into simpler and independently solvable subtasks that help tackle long-horizon and/or sparse reward tasks. We propose a formalism for further abstracting the subtask structure by endowing an RM with the ability to call other RMs, thus composing a hierarchy of RMs.

Hierarchical reinforcement learning (HRL; Barto & Mahadevan, 2003) frameworks, such as options (Sutton et al., 1999), have been used to exploit RMs by learning policies at two levels of abstraction: (i) select a formula (i.e., subgoal) from a given RM state, and (ii) select an action to (eventually) satisfy the chosen formula (Toro Icarte et al., 2018; Furelos-Blanco et al., 2021). The subtask decomposition powered by HRL enables learning at multiple scales simultaneously, and eases the handling of long-horizon and sparse reward tasks. In addition, several works have considered the problem of learning the RMs themselves from interaction (e.g., Toro Icarte et al., 2019; Xu et al., 2020; Furelos-Blanco et al., 2021; Hasanbeig et al., 2021).
- Europe > United Kingdom > England > Greater London > London (0.04)
- North America > United States > Massachusetts (0.04)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- (4 more...)
- Workflow (0.67)
- Research Report (0.63)
- Education (0.67)
- Leisure & Entertainment (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)
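A toy Python sketch of the reward-machine formalism described in the abstract above: a finite-state machine whose transitions fire on high-level events and emit rewards. The event names and the two-subgoal task are illustrative; the paper's hierarchical extension would let an edge call a child RM rather than only emitting a scalar reward.

```python
# Toy reward machine: states, high-level events, and per-edge rewards.
class RewardMachine:
    def __init__(self, transitions, initial, final):
        # transitions: {(state, event): (next_state, reward)}
        self.transitions = transitions
        self.state = initial
        self.final = final

    def step(self, event: str) -> float:
        """Advance on a high-level event; events with no edge leave state unchanged."""
        next_state, reward = self.transitions.get((self.state, event), (self.state, 0.0))
        self.state = next_state
        return reward

    def done(self) -> bool:
        return self.state == self.final

# "Reach A, then reach B" as a two-subgoal task with sparse terminal reward:
rm = RewardMachine({("u0", "at_A"): ("u1", 0.0),
                    ("u1", "at_B"): ("uF", 1.0)}, initial="u0", final="uF")
assert rm.step("at_A") == 0.0 and rm.step("at_B") == 1.0 and rm.done()
```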
AI's Potential to Improve Human Resource Management (HRM)
A.I. has transformed many other fields, and H.R. is just like any other field in this regard. Recent studies show that HRM's future relies on collaboration between humans and machines: A.I. can enhance human workers' efforts and help them save time. When it comes to data analysis in particular, A.I. shines. Although it might not seem like it, A.I. …