Markov Models
Zero-shot Structure Learning and Planning for Autonomous Robot Navigation using Active Inference
de tinguy, Daria, Verbelen, Tim, Gamba, Emilio, Dhoedt, Bart
Autonomous navigation in unfamiliar environments requires robots to simultaneously explore, localise, and plan under uncertainty, without relying on predefined maps or extensive training. We present a biologically inspired, Active Inference-based framework, Active Inference MAPping and Planning (AIMAPP). This model unifies mapping, localisation, and decision-making within a single generative model. Inspired by hippocampal navigation, it uses topological reasoning, place-cell encoding, and episodic memory to guide behaviour. The agent builds and updates a sparse topological map online, learns state transitions dynamically, and plans actions by minimising Expected Free Energy. This allows it to balance goal-directed and exploratory behaviours. We implemented a ROS-compatible navigation system that is sensor and robot-agnostic, capable of integrating with diverse hardware configurations. It operates in a fully self-supervised manner, is resilient to drift, and supports both exploration and goal-directed navigation without any pre-training. We demonstrate robust performance in large-scale real and simulated environments against state-of-the-art planning models, highlighting the system's adaptability to ambiguous observations, environmental changes, and sensor noise. The model offers a biologically inspired, modular solution to scalable, self-supervised navigation in unstructured settings. AIMAPP is available at https://github.com/decide-ugent/AIMAPP.
StatEval: A Comprehensive Benchmark for Large Language Models in Statistics
Lu, Yuchen, Yang, Run, Zhang, Yichen, Yu, Shuguang, Dai, Runpeng, Wang, Ziwei, Xiang, Jiayi, E, Wenxin, Gao, Siran, Ruan, Xinyao, Huang, Yirui, Xi, Chenjing, Hu, Haibo, Fu, Yueming, Yu, Qinglan, Wei, Xiaobing, Gu, Jiani, Sun, Rui, Jia, Jiaxuan, Zhou, Fan
Large language models (LLMs) have advanced rapidly in recent years (Brown et al., 2020; Touvron et al., 2023), demonstrating remarkable progress in complex reasoning (Guo et al., 2025), fluent text generation, and even automated proof discovery (Yu et al., 2025). These advances have spurred growing adoption of LLMs across education, data science, and research, where they are increasingly used for tutoring, problem explanation, data analysis, and hypothesis formulation (Wu et al., 2021; Polu and Sutskever, 2020; Khan et al., 2023; Gao et al., 2023). However, despite their broad deployment in quantitative domains, the field of statistics, which forms the foundation of modern data-driven science, has received little attention in LLM evaluation. Statistics differs fundamentally from other quantitative disciplines. Rather than focusing on symbolic manipulation or fixed-form computation, it emphasizes reasoning under uncertainty, connecting probability theory, inference, regression, Bayesian analysis, multivariate methods, and asymptotic theory into a unified framework. Yet existing large-scale LLM evaluations rarely cover these competencies: statistical problems account for less than 3% of recent reasoning benchmarks (Paster et al., 2025), and when included, they are typically treated as isolated probability puzzles without structured categorization or coverage of inferential reasoning (Gao et al., 2024). This gap makes it impossible to rigorously assess whether LLMs can function as capable statisticians or support data-driven scientific discovery. To bridge this critical gap, we introduce StatEval, the first large-scale benchmark dedicated to evaluating large language models on statistical reasoning. With nearly 20,000 meticulously curated problems, StatEval covers the entire spectrum of statistics, from basic undergraduate exercises to advanced research-level challenges, captures the full 2 breadth and depth of the discipline, as illustrated in Figure 1.
HANDO: Hierarchical Autonomous Navigation and Dexterous Omni-loco-manipulation
Sun, Jingyuan, Wang, Chaoran, Zhang, Mingyu, Miao, Cui, Ji, Hongyu, Qu, Zihan, Sun, Han, Wang, Bing, Si, Qingyi
Seamless loco-manipulation in unstructured environments requires robots to leverage autonomous exploration alongside whole-body control for physical interaction. In this work, we introduce HANDO (Hierarchical Autonomous Navigation and Dexterous Omni-loco-manipulation), a two-layer framework designed for legged robots equipped with manipulators to perform human-centered mobile manipulation tasks. The first layer utilizes a goal-conditioned autonomous exploration policy to guide the robot to semantically specified targets, such as a black office chair in a dynamic environment. The second layer employs a unified whole-body loco-manipulation policy to coordinate the arm and legs for precise interaction tasks-for example, handing a drink to a person seated on the chair. We have conducted an initial deployment of the navigation module, and will continue to pursue finer-grained deployment of whole-body loco-manipulation.
MCMC: Bridging Rendering, Optimization and Generative AI
Generative artificial intelligence (AI) has made unprecedented advances in vision language models over the past two years. During the generative process, new samples (images) are generated from an unknown high-dimensional distribution. Markov Chain Monte Carlo (MCMC) methods are particularly effective in drawing samples from such complex, high-dimensional distributions. This makes MCMC methods an integral component for models like EBMs, ensuring accurate sample generation. Gradient-based optimization is at the core of modern generative models. The update step during the optimization forms a Markov chain where the new update depends only on the current state. This allows exploration of the parameter space in a memoryless manner, thus combining the benefits of gradient-based optimization and MCMC sampling. MCMC methods have shown an equally important role in physically based rendering where complex light paths are otherwise quite challenging to sample from simple importance sampling techniques. A lot of research is dedicated towards bringing physical realism to samples (images) generated from diffusion-based generative models in a data-driven manner, however, a unified framework connecting these techniques is still missing. In this course, we take the first steps toward understanding each of these components and exploring how MCMC could potentially serve as a bridge, linking these closely related areas of research. Our course aims to provide necessary theoretical and practical tools to guide students, researchers and practitioners towards the common goal of generative physically based rendering. All Jupyter notebooks with demonstrations associated to this tutorial can be found on the project webpage: https://sinbag.github.io/mcmc/
Constraints-of-Thought: A Framework for Constrained Reasoning in Language-Model-Guided Search
Alrashedy, Kamel, Srihari, Vriksha, Zaidi, Zulfiqar, Srivastava, Ridam, Tambwekar, Pradyumna, Gombolay, Matthew
While researchers have made significant progress in enabling large language models (LLMs) to perform multi-step planning, LLMs struggle to ensure that those plans align with high-level user intent and satisfy symbolic constraints, especially in complex, multi-step domains. Existing reasoning approaches such as Chain-of-Thought (CoT), Tree-of-Thought (ToT), and verifier-augmented methods, expand the search space but often yield infeasible actions or hallucinated steps. To overcome these limitations, we propose Constraints-of-Thought (Const-o-T), a framework that provides a structured prior that enables Monte Carlo Tree Search (MCTS) focus search on semantically meaningful paths. Each reasoning step is represented as an (intent, constraint) pair, which serves both to compress the search space and enforce validity. Unlike prior methods that merely generate reasoning traces or validate outputs post hoc, Const-o-T uses (intent, constraint)pairs to actively focus the search toward feasible and meaningful plans. We integrate Const-o-T into MCTS using a structured representation of intent-constraint pairs constraints prune infeasible branches and guide exploration toward semantically valid actions, improving planning efficiency and verifiable decision-making. We demonstrate across three domains Risk game, CAD code generation, and arithmetic reasoning that our approach outperforms baselines, yielding higher accuracy and stronger structural alignment. Our contribution is to demonstrate that Const-of-T offers a generalizable foundation for constraint-guided reasoning, enabling more efficient, constraint-aligned, and domain-adaptable planning with LLMs.