AITopics | Search

Collaborating Authors

Search

"Search is a problem-solving technique that systematically explores a space of problem states, i.e., successive and alternative stages in the problem-solving process. Examples of problem states might include the different board configurations in a game or intermediate steps in a reasoning process. This space of alternative solutions is then searched to find an answer. Newell and Simon (1976) have argued that this is the essential basis of human problem solving. Indeed, when a chess player examines the effects of different moves or a doctor considers a number of alternative diagnoses, they are searching among alternatives."
– from Section 1.2 of Chapter One of George F. Luger's textbook, Artificial Intelligence: Structures and Strategies for Complex Problem Solving, 5th Edition (Addison-Wesley; 2005).

News Overviews Instructional Materials AI-Alerts Classics

MOCA-HESP: Meta High-dimensional Bayesian Optimization for Combinatorial and Mixed Spaces via Hyper-ellipsoid Partitioning

Ngo, Lam, Ha, Huong, Chan, Jeffrey, Zhang, Hongyu

arXiv.org Machine LearningAug-26-2025

High-dimensional Bayesian Optimization (BO) has attracted significant attention in recent research. However, existing methods have mainly focused on optimizing in continuous domains, while combinatorial (ordinal and categorical) and mixed domains still remain challenging. In this paper, we first propose MOCA-HESP, a novel high-dimensional BO method for combinatorial and mixed variables. The key idea is to leverage the hyper-ellipsoid space partitioning (HESP) technique with different categorical encoders to work with high-dimensional, combinatorial and mixed spaces, while adaptively selecting the optimal encoders for HESP using a multi-armed bandit technique. Our method, MOCA-HESP, is designed as a \textit{meta-algorithm} such that it can incorporate other combinatorial and mixed BO optimizers to further enhance the optimizers' performance. Finally, we develop three practical BO methods by integrating MOCA-HESP with state-of-the-art BO optimizers for combinatorial and mixed variables: standard BO, CASMOPOLITAN, and Bounce. Our experimental results on various synthetic and real-world benchmarks show that our methods outperform existing baselines. Our code implementation can be found at https://github.com/LamNgo1/moca-hesp

data mining, machine learning, optimization, (17 more...)

arXiv.org Machine Learning

2508.06847

Country:

Oceania > Australia (0.04)
Asia > China > Chongqing Province > Chongqing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science > Data Mining (0.66)

Add feedback

USPR: Learning a Unified Solver for Profiled Routing

Hua, Chuanbo, Berto, Federico, Zhao, Zhikai, Son, Jiwoo, Kwon, Changhyun, Park, Jinkyoo

arXiv.org Artificial IntelligenceAug-26-2025

The Profiled V ehicle Routing Problem (PVRP) extends the classical VRP by incorporating vehicle-client-specific preferences and constraints, reflecting real-world requirements such as zone restrictions and service-level preferences. While recent reinforcement-learning solvers have shown promising performance, they require retraining for each new profile distribution, suffer from poor representation ability, and struggle to generalize to out-of-distribution instances. In this paper, we address these limitations by introducing U nified Solver for Profiled R outing (USPR), a novel framework that natively handles arbitrary profile types. USPR introduces on three key innovations: (i) Profile Embeddings (PE) to encode any combination of profile types; (ii) Multi-Head Profiled Attention (MHP A), an attention mechanism that models rich interactions between vehicles and clients; (iii) Profile-aware Score Reshaping (PSR), which dynamically adjusts decoder logits using profile scores to improve generalization. Empirical results on diverse PVRP benchmarks demonstrate that USPR achieves state-of-the-art results among learning-based methods while offering significant gains in flexibility and computational efficiency. We make our source code publicly available to foster future research.

machine learning, natural language, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2505.05119

Genre: Research Report (1.00)

Industry: Transportation (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
(4 more...)

Add feedback

ReviBranch: Deep Reinforcement Learning for Branch-and-Bound with Revived Trajectories

Jiabao, Dou, Jiayi, Nie, Cheng, Yihang, Liu, Jinwei, Ji, Yingrui, Xiao, Canran, Du, Feixiang, Xiao, Jiaping

arXiv.org Artificial IntelligenceAug-26-2025

The Branch-and-bound (B&B) algorithm is the main solver for Mixed Integer Linear Programs (MILPs), where the selection of branching variable is essential to computational efficiency. However, traditional heuristics for branching often fail to generalize across heterogeneous problem instances, while existing learning-based methods such as imitation learning (IL) suffers from dependence on expert demonstration quality, and reinforcement learning (RL) struggles with limitations in sparse rewards and dynamic state representation challenges. To address these issues, we propose ReviBranch, a novel deep RL framework that constructs revived trajectories by reviving explicit historical correspondences between branching decisions and their corresponding graph states along search-tree paths. During training, ReviBranch enables agents to learn from complete structural evolution and temporal dependencies within the branching process. Additionally, we introduce an importance-weighted reward redistribution mechanism that transforms sparse terminal rewards into dense stepwise feedback, addressing the sparse reward challenge. Extensive experiments on different MILP benchmarks demonstrate that ReviBranch outperforms state-of-the-art RL methods, reducing B&B nodes by 4.0% and LP iterations by 2.2% on large-scale instances. The results highlight the robustness and generalizability of ReviBranch across heterogeneous MILP problem classes.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2508.17452

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Solving Constrained Stochastic Shortest Path Problems with Scalarisation

Schmalz, Johannes, Trevizan, Felipe

arXiv.org Artificial IntelligenceAug-26-2025

Constrained Stochastic Shortest Path Problems (CSSPs) model problems with probabilistic effects, where a primary cost is min-imised subject to constraints over secondary costs, e.g., minimise time subject to monetary budget. Current heuristic search algorithms for CSSPs solve a sequence of increasingly larger CSSPs as linear programs until an optimal solution for the original CSSP is found. In this paper, we introduce a novel algorithm CARL, which solves a series of unconstrained Stochastic Shortest Path Problems (SSPs) with efficient heuristic search algorithms. These SSP subproblems are constructed with scalarisations that project the CSSP's vector of primary and secondary costs onto a scalar cost. CARL finds a maximising scalarisation using an optimisation algorithm similar to the subgradient method which, together with the solution to its associated SSP, yields a set of policies that are combined into an optimal policy for the CSSP . Our experiments show that CARL solves 50% more problems than the state-of-the-art on existing benchmarks.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2508.17446

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Li, Yizhi, Gu, Qingshui, Wen, Zhoufutu, Li, Ziniu, Xing, Tianshun, Guo, Shuyue, Zheng, Tianyu, Zhou, Xin, Qu, Xingwei, Zhou, Wangchunshu, Zhang, Zheng, Shen, Wei, Liu, Qian, Lin, Chenghua, Yang, Jian, Zhang, Ge, Huang, Wenhao

arXiv.org Artificial IntelligenceAug-26-2025

Recent advancements in aligning large language models via reinforcement learning have achieved remarkable gains in solving complex reasoning problems, but at the cost of expensive on-policy rollouts and limited exploration of diverse reasoning paths. In this work, we introduce TreePO, involving a self-guided rollout algorithm that views sequence generation as a tree-structured searching process. Composed of dynamic tree sampling policy and fixed-length segment decoding, TreePO leverages local uncertainty to warrant additional branches. By amortizing computation across common prefixes and pruning low-value paths early, TreePO essentially reduces the per-update compute burden while preserving or enhancing exploration diversity. Key contributions include: (1) a segment-wise sampling algorithm that alleviates the KV cache burden through contiguous segments and spawns new branches along with an early-stop mechanism; (2) a tree-based segment-level advantage estimation that considers both global and local proximal policy optimization. and (3) analysis on the effectiveness of probability and quality-driven dynamic divergence and fallback strategy. We empirically validate the performance gain of TreePO on a set reasoning benchmarks and the efficiency saving of GPU hours from 22\% up to 43\% of the sampling design for the trained models, meanwhile showing up to 40\% reduction at trajectory-level and 35\% at token-level sampling compute for the existing models. While offering a free lunch of inference efficiency, TreePO reveals a practical path toward scaling RL-based post-training with fewer samples and less compute. Home page locates at https://m-a-p.ai/TreePO.

large language model, machine learning, trajectory, (19 more...)

arXiv.org Artificial Intelligence

2508.17445

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)

Add feedback

Robust Point Cloud Registration via Geometric Overlapping Guided Rotation Search

Zheng, Zhao, Fan, Jingfan, Shao, Long, Song, Hong, Ai, Danni, Fu, Tianyu, Xiao, Deqiang, Wang, Yongtian, Yang, Jian

arXiv.org Artificial IntelligenceAug-26-2025

Point cloud registration based on correspondences computes the rigid transformation that maximizes the number of inliers constrained within the noise threshold. Current state-of-the-art (SOTA) methods employing spatial compatibility graphs or branch-and-bound (BnB) search mainly focus on registration under high outlier ratios. However, graph-based methods require at least quadratic space and time complexity for graph construction, while multi-stage BnB search methods often suffer from inaccuracy due to local optima between decomposed stages. This paper proposes a geometric maximum overlapping registration framework via rotation-only BnB search. The rigid transformation is decomposed using Chasles' theorem into a translation along rotation axis and a 2D rigid transformation. The optimal rotation axis and angle are searched via BnB, with residual parameters formulated as range maximum query (RMQ) problems. Firstly, the top-k candidate rotation axes are searched within a hemisphere parameterized by cube mapping, and the translation along each axis is estimated through interval stabbing of the correspondences projected onto that axis. Secondly, the 2D registration is relaxed to 1D rotation angle search with 2D RMQ of geometric overlapping for axis-aligned rectangles, which is solved deterministically in polynomial time using sweep line algorithm with segment tree. Experimental results on 3DMatch, 3DLoMatch, and KITTI datasets demonstrate superior accuracy and efficiency over SOTA methods, while the time complexity is polynomial and the space complexity increases linearly with the number of points, even in the worst case.

artificial intelligence, machine learning, registration, (15 more...)

arXiv.org Artificial Intelligence

2508.17427

Country:

North America > United States (1.00)
Asia > China (0.69)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)

Add feedback

Solving the Min-Max Multiple Traveling Salesmen Problem via Learning-Based Path Generation and Optimal Splitting

Wang, Wen, Wu, Xiangchen, Wang, Liang, Hu, Hao, Tao, Xianping, Zhang, Linghao

arXiv.org Artificial IntelligenceAug-26-2025

This study addresses the Min-Max Multiple Traveling Salesmen Problem ($m^3$-TSP), which aims to coordinate tours for multiple salesmen such that the length of the longest tour is minimized. Due to its NP-hard nature, exact solvers become impractical under the assumption that $P \ne NP$. As a result, learning-based approaches have gained traction for their ability to rapidly generate high-quality approximate solutions. Among these, two-stage methods combine learning-based components with classical solvers, simplifying the learning objective. However, this decoupling often disrupts consistent optimization, potentially degrading solution quality. To address this issue, we propose a novel two-stage framework named \textbf{Generate-and-Split} (GaS), which integrates reinforcement learning (RL) with an optimal splitting algorithm in a joint training process. The splitting algorithm offers near-linear scalability with respect to the number of cities and guarantees optimal splitting in Euclidean space for any given path. To facilitate the joint optimization of the RL component with the algorithm, we adopt an LSTM-enhanced model architecture to address partial observability. Extensive experiments show that the proposed GaS framework significantly outperforms existing learning-based approaches in both solution quality and transferability.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2508.17087

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
(2 more...)

Add feedback

Learning ON Large Datasets Using Bit-String Trees

Gupta, Prashant

arXiv.org Artificial IntelligenceAug-26-2025

This thesis develops computational methods in similarity-preserving hashing, classification, and cancer genomics. Standard space partitioning-based hashing relies on Binary Search Trees (BSTs), but their exponential growth and sparsity hinder efficiency. To overcome this, we introduce Compressed BST of Inverted hash tables (ComBI), which enables fast approximate nearest-neighbor search with reduced memory. On datasets of up to one billion samples, ComBI achieves 0.90 precision with 4X-296X speed-ups over Multi-Index Hashing, and also outperforms Cellfishing.jl on single-cell RNA-seq searches with 2X-13X gains. Building on hashing structures, we propose Guided Random Forest (GRAF), a tree-based ensemble classifier that integrates global and local partitioning, bridging decision trees and boosting while reducing generalization error. Across 115 datasets, GRAF delivers competitive or superior accuracy, and its unsupervised variant (uGRAF) supports guided hashing and importance sampling. We show that GRAF and ComBI can be used to estimate per-sample classifiability, which enables scalable prediction of cancer patient survival. To address challenges in interpreting mutations, we introduce Continuous Representation of Codon Switches (CRCS), a deep learning framework that embeds genetic changes into numerical vectors. CRCS allows identification of somatic mutations without matched normals, discovery of driver genes, and scoring of tumor mutations, with survival prediction validated in bladder, liver, and brain cancers. Together, these methods provide efficient, scalable, and interpretable tools for large-scale data analysis and biomedical applications.

information retrieval, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2508.17083

Country:

Europe (0.67)
Asia > India (0.45)
North America > United States (0.27)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (0.45)
Health & Medicine > Therapeutic Area > Oncology > Brain Cancer (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(5 more...)

Add feedback

PuzzleJAX: A Benchmark for Reasoning and Learning

Earle, Sam, Todd, Graham, Li, Yuchen, Khalifa, Ahmed, Nasir, Muhammad Umair, Jiang, Zehua, Banburski-Fahey, Andrzej, Togelius, Julian

arXiv.org Artificial IntelligenceAug-26-2025

We introduce PuzzleJAX, a GPU-accelerated puzzle game engine and description language designed to support rapid benchmarking of tree search, reinforcement learning, and LLM reasoning abilities. Unlike existing GPU-accelerated learning environments that provide hard-coded implementations of fixed sets of games, PuzzleJAX allows dynamic compilation of any game expressible in its domain-specific language (DSL). This DSL follows PuzzleScript, which is a popular and accessible online game engine for designing puzzle games. In this paper, we validate in PuzzleJAX several hundred of the thousands of games designed in PuzzleScript by both professional designers and casual creators since its release in 2013, thereby demonstrating PuzzleJAX's coverage of an expansive, expressive, and human-relevant space of tasks. By analyzing the performance of search, learning, and language models on these games, we show that PuzzleJAX can naturally express tasks that are both simple and intuitive to understand, yet often deeply challenging to master, requiring a combination of control, planning, and high-level insight.

large language model, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2508.16821

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology > Software (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

TOAST: Fast and scalable auto-partitioning based on principled static analysis

Alabed, Sami, Grewe, Dominik, Rink, Norman Alexander, Samsikova, Masha, Sitdikov, Timur, Swietlik, Agnieszka, Vytiniotis, Dimitrios, Belov, Daniel

arXiv.org Artificial IntelligenceAug-26-2025

Partitioning large machine learning models across distributed accelerator systems is a complex process, requiring a series of interdependent decisions that are further complicated by internal sharding ambiguities. Consequently, existing auto-partitioners often suffer from out-of-memory errors or are prohibitively slow when exploring the exponentially large space of possible partitionings. To mitigate this, they artificially restrict the search space, but this approach frequently yields infeasible solutions that violate device memory constraints or lead to sub-optimal performance. We propose a system that combines a novel static compiler analysis with a Monte Carlo Tree Search. Our analysis constructs an efficient decision space by identifying (i) tensor dimensions requiring identical sharding, and (ii) partitioning "conflicts" that require resolution. Our system significantly outperforms state-of-the-art industrial methods across diverse hardware platforms and model architectures, discovering previously unknown, superior solutions, and the process is fully automated even for complex and large models.

artificial intelligence, dimension, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2508.1501

Country:

Europe (1.00)
North America > United States > California (0.93)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)

Add feedback