Search
Learning to Retrieve and Reason on Knowledge Graph through Active Self-Reflection
Zhang, Han, Zhou, Langshi, Yang, Hanfang
Extensive research has investigated the integration of large language models (LLMs) with knowledge graphs to enhance the reasoning process. However, understanding how models perform reasoning utilizing structured graph knowledge remains underexplored. Most existing approaches rely on LLMs or retrievers to make binary judgments regarding the utilization of knowledge, which is too coarse. Meanwhile, there is still a lack of feedback mechanisms for reflection and correction throughout the entire reasoning path. This paper proposes an Active self-Reflection framework for knowledge Graph reasoning ARG, introducing for the first time an end-to-end training approach to achieve iterative reasoning grounded on structured graphs. Within the framework, the model leverages special tokens to \textit{actively} determine whether knowledge retrieval is necessary, performs \textit{reflective} critique based on the retrieved knowledge, and iteratively reasons over the knowledge graph. The reasoning paths generated by the model exhibit high interpretability, enabling deeper exploration of the model's understanding of structured knowledge. Ultimately, the proposed model achieves outstanding results compared to existing baselines in knowledge graph reasoning tasks.
I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search
Liang, Zujie, Wei, Feng, Xu, Wujiang, Chen, Lin, Qian, Yuxi, Wu, Xinhui
Recent advancements in large language models (LLMs) have shown remarkable potential in automating machine learning tasks. However, existing LLM-based agents often struggle with low-diversity and suboptimal code generation. While recent work has introduced Monte Carlo Tree Search (MCTS) to address these issues, limitations persist in the quality and diversity of thoughts generated, as well as in the scalar value feedback mechanisms used for node selection. In this study, we introduce Introspective Monte Carlo Tree Search (I-MCTS), a novel approach that iteratively expands tree nodes through an introspective process that meticulously analyzes solutions and results from parent and sibling nodes. This facilitates a continuous refinement of the node in the search tree, thereby enhancing the overall decision-making process. Furthermore, we integrate a Large Language Model (LLM)-based value model to facilitate direct evaluation of each node's solution prior to conducting comprehensive computational rollouts. A hybrid rewarding mechanism is implemented to seamlessly transition the Q-value from LLM-estimated scores to actual performance scores. This allows higher-quality nodes to be traversed earlier. Applied to the various ML tasks, our approach demonstrates a 6% absolute improvement in performance compared to the strong open-source AutoML agents, showcasing its effectiveness in enhancing agentic AutoML systems. Resource available at https://github.com/jokieleung/I-MCTS
Preordering: A hybrid of correlation clustering and partial ordering
Irmai, Jannik, Moeller, Maximilian, Andres, Bjoern
Motivation stems from the problem of estimating a binary relation on the accoun ts of a social network where the notions that one account follows, likes or reacts to another account need neither be symmetric nor asymmetric. In particular, i following j does not need to imply that j follows i, nor does it necessarily imply that j does not follow i . Clustering alone does not capture asymmetric subsets of the relation on the social network because the equivalence relation that characterizes the clustering is symmetric. Similarly, partial ordering alone does not capture symm etric subsets of the relation on the social network because partial orders are antisymmetric. Like in clustering and partial ordering, we work with the assumption t hat the relation on the social network is transitive, i.e., if i follows j and j follows k then i follows k . This assumption simplifies reality and we quantify the deviation empirically. Unlike in clusterin g and partial ordering, we do not assume symmetry or antisymmetry. To model our assump tion we introduce algorithms that output a preorder on a set, i.e., a binary relation on the set that is reflexive and transitiv e. More specifically, we introduce algorithms for the preordering problem: Definition 1. (Wakabayashi, 1998) Given a finite set V and a value c
A consensus set for the aggregation of partial rankings: the case of the Optimal Set of Bucket Orders Problem
Aledo, Juan A., Gámez, José A., Rosete, Alejandro
In rank aggregation problems (RAP), the solution is usually a consensus ranking that generalizes a set of input orderings. There are different variants that differ not only in terms of the type of rankings that are used as input and output, but also in terms of the objective function employed to evaluate the quality of the desired output ranking. In contrast, in some machine learning tasks (e.g. subgroup discovery) or multimodal optimization tasks, attention is devoted to obtaining several models/results to account for the diversity in the input data or across the search landscape. Thus, in this paper we propose to provide, as the solution to an RAP, a set of rankings to better explain the preferences expressed in the input orderings. We exemplify our proposal through the Optimal Bucket Order Problem (OBOP), an RAP which consists in finding a single consensus ranking (with ties) that generalizes a set of input rankings codified as a precedence matrix. To address this, we introduce the Optimal Set of Bucket Orders Problem (OSBOP), a generalization of the OBOP that aims to produce not a single ranking as output but a set of consensus rankings. Experimental results are presented to illustrate this proposal, showing how, by providing a set of consensus rankings, the fitness of the solution significantly improves with respect to the one of the original OBOP, without losing comprehensibility.
Multi-Target Radar Search and Track Using Sequence-Capable Deep Reinforcement Learning
Ewers, Jan-Hendrik, Cormack, David, Gibbs, Joe, Anderson, David
The research addresses sensor task management for radar systems, focusing on efficiently searching and tracking multiple targets using reinforcement learning. The approach develops a 3D simulation environment with an active electronically scanned array radar, using a multi-target tracking algorithm to improve observation data quality. Three neural network architectures were compared including an approach using fated recurrent units with multi-headed self-attention. Two pre-training techniques were applied: behavior cloning to approximate a random search strategy and an auto-encoder to pre-train the feature extractor. Experimental results revealed that search performance was relatively consistent across most methods. The real challenge emerged in simultaneously searching and tracking targets. The multi-headed self-attention architecture demonstrated the most promising results, highlighting the potential of sequence-capable architectures in handling dynamic tracking scenarios. The key contribution lies in demonstrating how reinforcement learning can optimize sensor management, potentially improving radar systems' ability to identify and track multiple targets in complex environments.
ETS: Efficient Tree Search for Inference-Time Scaling
Hooper, Coleman, Kim, Sehoon, Moon, Suhong, Dilmen, Kerem, Maheswaran, Monishwaran, Lee, Nicholas, Mahoney, Michael W., Shao, Sophia, Keutzer, Kurt, Gholami, Amir
Test-time compute scaling has emerged as a new axis along which to improve model accuracy, where additional computation is used at inference time to allow the model to think longer for more challenging problems. One promising approach for test-time compute scaling is search against a process reward model, where a model generates multiple potential candidates at each step of the search, and these partial trajectories are then scored by a separate reward model in order to guide the search process. The diversity of trajectories in the tree search process affects the accuracy of the search, since increasing diversity promotes more exploration. However, this diversity comes at a cost, as divergent trajectories have less KV sharing, which means they consume more memory and slow down the search process. Previous search methods either do not perform sufficient exploration, or else explore diverse trajectories but have high latency. We address this challenge by proposing Efficient Tree Search (ETS), which promotes KV sharing by pruning redundant trajectories while maintaining necessary diverse trajectories. ETS incorporates a linear programming cost model to promote KV cache sharing by penalizing the number of nodes retained, while incorporating a semantic coverage term into the cost model to ensure that we retain trajectories which are semantically different. We demonstrate how ETS can achieve 1.8$\times$ reduction in average KV cache size during the search process, leading to 1.4$\times$ increased throughput relative to prior state-of-the-art methods, with minimal accuracy degradation and without requiring any custom kernel implementation. Code is available at: https://github.com/SqueezeAILab/ETS.
Astra: Efficient and Money-saving Automatic Parallel Strategies Search on Heterogeneous GPUs
Wang, Peiran, Li, Haibing, Haohan, Fu, Li, Shiyong, Wang, Yanpeng, Shen, Dou
In this paper, we introduce an efficient and money-saving automatic parallel strategies search framework on heterogeneous GPUs: Astra. First, Astra searches for the efficiency-optimal parallel strategy in both GPU configurations search space (GPU types and GPU numbers) and parallel parameters search space. Then, Astra also provides the solution on heterogeneous GPUs by mathematically modeling the time consumption of heterogeneous training. At last, Astra is the first to propose the automatic parallel strategy search on money-saving. The experiment results demonstrate that Astra can achieve better throughput than expert-designed strategies. The search time cost for Astra can also be limited to 1.27 seconds in a single-GPU setting and less than 1.35 minutes in a heterogeneous-GPU setting on average with an accuracy of over 95%.
MCTS-KBQA: Monte Carlo Tree Search for Knowledge Base Question Answering
Xiong, Guanming, Li, Haochen, Zhao, Wen
This study explores how to enhance the reasoning capabilities of large language models (LLMs) in knowledge base question answering (KBQA) by leveraging Monte Carlo Tree Search (MCTS). Semantic parsing-based KBQA methods are particularly challenging as these approaches require locating elements from knowledge bases and generating logical forms, demanding not only extensive annotated data but also strong reasoning capabilities. Although recent approaches leveraging LLMs as agents have demonstrated considerable potential, these studies are inherently constrained by their linear decision-making processes. To address this limitation, we propose a MCTS-based framework that enhances LLMs' reasoning capabilities through tree search methodology. We design a carefully designed step-wise reward mechanism that requires only direct prompting of open-source instruction LLMs without additional fine-tuning. Experimental results demonstrate that our approach significantly outperforms linear decision-making methods, particularly in low-resource scenarios. Additionally, we contribute new data resources to the KBQA community by annotating intermediate reasoning processes for existing question-SPARQL datasets using distant supervision. Experimental results on the extended dataset demonstrate that our method achieves comparable performance to fully supervised models while using significantly less training data.
A Machine Learning Approach That Beats Large Rubik's Cubes
Chervov, Alexander, Khoruzhii, Kirill, Bukhal, Nikita, Naghiyev, Jalal, Zamkovoy, Vladislav, Koltsov, Ivan, Cheldieva, Lyudmila, Sychev, Arsenii, Lenin, Arsenii, Obozov, Mark, Urvanov, Egor, Romanov, Alexey
The paper proposes a novel machine learning-based approach to the pathfinding problem on extremely large graphs. This method leverages diffusion distance estimation via a neural network and uses beam search for pathfinding. We demonstrate its efficiency by finding solutions for 4x4x4 and 5x5x5 Rubik's cubes with unprecedentedly short solution lengths, outperforming all available solvers and introducing the first machine learning solver beyond the 3x3x3 case. In particular, it surpasses every single case of the combined best results in the Kaggle Santa 2023 challenge, which involved over 1,000 teams. For the 3x3x3 Rubik's cube, our approach achieves an optimality rate exceeding 98%, matching the performance of task-specific solvers and significantly outperforming prior solutions such as DeepCubeA (60.3%) and EfficientCube (69.6%). Additionally, our solution is more than 26 times faster in solving 3x3x3 Rubik's cubes while requiring up to 18.5 times less model training time than the most efficient state-of-the-art competitor.
Pushing the Limits of the Reactive Affine Shaker Algorithm to Higher Dimensions
Battiti, Roberto, Brunato, Mauro
Bayesian Optimization (BO) for the minimization of expensive functions of continuous variables uses all the knowledge acquired from previous samples (${\boldsymbol x}_i$ and $f({\boldsymbol x}_i)$ values) to build a surrogate model based on Gaussian processes. The surrogate is then exploited to define the next point to sample, through a careful balance of exploration and exploitation. Initially intended for low-dimensional spaces, BO has recently been modified and used also for very large-dimensional spaces (up to about one thousand dimensions). In this paper we consider a much simpler algorithm, called "Reactive Affine Shaker" (RAS). The next sample is always generated with a uniform probability distribution inside a parallelepiped (the "box"). At each iteration, the form of the box is adapted during the search through an affine transformation, based only on the point $\boldsymbol x$ position and on the success or failure in improving the function. The function values are therefore not used directly to modify the search area and to generate the next sample. The entire dimensionality is kept (no active subspaces). Despite its extreme simplicity and its use of only stochastic local search, surprisingly the produced results are comparable to and not too far from the state-of-the-art results of high-dimensional versions of BO, although with some more function evaluations. An ablation study and an analysis of probability distribution of directions (improving steps and prevailing box orientation) in very large-dimensional spaces are conducted to understand more about the behavior of RAS and to assess the relative importance of the algorithmic building blocks for the final results.