thought graph
ARIES: Autonomous Reasoning with LLMs on Interactive Thought Graph Environments
Gimenes, Pedro, Cao, Zeyu, Wong, Jeffrey, Zhao, Yiren
Recent research has shown that LLM performance on reasoning tasks can be enhanced by scaling test-time compute. One promising approach, particularly with decomposable problems, involves arranging intermediate solutions as a graph on which transformations are performed to explore the solution space. However, prior works rely on pre-determined, task-specific transformation schedules which are subject to a set of searched hyperparameters. In this work, we view thought graph transformations as actions in a Markov decision process, and implement policy agents to drive effective action policies for the underlying reasoning LLM agent. In particular, we investigate the ability of another LLM to act as a policy agent on thought graph environments and introduce ARIES, a multi-agent architecture for reasoning with LLMs. In ARIES, reasoning LLM agents solve decomposed subproblems, while policy LLM agents maintain visibility of the thought graph state and dynamically adapt the problem-solving strategy. Through extensive experiments, we observe that using off-the-shelf LLMs as policy agents with no supervised fine-tuning (SFT) can yield up to 29% higher accuracy on HumanEval relative to static transformation schedules, while reducing inference costs by 35% and avoiding any search requirements. We also conduct a thorough analysis of observed failure modes, highlighting that limitations on LLM size and the depth of problem decomposition pose challenges to scaling LLM-guided reasoning.

Prior works have shown that Large Language Models (LLMs) are subject to the emergence of abilities as their parameter count grows (Wei et al., 2022), which spurred significant interest in training increasingly larger models. However, recent work showed that under a fixed compute budget for training and inference, LLM performance on reasoning tasks can be enhanced by allocating a higher proportion of compute to inference rather than training (Snell et al., 2024).
This shift towards inference-time compute scaling can be intuitively understood through the Dual Process Theory, which postulates the existence of two distinct modes of reasoning in humans: (1) a fast, intuitive mode and (2) a slow, deliberate mode (Evans & Frankish, 2009).
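The MDP framing above can be illustrated with a minimal sketch: a policy agent observes the current thought graph state and selects the next transformation, instead of following a fixed, pre-searched schedule. All names here (`ThoughtGraph`, `policy_agent`, the action set) are illustrative stand-ins, not the ARIES implementation, and the policy is a trivial heuristic where ARIES would query a policy LLM.

```python
from dataclasses import dataclass, field

@dataclass
class ThoughtGraph:
    nodes: list = field(default_factory=list)  # intermediate solutions

def policy_agent(state: ThoughtGraph) -> str:
    """Stand-in for the policy LLM: inspects the graph state and
    picks the next transformation (here, a trivial heuristic)."""
    if not state.nodes:
        return "decompose"
    if len(state.nodes) < 4:
        return "solve"
    return "aggregate"

def apply_action(state: ThoughtGraph, action: str) -> ThoughtGraph:
    """Stand-in for the reasoning LLM executing the chosen action."""
    if action == "decompose":
        state.nodes.extend(["subproblem-1", "subproblem-2"])
    elif action == "solve":
        state.nodes.append(f"solution-{len(state.nodes)}")
    elif action == "aggregate":
        state.nodes = ["final-answer"]
    return state

# Dynamic schedule: the policy re-observes the graph after every step.
graph = ThoughtGraph()
for _ in range(10):
    action = policy_agent(graph)
    graph = apply_action(graph, action)
    if graph.nodes == ["final-answer"]:
        break
```

The key contrast with a static schedule is the feedback loop: the action sequence is not fixed in advance but emerges from repeated observations of the graph state.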
GraphIC: A Graph-Based In-Context Example Retrieval Model for Multi-Step Reasoning
Fu, Jiale, Wang, Yaqing, Han, Simeng, Fan, Jiaming, Si, Chen, Yang, Xu
In-context learning (ICL) enables large language models (LLMs) to generalize to new tasks by incorporating a few in-context examples (ICEs) directly in the input, without updating parameters. However, the effectiveness of ICL heavily relies on the selection of ICEs, and conventional text-based embedding methods are often inadequate for tasks that require multi-step reasoning, such as mathematical and logical problem solving. This is due to the bias introduced by shallow semantic similarities that fail to capture the deeper reasoning structures required for these tasks. We present GraphIC, a novel approach that leverages graph-based representations of reasoning processes, coupled with Bayesian Networks (BNs), to select ICEs. Importantly, BNs capture the dependency of a node's attributes on its parent nodes, closely mirroring the hierarchical nature of human cognition, where each thought is shaped by preceding ones. This makes BNs particularly well-suited for multi-step reasoning tasks, aligning the process more closely with human-like reasoning. Extensive experiments across three types of reasoning tasks (mathematical reasoning, code generation, and logical reasoning) demonstrate that GraphIC outperforms both training-free and training-based models in selecting ICEs, excelling in terms of both effectiveness and efficiency. We show that GraphIC enhances ICL's performance and interpretability, significantly advancing ICE selection for multi-step reasoning tasks.

In-context learning (ICL) (Brown et al., 2020) is a paradigm in which large language models (LLMs) perform inference using a small number of in-context examples (ICEs) within the input prompt. This technique enables LLMs to generalize to new tasks or enhance their performance on existing tasks without updating parameters.
However, previous studies have highlighted the sensitivity of ICL performance to the specific ICEs selected (Zhao et al., 2021; Liu et al., 2022), underscoring the importance of strategic ICE selection. Consequently, numerous methods have been proposed to optimize the selection of ICEs, focusing on improving task performance and ensuring greater robustness (Liu et al., 2022; Rubin et al., 2022; Ye et al., 2023; Gupta et al., 2024).
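The gap between shallow text similarity and structure-aware selection can be made concrete with a toy sketch. Here "structure" is reduced to a bag of reasoning-step labels for illustration; GraphIC's actual model builds Bayesian Networks over reasoning graphs, which this sketch does not implement. All example strings and operation labels below are made up.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def text_score(query: str, example: str) -> float:
    # Shallow surface similarity: bag-of-words overlap.
    return cosine(Counter(query.split()), Counter(example.split()))

def structure_score(query_ops: list, example_ops: list) -> float:
    # Structure-aware proxy: compare reasoning steps, not words.
    return cosine(Counter(query_ops), Counter(example_ops))

query = "John has 3 apples and buys 4 more then gives away 2"
query_ops = ["add", "subtract"]

examples = [
    ("Mary has 3 apples and buys 3 more apples", ["add"]),
    ("A tank gains 5 liters then leaks 2 liters", ["add", "subtract"]),
]

best_text = max(examples, key=lambda e: text_score(query, e[0]))
best_struct = max(examples, key=lambda e: structure_score(query_ops, e[1]))
```

Text similarity favors the example sharing surface vocabulary ("apples", "buys"), while the structural score favors the example whose reasoning steps (add then subtract) match the query, despite disjoint wording.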
Thought Graph: Generating Thought Process for Biological Reasoning
Hsu, Chi-Yang, Cox, Kyle, Xu, Jiawei, Tan, Zhen, Zhai, Tianhua, Hu, Mengzhou, Pratt, Dexter, Chen, Tianlong, Hu, Ziniu, Ding, Ying
We present the Thought Graph as a novel framework to support complex reasoning, and use gene set analysis as an example to uncover semantic relationships between biological processes. Our framework stands out for its ability to provide a deeper understanding of gene sets, significantly surpassing GSEA by 40.28% and LLM baselines by 5.38% based on cosine similarity to human annotations. Our analysis further provides insights into future directions in biological process naming, and implications for bioinformatics and precision medicine.