Gong, Hailei
Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models
Wang, Teng, Jiang, Zhangyi, He, Zhenqi, Yang, Wenhan, Zheng, Yanan, Li, Zeyu, He, Zifan, Tong, Shenyang, Gong, Hailei
Recent studies show that Large Language Models (LLMs) achieve strong reasoning capabilities through supervised fine-tuning or reinforcement learning. However, a key approach, the Process Reward Model (PRM), suffers from reward hacking, making it unreliable for identifying the best intermediate step. In this paper, we propose a novel reward model, the Hierarchical Reward Model (HRM), which evaluates both individual and consecutive reasoning steps at fine-grained and coarse-grained levels. HRM excels at evaluating multi-step reasoning coherence and self-reflection, especially in scenarios where a previous reasoning step is incorrect but a subsequent step successfully identifies and corrects the error. Furthermore, to address the inefficiency of autonomously annotating PRM training data via Monte Carlo Tree Search (MCTS), we introduce a lightweight and effective data augmentation strategy, Hierarchical Node Compression (HNC), based on node merging (combining two consecutive reasoning steps into one) in the tree structure. By applying HNC to MCTS-generated reasoning trajectories, we enhance the diversity and robustness of HRM training data while introducing controlled noise with minimal computational overhead. Empirical results on the PRM800K dataset demonstrate that HRM, in conjunction with HNC, achieves superior stability and reliability in evaluation compared to PRM. Moreover, cross-domain evaluations on the MATH500 and GSM8K datasets confirm HRM's superior generalization and robustness across diverse reasoning tasks.
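To make the node-merging idea behind HNC concrete, the following is a minimal sketch of compressing consecutive reasoning steps in trajectories extracted from an MCTS-style reasoning tree. The StepNode class, the merge probability, and the rule that a merged step inherits the later step's label are illustrative assumptions, not the paper's exact procedure.

from dataclasses import dataclass
from typing import List
import random

@dataclass
class StepNode:
    text: str      # one reasoning step
    score: float   # step-level label (e.g., MC-estimated correctness)

def merge_steps(first: StepNode, second: StepNode) -> StepNode:
    """Combine two consecutive reasoning steps into a single coarser step."""
    # Assumption: the merged step keeps the later step's label, since it
    # reflects the state after both steps have been taken.
    return StepNode(text=first.text + "\n" + second.text, score=second.score)

def augment_trajectory(steps: List[StepNode], merge_prob: float = 0.3) -> List[StepNode]:
    """Produce a coarser-grained trajectory by occasionally merging adjacent steps."""
    out: List[StepNode] = []
    i = 0
    while i < len(steps):
        if i + 1 < len(steps) and random.random() < merge_prob:
            out.append(merge_steps(steps[i], steps[i + 1]))  # controlled coarsening
            i += 2
        else:
            out.append(steps[i])
            i += 1
    return out

Applied to many MCTS-generated trajectories, this kind of compression yields additional coarse-grained training samples at negligible cost, which is the role HNC plays for HRM training.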
Decision Information Meets Large Language Models: The Future of Explainable Operations Research
Zhang, Yansen, Kang, Qingcan, Yu, Wing Yin, Gong, Hailei, Fu, Xiaojin, Han, Xiongwei, Zhong, Tao, Ma, Chen
Operations Research (OR) is vital for decision-making in many industries. While recent OR methods have seen significant improvements in automation and efficiency through the integration of Large Language Models (LLMs), they still struggle to produce meaningful explanations. This lack of clarity raises concerns about transparency and trustworthiness in OR applications. To address these challenges, we propose a comprehensive framework, Explainable Operations Research (EOR), emphasizing actionable and understandable explanations that accompany optimization. The core of EOR is the concept of Decision Information, which emerges from what-if analysis and focuses on evaluating the impact of changes to complex constraints (or parameters) on decision-making. Specifically, we utilize bipartite graphs to quantify the changes in the OR model and adopt LLMs to improve the explanation capabilities. Additionally, we introduce the first industrial benchmark to rigorously evaluate the effectiveness of explanations and analyses in OR, establishing a new standard for transparency and clarity in the field.
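As a rough illustration of the bipartite-graph view, the sketch below places constraints on one side and decision variables on the other, with an edge whenever a variable appears in a constraint; a what-if change is then quantified by the edges it adds or removes. The edge-difference metric and the example model are assumptions for illustration, not the paper's exact Decision Information measure.

from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (constraint_name, variable_name)

def bipartite_edges(model: Dict[str, Set[str]]) -> Set[Edge]:
    """model maps each constraint name to the set of variables it involves."""
    return {(c, v) for c, variables in model.items() for v in variables}

def constraint_change(before: Dict[str, Set[str]], after: Dict[str, Set[str]]) -> Set[Edge]:
    """Edges added or removed when constraints (or parameters) are modified."""
    return bipartite_edges(before) ^ bipartite_edges(after)

# Hypothetical what-if scenario: a capacity constraint is tightened so that it
# now also involves variable "x3".
base = {"capacity": {"x1", "x2"}, "demand": {"x2", "x3"}}
what_if = {"capacity": {"x1", "x2", "x3"}, "demand": {"x2", "x3"}}
changed = constraint_change(base, what_if)  # {("capacity", "x3")}

The changed edges localize which decisions are affected by the modification; an LLM can then be prompted to explain the impact of exactly those changes rather than the whole model.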
BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving
Wang, Teng, Yu, Wing-Yin, He, Zhenqi, Liu, Zehua, Han, Xiongwei, Gong, Hailei, Wu, Han, Shi, Wei, She, Ruifeng, Zhu, Fangzhou, Zhong, Tao
LLMs exhibit advanced reasoning capabilities, offering the potential to transform natural language questions into mathematical models. However, existing open-source datasets in the operations research domain lack detailed annotations of the modeling process, such as variable definitions, and focus solely on objective values, which hinders reinforcement learning applications. To address this, we release the StructuredOR dataset, annotated with comprehensive labels that capture the complete mathematical modeling process. We further propose BPP-Search, an algorithm that integrates reinforcement learning into a tree-of-thought structure using Beam search, a Process reward model, and a pairwise Preference algorithm. This approach enables efficient exploration of tree structures, avoiding exhaustive search while improving accuracy. Extensive experiments on the StructuredOR, NL4OPT, and MAMO-ComplexLP datasets show that BPP-Search significantly outperforms state-of-the-art methods. In tree-based reasoning, BPP-Search excels in accuracy and efficiency, enabling faster retrieval of correct solutions.
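For intuition, the following is a minimal sketch of beam search over a tree of thought guided by a process reward model, in the spirit of BPP-Search. The expand(), prm_score(), and is_complete() callables are placeholders (assumptions), and the pairwise preference re-ranking used in the paper is omitted here.

from typing import Callable, List, Tuple

Partial = List[str]  # a partial modeling trajectory: a list of reasoning steps

def beam_search(
    root: Partial,
    expand: Callable[[Partial], List[str]],    # propose candidate next steps
    prm_score: Callable[[Partial], float],     # process reward for a partial trajectory
    is_complete: Callable[[Partial], bool],
    beam_width: int = 4,
    max_depth: int = 10,
) -> Partial:
    beam: List[Tuple[float, Partial]] = [(0.0, root)]
    for _ in range(max_depth):
        candidates: List[Tuple[float, Partial]] = []
        for _, partial in beam:
            if is_complete(partial):
                candidates.append((prm_score(partial), partial))
                continue
            for step in expand(partial):
                new_partial = partial + [step]
                candidates.append((prm_score(new_partial), new_partial))
        # Keep only the top-scoring partial trajectories, avoiding exhaustive search.
        beam = sorted(candidates, key=lambda x: x[0], reverse=True)[:beam_width]
        if all(is_complete(p) for _, p in beam):
            break
    return max(beam, key=lambda x: x[0])[1]

Pruning to a fixed beam width at each depth is what keeps exploration tractable while the process reward model steers the search toward promising modeling steps.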