AITopics

2412.15797

Country:

Asia (0.93)
North America > Mexico (0.28)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Outcome-Refining Process Supervision for Code Generation

Yu, Zhuohao, Gu, Weizheng, Wang, Yidong, Zeng, Zhengran, Wang, Jindong, Ye, Wei, Zhang, Shikun

Large Language Models have demonstrated remarkable capabilities in code generation, yet they often struggle with complex programming tasks that require deep algorithmic reasoning. While process supervision through learned reward models shows promise in guiding reasoning steps, it requires expensive training data and suffers from unreliable evaluation. We propose Outcome-Refining Process Supervision, a novel paradigm that treats outcome refinement itself as the process to be supervised. Our framework leverages concrete execution signals to ground the supervision of reasoning steps, while using tree-structured exploration to maintain multiple solution trajectories simultaneously. Experiments demonstrate that our approach enables even smaller models to achieve high accuracy and strong performance metrics on competitive programming tasks, and that it provides more reliable verification than traditional reward models without requiring the training of process reward models (PRMs). Our approach achieves significant improvements across 5 models and 3 datasets: an average increase of 26.9% in correctness and 42.2% in efficiency. The results suggest that providing a structured reasoning space with concrete verification signals is crucial for solving complex programming tasks. We open-source all our code and data at: https://github.com/zhuohaoyu/ORPS
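The idea of keeping several solution trajectories alive and ranking them by execution results can be illustrated with a toy beam search. This is only a sketch under stated assumptions, not the ORPS implementation: a "program" is reduced to a single integer threshold, `run` executes it, and `refine` is a hypothetical stand-in for the model proposing revised candidates.

```python
from typing import Callable, List, Tuple, TypeVar

C = TypeVar("C")               # candidate "program" (hypothetical representation)
TestCase = Tuple[int, int]     # (input, expected output)


def execution_score(run: Callable[[C, int], int], candidate: C,
                    tests: List[TestCase]) -> float:
    """Fraction of test cases passed; a crash counts as a failure."""
    passed = 0
    for arg, expected in tests:
        try:
            if run(candidate, arg) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(tests)


def beam_refine(initial: List[C], refine: Callable[[C], List[C]],
                run: Callable[[C, int], int], tests: List[TestCase],
                beam_width: int = 2, steps: int = 8) -> C:
    """Keep the `beam_width` best candidates per step, ranked by execution."""
    beam = list(initial)
    for _ in range(steps):
        # Pool the current beam with every refinement of every beam member.
        pool = beam + [child for cand in beam for child in refine(cand)]
        pool.sort(key=lambda c: execution_score(run, c, tests), reverse=True)
        beam = pool[:beam_width]
        if execution_score(run, beam[0], tests) == 1.0:
            break   # all tests pass, stop early
    return beam[0]


# Toy setup (all assumptions): the target behaviour is "1 if x >= 5 else 0".
toy_tests = [(x, 1 if x >= 5 else 0) for x in range(10)]


def toy_run(threshold: int, x: int) -> int:
    return 1 if x >= threshold else 0


def toy_refine(threshold: int) -> List[int]:
    return [threshold - 1, threshold + 1]


best = beam_refine([0], toy_refine, toy_run, toy_tests)
```

Starting from threshold 0, each step keeps the two candidates that pass the most tests, so the search climbs toward the threshold that passes all of them. The score here is a plain pass rate; the abstract also mentions efficiency, which this sketch does not measure.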

large language model, machine learning, preprint arxiv, (20 more...)

2412.15118

Country: Asia (0.67)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Oil & Gas > Downstream (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Sarkar, Anindya, DiChristofano, Alex, Das, Sanmay, Fowler, Patrick J., Jacobs, Nathan, Vorobeychik, Yevgeniy

Active Geospatial Search for Efficient Tenant Eviction Outreach

Tenant evictions threaten housing stability and are a major concern for many cities. An open question concerns whether data-driven methods enhance outreach programs that target at-risk tenants to mitigate their risk of eviction. We propose a novel active geospatial search (AGS) modeling framework for this problem. AGS integrates property-level information in a search policy that identifies a sequence of rental units to canvass to both determine their eviction risk and provide support if needed. We propose a hierarchical reinforcement learning approach to learn a search policy for AGS that scales to large urban areas containing thousands of parcels, balancing exploration and exploitation and accounting for travel costs and a budget constraint. Crucially, the search policy adapts online to newly discovered information about evictions. Evaluation using eviction data for a large urban area demonstrates that the proposed framework and algorithmic approach are considerably more effective at sequentially identifying eviction cases than baseline methods.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2412.17854

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Law (1.00)
Banking & Finance > Real Estate (1.00)
Government > Regional Government > North America Government > United States Government (0.46)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.46)

Dat, Pham Vu Tuan, Doan, Long, Binh, Huynh Thi Thanh

HSEvo: Elevating Automatic Heuristic Design with Diversity-Driven Harmony Search and Genetic Algorithm Using LLMs

Automatic Heuristic Design (AHD) is an active research area due to its utility in solving complex search and NP-hard combinatorial optimization problems in the real world. The recent advancements in Large Language Models (LLMs) introduce new possibilities by coupling LLMs with evolutionary computation to automatically generate heuristics, known as LLM-based Evolutionary Program Search (LLM-EPS). While previous LLM-EPS studies obtained great performance on various tasks, there is still a gap in understanding the properties of heuristic search spaces and achieving a balance between exploration and exploitation, which is a critical factor in large heuristic search spaces. In this study, we address this gap by proposing two diversity measurement metrics and performing an analysis of previous LLM-EPS approaches, including FunSearch, EoH, and ReEvo. Results on black-box AHD problems reveal that while EoH demonstrates higher diversity than FunSearch and ReEvo, its objective score is unstable. Conversely, ReEvo's reflection mechanism yields good objective scores but fails to optimize diversity effectively. With this finding in mind, we introduce HSEvo, an adaptive LLM-EPS framework that maintains a balance between diversity and convergence with a harmony search algorithm. Through experimentation, we find that HSEvo achieved high diversity indices and good objective scores while remaining cost-effective. These results underscore the importance of balancing exploration and exploitation and understanding heuristic search spaces in designing frameworks in LLM-EPS.
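For readers unfamiliar with harmony search, the following is a minimal textbook-style sketch on a numeric vector. It is not HSEvo's procedure, which the abstract does not spell out; the parameter names (`hmcr`, `par`, `bandwidth`) follow common usage, and the default values are arbitrary assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def harmony_search(
    cost: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    memory_size: int = 8,
    hmcr: float = 0.9,        # harmony memory considering rate
    par: float = 0.3,         # pitch adjusting rate
    bandwidth: float = 0.1,   # maximum size of a pitch adjustment
    iterations: int = 500,
    seed: int = 0,
) -> Tuple[List[float], float, float]:
    """Minimise `cost`; returns (best vector, its cost, best initial cost)."""
    rng = random.Random(seed)
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds]
              for _ in range(memory_size)]
    costs = [cost(h) for h in memory]
    initial_best = min(costs)

    for _ in range(iterations):
        new: List[float] = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                value = rng.choice(memory)[d]      # reuse a remembered value
                if rng.random() < par:
                    value += rng.uniform(-bandwidth, bandwidth)  # small tweak
            else:
                value = rng.uniform(lo, hi)        # explore a fresh value
            new.append(min(hi, max(lo, value)))    # clamp to the bounds
        new_cost = cost(new)
        worst = max(range(memory_size), key=costs.__getitem__)
        if new_cost < costs[worst]:                # replace the worst harmony
            memory[worst], costs[worst] = new, new_cost

    best = min(range(memory_size), key=costs.__getitem__)
    return memory[best], costs[best], initial_best


def sphere(x: Sequence[float]) -> float:
    return sum(v * v for v in x)


solution, final_cost, initial_cost = harmony_search(sphere, [(-5.0, 5.0)] * 2)
```

The `hmcr` parameter controls how often values are reused from memory (exploitation) versus drawn fresh (exploration), which is the kind of trade-off the abstract refers to. Because only the worst memory entry is ever replaced, the best cost can never get worse over the run.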

artificial intelligence, large language model, natural language, (16 more...)

2412.14995

Country:

Asia > Vietnam > Hanoi > Hanoi (0.04)
North America > United States > Virginia (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Transportation (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Progressive Multimodal Reasoning via Active Retrieval

Dong, Guanting, Zhang, Chenghao, Deng, Mengjie, Zhu, Yutao, Dou, Zhicheng, Wen, Ji-Rong

Multi-step multimodal reasoning tasks pose significant challenges for multimodal large language models (MLLMs), and finding effective ways to enhance their performance in such scenarios remains an unresolved issue. In this paper, we propose AR-MCTS, a universal framework designed to progressively improve the reasoning capabilities of MLLMs through Active Retrieval (AR) and Monte Carlo Tree Search (MCTS). Our approach begins with the development of a unified retrieval module that retrieves key supporting insights for solving complex reasoning problems from a hybrid-modal retrieval corpus. To bridge the gap in automated multimodal reasoning verification, we employ the MCTS algorithm combined with an active retrieval mechanism, which enables the automatic generation of step-wise annotations. This strategy dynamically retrieves key insights for each reasoning step, moving beyond traditional beam search sampling to improve the diversity and reliability of the reasoning space. Additionally, we introduce a process reward model that aligns progressively to support the automatic verification of multimodal reasoning tasks. Experimental results across three complex multimodal reasoning benchmarks confirm the effectiveness of the AR-MCTS framework in enhancing the performance of various multimodal models. Further analysis demonstrates that AR-MCTS can optimize sampling diversity and accuracy, yielding reliable multimodal reasoning.
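As background on the tree search component, the sketch below shows the standard UCT selection rule commonly used in Monte Carlo Tree Search. It is a generic illustration, not the AR-MCTS algorithm: the retrieval step and the process reward model are omitted, and the `Node` fields and the exploration constant `c` are assumptions.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    visits: int = 0
    total_reward: float = 0.0
    children: List["Node"] = field(default_factory=list)


def uct_score(parent_visits: int, child: Node, c: float = 1.4) -> float:
    """Mean reward plus an exploration bonus that shrinks with visits."""
    if child.visits == 0:
        return math.inf                      # always try unvisited children
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(parent: Node, c: float = 1.4) -> Node:
    """Pick the child with the highest UCT score."""
    return max(parent.children, key=lambda ch: uct_score(parent.visits, ch, c))


# Hypothetical statistics: "a" has the higher mean reward, "b" is less explored.
root = Node("root", visits=10, children=[Node("a", 8, 4.0), Node("b", 2, 0.8)])
chosen_default = select_child(root).label        # exploration bonus favours "b"
chosen_greedy = select_child(root, c=0.0).label  # pure exploitation favours "a"
```

With `c = 0` the rule is purely greedy; a larger `c` steers the search toward rarely visited reasoning steps, which is one way a tree search can widen the set of trajectories it samples.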

large language model, machine learning, natural language, (20 more...)

2412.14835

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
(12 more...)

Genre:

Workflow (0.93)
Research Report > New Finding (0.45)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.87)

Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling

Li, Junyi, Ng, Hwee Tou

Despite their outstanding capabilities, large language models (LLMs) are prone to hallucination and producing factually incorrect information. This challenge has spurred efforts in attributed text generation, which prompts LLMs to generate content with supporting evidence. In this paper, we propose a novel framework, called Think&Cite, and formulate attributed text generation as a multi-step reasoning problem integrated with search. Specifically, we propose Self-Guided Monte Carlo Tree Search (SG-MCTS), which capitalizes on the self-reflection capability of LLMs to reflect on the intermediate states of MCTS for guiding the tree expansion process. To provide reliable and comprehensive feedback, we introduce Progress Reward Models to measure the progress of tree search from the root to the current state from two aspects, i.e., generation and attribution progress. We conduct extensive experiments on three datasets and the results show that our approach significantly outperforms baseline approaches.

large language model, machine learning, natural language, (18 more...)

2412.14860

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Texas (0.14)
Asia > Singapore (0.04)
(15 more...)

Genre: Research Report (0.84)

Industry:

Leisure & Entertainment > Sports > Football (0.94)
Health & Medicine (0.68)
Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Feng, Shengyu, Yang, Yiming

SORREL: Suboptimal-Demonstration-Guided Reinforcement Learning for Learning to Branch

Mixed Integer Linear Program (MILP) solvers are mostly built upon a Branch-and-Bound (B&B) algorithm, where the efficiency of traditional solvers heavily depends on hand-crafted heuristics for branching. The past few years have witnessed the increasing popularity of data-driven approaches to automatically learn these heuristics. However, the success of these methods is highly dependent on the availability of high-quality demonstrations, which requires either the development of near-optimal heuristics or a time-consuming sampling process. This paper averts this challenge by proposing Suboptimal-Demonstration-Guided Reinforcement Learning (SORREL) for learning to branch. SORREL selectively learns from suboptimal demonstrations based on value estimation. It utilizes suboptimal demonstrations through both offline reinforcement learning on the demonstrations generated by suboptimal heuristics and self-imitation learning on past good experiences sampled by itself. Our experiments demonstrate its advanced performance in both branching quality and training efficiency over previous methods for various MILPs.
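To make the role of a branching heuristic concrete, here is a small branch-and-bound sketch with a pluggable branching rule. It is an illustration under assumptions, not SORREL or a real MILP solver: the problem is a toy 0/1 knapsack (chosen because its LP relaxation has a closed-form greedy solution), and `choose` is a hand-written placeholder for where a learned policy would go.

```python
from typing import Callable, Dict, List, Tuple

Item = Tuple[int, int]   # (value, weight), weights assumed positive


def lp_bound(items: List[Item], capacity: int, fixed: Dict[int, int]) -> float:
    """Upper bound from the LP relaxation, given variables fixed to 0 or 1."""
    value = 0.0
    remaining = capacity
    for i, x in fixed.items():
        if x == 1:
            value += items[i][0]
            remaining -= items[i][1]
    if remaining < 0:
        return float("-inf")                  # fixed items do not fit
    free = [i for i in range(len(items)) if i not in fixed]
    free.sort(key=lambda i: items[i][0] / items[i][1], reverse=True)
    for i in free:
        v, w = items[i]
        if w <= remaining:
            value += v
            remaining -= w
        else:
            value += v * remaining / w        # fractional part of one item
            break
    return value


def branch_and_bound(items: List[Item], capacity: int,
                     choose: Callable[[List[Item], List[int]], int]
                     ) -> Tuple[int, int]:
    """Returns (optimal value, number of nodes explored)."""
    best, nodes = 0, 0
    stack: List[Dict[int, int]] = [{}]
    while stack:
        fixed = stack.pop()
        nodes += 1
        bound = lp_bound(items, capacity, fixed)
        if bound <= best:
            continue                          # prune: cannot beat incumbent
        free = [i for i in range(len(items)) if i not in fixed]
        if not free:
            best = int(bound)                 # leaf: bound is the exact value
            continue
        var = choose(items, free)             # the branching decision
        stack.append({**fixed, var: 0})
        stack.append({**fixed, var: 1})
    return best, nodes


def first_free(items: List[Item], free: List[int]) -> int:
    return free[0]


def best_ratio(items: List[Item], free: List[int]) -> int:
    return max(free, key=lambda i: items[i][0] / items[i][1])


toy_items = [(60, 10), (100, 20), (120, 30)]
value_a, nodes_a = branch_and_bound(toy_items, 50, first_free)
value_b, nodes_b = branch_and_bound(toy_items, 50, best_ratio)
```

Any branching rule gives the same optimum; what changes is the number of nodes explored, which is the quantity a good branching policy tries to reduce.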

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2412.15534

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Tabletop Object Rearrangement: Structure, Complexity, and Efficient Combinatorial Search-Based Solutions

Gao, Kai

This thesis aims to provide a complete structural analysis and efficient algorithmic solutions to tabletop object rearrangement with overhand grasps (TORO). This problem captures a common task that we solve on a daily basis and is essential in enabling truly intelligent robotic manipulation. When rearranging many objects in a confined workspace, on the one hand, action sequencing with the least pick-n-places in TORO is NP-hard [han2018complexity]; on the other hand, temporarily relocating objects to some free space ("buffer poses") may be necessary but highly challenging in a cluttered environment. Focusing on these two challenges, the thesis covers TORO in four different setups, including varied workspace assumptions (with/without external buffers) and manipulator settings (single/dual-arms or a mobile manipulator). The thesis first explores TORO with external buffers (TORE), addressing the size of needed space for temporary object relocation ("running buffers"). This study shows that finding the maximum running buffers (MRB) is NP-hard and that MRB can grow unbounded with an increasing number of objects, even with uniform shapes. Exact algorithms developed for both labeled and unlabeled settings can scale to over 100 objects. The thesis further extends the TORE algorithms to tabletop rearrangement with internal buffers (TORI), where all temporary object placements need to be inside the workspace.

artificial intelligence, buffer, machine learning, (18 more...)

2412.15398

Country: North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)

Genre: Research Report > Promising Solution (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Enhancing Large-scale UAV Route Planing with Global and Local Features via Reinforcement Graph Fusion

Zhou, Tao, Ye, Kai, Shi, Zeyu, Lin, Jiajing, Xu, Dejun, Jiang, Min

Numerous remarkable advancements have been made in accuracy, speed, and parallelism for solving the Unmanned Aerial Vehicle Route Planning (UAVRP) problem. However, existing UAVRP solvers face challenges when attempting to scale effectively and efficiently for larger instances. In this paper, we present a generalization framework that enables current UAVRP solvers to robustly extend their capabilities to larger instances, accommodating up to 10,000 points, using widely recognized test sets. The UAVRP under a large number of patrol points is a typical large-scale TSP problem. Our proposed framework comprises three distinct steps. Firstly, we employ Delaunay triangulation to extract subgraphs from large instances while preserving global features. Secondly, we utilize an embedded TSP solver to obtain sub-results, followed by graph fusion. Finally, we implement a decoding strategy customizable to the user's requirements, resulting in high-quality solutions, complemented by a warming-up process for the heatmap. To demonstrate the flexibility of our approach, we integrate two representative TSP solvers into our framework and conduct a comprehensive comparative analysis against existing algorithms using large TSP benchmark datasets. The results unequivocally demonstrate that our framework efficiently scales existing TSP solvers to handle large instances and consistently outperforms state-of-the-art (SOTA) methods. Furthermore, since our proposed framework does not necessitate additional training or fine-tuning, we believe that its generality can significantly advance research on end-to-end UAVRP solvers, enabling the application of a broader range of methods to real-world scenarios.

artificial intelligence, machine learning, solver, (15 more...)

2412.15537

Country: Asia > China (0.30)

Genre: Research Report (1.00)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.34)

Herynek, Jáchym, Edelkamp, Stefan

Heuristic Planner for Communication-Constrained Multi-Agent Multi-Goal Path Planning

arXiv.org Artificial Intelligence, Dec-18-2024

Abstract: In robotics, coordinating a group of robots is an essential task. This work presents the communication-constrained multi-agent multi-goal path planning problem and proposes a graph-search based algorithm to address this task. Given a fleet of robots, an environment represented by a weighted graph, and a sequence of goals, the aim is to visit all the goals without breaking the communication constraints between the agents, minimizing the completion time. While the red agent visits the first goal, the other two agents position themselves favorably with respect [...]. [...] are many ways the agents might interact with each other and their environment, and there are many limitations one [...]. This work is motivated by the constraint of limited communication distance. It establishes the problem [...]. As long as the communication remains unbroken, the whole system works as if each of the robots had access to the computational power of the arbiter. As a similar example, imagine a mother ship-style drone that sends out small drones.
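One way to picture a communication constraint between agents is as a connectivity check on the team. The sketch below rests on assumptions the excerpt does not confirm: agents have 2D positions, two agents can talk when their Euclidean distance is at most `comm_range`, and messages may be relayed through other agents. The paper's actual constraint model may differ.

```python
import math
from collections import deque
from typing import List, Tuple

Point = Tuple[float, float]


def team_connected(positions: List[Point], comm_range: float) -> bool:
    """True if every agent can reach every other agent via in-range hops."""
    if not positions:
        return True
    seen = {0}
    queue = deque([0])
    while queue:                         # breadth-first search from agent 0
        a = queue.popleft()
        for b in range(len(positions)):
            in_range = math.dist(positions[a], positions[b]) <= comm_range
            if b not in seen and in_range:
                seen.add(b)
                queue.append(b)
    return len(seen) == len(positions)


# Hypothetical team layouts with a communication range of 1.0.
chain_ok = team_connected([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], 1.0)
chain_broken = team_connected([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)], 1.0)
```

A planner could call such a check on each candidate joint move and discard moves that would disconnect the team.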

agent, artificial intelligence, planning & scheduling, (19 more...)

2412.13719

Country: Europe > Czechia > Prague (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.34)