AITopics

doi: 10.34737/qq4w7

2509.22704

Country:

Europe (0.92)
Asia (0.67)
North America > United States > California (0.27)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Information Technology > Software (1.00)
Information Technology > Services (1.00)
Energy > Power Industry (1.00)
(6 more...)

Technology:

Information Technology > Virtualization (1.00)
Information Technology > Software > Programming Languages (1.00)
Information Technology > Information Management > Search (1.00)
(13 more...)

arXiv.org Artificial Intelligence Sep-30-2025

Learning to Segment for Vehicle Routing Problems

Ouyang, Wenbin, Li, Sirui, Ma, Yining, Wu, Cathy

Iterative heuristics are widely recognized as state-of-the-art for Vehicle Routing Problems (VRPs). In this work, we exploit a critical observation: a large portion of the solution remains stable, i.e., unchanged across search iterations, causing redundant computations, especially for large-scale VRPs with long subtours. To address this, we pioneer the formal study of the First-Segment-Then-Aggregate (FSTA) decomposition technique to accelerate iterative solvers. FSTA preserves stable solution segments during the search, aggregates nodes within each segment into fixed hypernodes, and focuses the search only on unstable portions. Yet, a key challenge lies in identifying which segments should be aggregated. To this end, we introduce Learning-to-Segment (L2Seg), a novel neural framework to intelligently differentiate potentially stable and unstable portions for FSTA decomposition. We present three L2Seg variants: non-autoregressive (globally comprehensive but locally indiscriminate), autoregressive (locally refined but globally deficient), and their synergy. Empirical results on CVRP and VRPTW show that L2Seg accelerates state-of-the-art solvers by 2x to 7x. We further provide in-depth analysis showing why synergy achieves the best performance. Notably, L2Seg is compatible with traditional, learning-based, and hybrid solvers, while supporting various VRPs.
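The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `aggregate_segments`, the node-level `stable` set, and the use of summed demand as the only hypernode attribute are assumptions made for illustration. In the paper, deciding which segments are stable is what L2Seg learns.

```python
from itertools import groupby

def aggregate_segments(route, stable, demand):
    """Collapse each maximal run of stable nodes in a route into one hypernode.

    route  : list of node ids in visiting order (depot excluded)
    stable : set of node ids assumed stable (hypothetical input; the paper
             predicts this with a neural model)
    demand : dict mapping node id to its demand
    Returns a list of (nodes, total_demand) pairs, where nodes is a tuple.
    """
    reduced = []
    for is_stable, run in groupby(route, key=lambda n: n in stable):
        run = list(run)
        if is_stable and len(run) > 1:
            # A run of two or more stable nodes becomes a single fixed hypernode.
            reduced.append((tuple(run), sum(demand[n] for n in run)))
        else:
            # Unstable nodes (and isolated stable ones) stay as individual nodes.
            reduced.extend(((n,), demand[n]) for n in run)
    return reduced
```

A solver would then search over the shorter reduced route and expand each hypernode back into its original node sequence afterwards.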

artificial intelligence, machine learning, natural language, (16 more...)

2507.01037

Genre: Research Report > New Finding (0.45)

Industry: Transportation > Freight & Logistics Services (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.68)
(2 more...)

Neural Information Processing Systems Sep-29-2025, 14:12:40 GMT

d5c56ec4f69c9a473089b16000d3f8cd-Paper-Conference.pdf

invariant, large language model, natural language, (18 more...)

Country:

North America > Canada (0.28)
North America > United States (0.28)
Europe > Switzerland (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.45)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.93)
(3 more...)

Neural Information Processing Systems Sep-29-2025, 10:21:19 GMT

ca5520b5672ea120b23bde75c46e76c6-Supplemental.pdf

artificial intelligence, configuration, machine learning, (18 more...)

Country:

North America > Canada > Alberta (0.14)
Europe > United Kingdom > England (0.14)

Industry: Energy > Oil & Gas (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)

Heidari, Alireza, Zhang, Wei, Xiong, Ying

FusedANN: Convexified Hybrid ANN via Attribute-Vector Fusion

Vector search powers transformers technology, but real-world use demands hybrid queries that combine vector similarity with attribute filters (e.g., "top document in category X, from 2023"). Current solutions trade off recall, speed, and flexibility, relying on fragile index hacks that don't scale. We introduce FusedANN (Fused Attribute-Vector Nearest Neighbor), a geometric framework that elevates filtering to ANN optimization constraints and introduces a convex fused space via a Lagrangian-like relaxation. Our method jointly embeds attributes and vectors through transformer-based convexification, turning hard filters into continuous, weighted penalties that preserve top-k semantics while enabling efficient approximate search. We prove that FusedANN reduces to exact filtering under high selectivity, gracefully relaxes to semantically nearest attributes when exact matches are insufficient, and preserves downstream ANN alpha-approximation guarantees. Empirically, FusedANN improves query throughput by eliminating brittle filtering stages, achieving superior recall-latency tradeoffs on standard hybrid benchmarks without specialized index hacks, delivering up to 3 times higher throughput and better recall than state-of-the-art hybrid and graph-based systems. Theoretically, we provide explicit error bounds and parameter selection rules that make FusedANN practical for production. This establishes a principled, scalable, and verifiable bridge between symbolic constraints and vector similarity, unlocking a new generation of filtered retrieval systems for large, hybrid, and dynamic NLP/ML workloads.
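The idea of turning a hard filter into a continuous, weighted penalty can be sketched as follows. This is a simplified stand-in, not the FusedANN transformation itself: it uses a plain additive penalty with brute-force ranking, and the names `fused_distance`, `top_k`, and the weight `lam` are hypothetical.

```python
def fused_distance(q_vec, q_attr, x_vec, x_attr, lam):
    """Squared vector distance plus a weighted squared attribute mismatch."""
    vec_term = sum((a - b) ** 2 for a, b in zip(q_vec, x_vec))
    attr_term = sum((a - b) ** 2 for a, b in zip(q_attr, x_attr))
    return vec_term + lam * attr_term

def top_k(q_vec, q_attr, items, k, lam):
    """Rank items by fused distance; items are (id, vector, attribute) triples."""
    ranked = sorted(
        items,
        key=lambda it: fused_distance(q_vec, q_attr, it[1], it[2], lam),
    )
    return [it[0] for it in ranked[:k]]
```

With `lam = 0` the ranking ignores attributes entirely. As `lam` grows, items whose attributes differ from the query are pushed down the ranking, which approximates a hard filter while still returning the nearest alternatives when too few exact matches exist.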

information retrieval, machine learning, natural language, (20 more...)

2509.19767

Country: North America > United States (0.67)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.93)

Rydin, Filip, Lischka, Attila, Wu, Jiaming, Chehreghani, Morteza Haghir, Kulcsár, Balázs

Beyond Simple Graphs: Neural Multi-Objective Routing on Multigraphs

Learning-based methods for routing have gained significant attention in recent years, both in single-objective and multi-objective contexts. Yet, existing methods are unsuitable for routing on multigraphs, which feature multiple edges with distinct attributes between node pairs, despite their strong relevance in real-world scenarios. In this paper, we propose two graph neural network-based methods to address multi-objective routing on multigraphs. Our first approach operates directly on the multigraph by autoregressively selecting edges until a tour is completed. The second model, which is more scalable, first simplifies the multigraph via a learned pruning strategy and then performs autoregressive routing on the resulting simple graph. The field of neural combinatorial optimization has grown significantly in recent years and vehicle routing problems in particular have attracted much attention (Zhou et al., 2024a). While early works focused on the Traveling Salesman Problem (TSP) (Vinyals et al., 2015; Bello et al., 2017), new learning-based methods for vehicle routing can solve a wide range of problems efficiently and effectively, often surpassing classical state-of-the-art heuristics (Zhou et al., 2024b; Drakulic et al., 2025). Yet, existing methods have one limitation in common: they assume problems are defined on simple graphs. However, multigraph formulations, featuring several edges between each node pair, become relevant as soon as there are competing edges that cannot be chosen between a priori. Such situations typically occur when edges have more than one feature of interest, such as both travel time and distance. In spite of the high practical relevance of multigraph formulations (Lai et al., 2016; Ben Ticha et al., 2017), current learning-based methods are incapable of handling them for two main reasons. Firstly, many state-of-the-art neural solvers rely on transformers to encode the problem instance. While these work well in the Euclidean setting (Kool et al., 2018) and with some modifications on asymmetric, directed graphs (Kwon et al., 2021), they lack the capability to encode multigraph structures. Secondly and more importantly, planning routes in multigraphs requires both selecting the node order and which edges to traverse, making current decoding strategies unsuitable. In this work, we aim to bridge the gap between learning-based methods for routing and accurate network representations given by multigraphs. We focus on the Multi-Objective (MO) setting, as several competing objectives naturally translate to several competing edges between each node pair. Nevertheless, our methods are general and can easily be extended to single-objective settings. Our code will be released publicly upon paper acceptance. To our knowledge, we present the first neural solvers designed for such structures.
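To make "competing edges that cannot be chosen between a priori" concrete, the sketch below filters the parallel edges between one node pair down to the Pareto non-dominated ones. This is a standard dominance check, not the learned pruning strategy the paper proposes, and the name `nondominated_edges` and the (travel time, distance) edge encoding are assumptions for illustration.

```python
def nondominated_edges(parallel_edges):
    """Keep only the parallel edges that no other edge dominates.

    parallel_edges: list of cost tuples, e.g. (travel_time, distance),
    for the edges joining a single node pair. Lower is better on every cost.
    """
    def dominates(a, b):
        # a dominates b if it is no worse on every cost and differs from b.
        return all(x <= y for x, y in zip(a, b)) and a != b

    return [
        e for e in parallel_edges
        if not any(dominates(other, e) for other in parallel_edges)
    ]
```

Edges that survive this filter trade one objective against another, so a solver has to pick among them during routing rather than before it.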

artificial intelligence, gms-dh, machine learning, (17 more...)

2506.22095

Genre: Research Report (1.00)

Industry: Transportation > Freight & Logistics Services (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)

Huang, Junquan, Chen, Zong-Gan, Jiang, Yuncheng, Zhan, Zhi-Hui

RsGCN: Subgraph-Based Rescaling Enhances Generalization of GCNs for Solving Traveling Salesman Problems

GCN-based traveling salesman problem (TSP) solvers face two critical challenges: poor cross-scale generalization for TSPs and high training costs. To address these challenges, we propose a Subgraph-Based Rescaling Graph Convolutional Network (RsGCN). Focusing on the scale-dependent features (i.e., features varied with problem scales) related to nodes and edges, we design the subgraph-based rescaling to normalize edge lengths of subgraphs. Under a unified subgraph perspective, RsGCN can efficiently learn scale-generalizable representations from small-scale TSPs at low cost. To exploit and assess the heatmaps generated by RsGCN, we design a Reconstruction-Based Search (RBS), in which a reconstruction process based on adaptive weight is incorporated to help avoid local optima. Based on a combined architecture of RsGCN and RBS, our solver achieves remarkable generalization and low training cost: with only 3 epochs of training on a mixed-scale dataset containing instances with up to 100 nodes, it can be generalized successfully to 10K-node instances without any fine-tuning. Extensive experiments demonstrate our advanced performance across uniform-distribution instances of 9 different scales from 20 to 10K nodes and 78 real-world instances from TSPLIB, while requiring the fewest learnable parameters and training epochs among neural competitors. Background: The traveling salesman problem (TSP), as a typical combinatorial optimization problem, has been extensively studied in the literature.
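One way to picture normalizing edge lengths within subgraphs is sketched below. The abstract does not specify the rule, so this example assumes k-nearest-neighbour subgraphs and division by the largest edge length in each subgraph; the function name `rescaled_knn_subgraphs` is hypothetical.

```python
import math

def rescaled_knn_subgraphs(coords, k):
    """For each node, return its k nearest neighbours with rescaled distances.

    coords: list of (x, y) points; assumes at least two points and k >= 1.
    Distances inside each subgraph are divided by the largest one, so they
    fall in (0, 1] regardless of the overall instance scale.
    """
    subgraphs = {}
    for i, p in enumerate(coords):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(coords) if j != i
        )[:k]
        scale = nearest[-1][0] or 1.0  # guard against coincident points
        subgraphs[i] = [(j, d / scale) for d, j in nearest]
    return subgraphs
```

Under this assumed rule, uniformly scaling all coordinates leaves the rescaled distances unchanged, which is the kind of scale-independent input the abstract attributes to RsGCN.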

artificial intelligence, machine learning, node, (15 more...)

2506.00533

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.91)

Latent Veracity Inference for Identifying Errors in Stepwise Reasoning

Kim, Minsu, Falet, Jean-Pierre, Richardson, Oliver E., Chen, Xiaoyin, Jain, Moksh, Ahn, Sungjin, Ahn, Sungsoo, Bengio, Yoshua

Chain-of-Thought (CoT) reasoning has advanced the capabilities and transparency of language models (LMs); however, reasoning chains can contain inaccurate statements that reduce performance and trustworthiness. To address this, we propose to augment each reasoning step in a CoT with a latent veracity (or correctness) variable. To efficiently explore this expanded space, we introduce Veracity Search (VS), a discrete search algorithm over veracity assignments. It performs otherwise intractable inference in the posterior distribution over latent veracity values by leveraging the LM's joint likelihood over veracity and the final answer as a proxy reward. This efficient inference-time verification method facilitates supervised fine-tuning of an Amortized Veracity Inference (AVI) machine by providing pseudo-labels for veracity. AVI generalizes VS, enabling accurate zero-shot veracity inference in novel contexts. Empirical results demonstrate that VS reliably identifies errors in logical (ProntoQA), mathematical (GSM8K), and commonsense (CommonsenseQA) reasoning benchmarks, with AVI achieving comparable zero-shot accuracy. Finally, we demonstrate the utility of latent veracity inference for providing feedback during self-correction and self-improvement.

large language model, machine learning, natural language, (16 more...)

2505.11824

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
(2 more...)

VLA-Reasoner: Empowering Vision-Language-Action Models with Reasoning via Online Monte Carlo Tree Search

Guo, Wenkai, Lu, Guanxing, Deng, Haoyuan, Wu, Zhenyu, Tang, Yansong, Wang, Ziwei

Vision-Language-Action models (VLAs) achieve strong performance in general robotic manipulation tasks by scaling imitation learning. However, existing VLAs are limited to short-sighted next-action prediction and therefore struggle with long-horizon trajectory tasks due to incremental deviations. To address this problem, we propose a plug-in framework named VLA-Reasoner that equips off-the-shelf VLAs with the capability of foreseeing future states via test-time scaling. Specifically, VLA-Reasoner samples possible action trajectories and rolls them out through a world model to generate future states, which enables it to foresee and reason about potential outcomes and search for the optimal actions. We further leverage Monte Carlo Tree Search (MCTS) to improve search efficiency in large action spaces, where stepwise VLA predictions seed the root. Meanwhile, we introduce a confidence sampling mechanism based on Kernel Density Estimation (KDE) to enable efficient exploration in MCTS without redundant VLA queries. We evaluate intermediate states in MCTS via an offline reward shaping strategy to score predicted futures and correct deviations with long-term feedback. We conducted extensive experiments in both simulators and the real world, demonstrating that our proposed VLA-Reasoner achieves significant improvements over the state-of-the-art VLAs. Our method highlights a potential pathway toward scalable test-time computation of robotic manipulation.

artificial intelligence, arxiv preprint arxiv, planning & scheduling, (14 more...)

2509.22643

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.86)

Futuhi, Ehsan, Sturtevant, Nathan R.

Learning Admissible Heuristics for A*: Theory and Practice

Heuristic functions are central to the performance of search algorithms such as A-star, where admissibility - the property of never overestimating the true shortest-path cost - guarantees solution optimality. Recent deep learning approaches often disregard admissibility and provide limited guarantees on generalization beyond the training data. This paper addresses both of these limitations. First, we pose heuristic learning as a constrained optimization problem and introduce Cross-Entropy Admissibility (CEA), a loss function that enforces admissibility during training. On the Rubik's Cube domain, this method yields near-admissible heuristics with significantly stronger guidance than compressed pattern database (PDB) heuristics. Theoretically, we study the sample complexity of learning heuristics. By leveraging PDB abstractions and the structural properties of graphs such as the Rubik's Cube, we tighten the bound on the number of training samples needed for A-star to generalize. Replacing a general hypothesis class with a ReLU neural network gives bounds that depend primarily on the network's width and depth, rather than on graph size. Using the same network, we also provide the first generalization guarantees for goal-dependent heuristics.
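The admissibility property defined in the abstract, that a heuristic never overestimates the true shortest-path cost, can be checked directly on a small graph. The sketch below is not the paper's CEA loss: it only verifies admissibility by comparing a heuristic with exact costs from Dijkstra's algorithm. The function names and the undirected, dictionary-based graph encoding are assumptions for illustration.

```python
import heapq

def true_costs_to_goal(graph, goal):
    """Exact shortest-path cost from every reachable node to the goal.

    graph: dict mapping node -> {neighbour: edge_cost}; assumed undirected
    with non-negative costs, so costs from the goal equal costs to the goal.
    """
    dist = {goal: 0}
    heap = [(0, goal)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def inadmissible_states(h, graph, goal):
    """Return the states where heuristic h overestimates the true cost."""
    h_star = true_costs_to_goal(graph, goal)
    return [s for s in h_star if h(s) > h_star[s]]
```

An empty result means the heuristic is admissible on that graph, so A* using it returns optimal solutions. Exhaustive checking like this is only feasible on small state spaces, which is why learned heuristics for domains such as the Rubik's Cube need training-time enforcement and generalization bounds instead.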

artificial intelligence, heuristic function, machine learning, (18 more...)

2509.22626

Country:

North America (0.46)
Oceania > Australia (0.28)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Rubik's Cube (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)