Goto

Collaborating Authors

 Search


M-Predictive Spliner: Enabling Spatiotemporal Multi-Opponent Overtaking for Autonomous Racing

arXiv.org Artificial Intelligence

Unrestricted multi-agent racing presents a significant research challenge, requiring decision-making at the limits of a robot's operational capabilities. While previous approaches have either ignored spatiotemporal information in the decision-making process or been restricted to single-opponent scenarios, this work enables arbitrary multi-opponent head-to-head racing while considering the opponents' future intent. The proposed method employs a KF-based multi-opponent tracker to effectively perform opponent ReID by associating them across observations. Simultaneously, spatial and velocity GPR is performed on all observed opponent trajectories, providing predictive information to compute the overtaking maneuvers. This approach has been experimentally validated on a physical 1:10 scale autonomous racing car, achieving an overtaking success rate of up to 91.65% and demonstrating an average 10.13%-point improvement in safety at the same speed as the previous SotA. These results highlight its potential for high-performance autonomous racing.


Deep Reinforcement Learning Xiangqi Player with Monte Carlo Tree Search

arXiv.org Artificial Intelligence

This paper presents a Deep Reinforcement Learning (DRL) system for Xiangqi (Chinese Chess) that integrates neural networks with Monte Carlo Tree Search (MCTS) to enable strategic self-play and self-improvement. Addressing the underexplored complexity of Xiangqi--including its unique board layout, piece movement constraints, and victory conditions--our approach combines policy-value networks with MCTS to simulate move consequences and refine decision-making.By overcoming challenges such as Xiangqi's high branching factor and asymmetrical piece dynamics, our work advances AI capabilities in culturally significant strategy games while providing insights for adapting DRL-MCTS frameworks to domain-specific rule systems.


Advancing Stochastic 3-SAT Solvers by Dissipating Oversatisfied Constraints

arXiv.org Artificial Intelligence

We introduce and benchmark a stochastic local search heuristic for the NP-complete satisfiability problem 3-SAT that drastically outperforms existing solvers in the notoriously difficult realm of critically hard instances. Our construction is based on the crucial observation that well established previous approaches such as WalkSAT are prone to get stuck in local minima that are distinguished from true solutions by a larger number of oversatisfied combinatorial constraints. To address this issue, the proposed algorithm, coined DOCSAT, dissipates oversatisfied constraints (DOC), i.e. reduces their unfavorable abundance so as to render them critical. We analyze and benchmark our algorithm on a randomly generated sample of hard but satisfiable 3-SAT instances with varying problem sizes up to N=15000. Quite remarkably, we find that DOCSAT outperforms both WalkSAT and other well known algorithms including the complete solver Kissat, even when comparing its ability to solve the hardest quintile of the sample to the average performance of its competitors. The essence of DOCSAT may be seen as a way of harnessing statistical structure beyond the primary cost function of a combinatorial problem to avoid or escape local minima traps in stochastic local search, which opens avenues for generalization to other optimization problems.


On Training-Test (Mis)alignment in Unsupervised Combinatorial Optimization: Observation, Empirical Exploration, and Analysis

arXiv.org Artificial Intelligence

In unsupervised combinatorial optimization (UCO), during training, one aims to have continuous decisions that are promising in a probabilistic sense for each training instance, which enables end-to-end training on initially discrete and non-differentiable problems. At the test time, for each test instance, starting from continuous decisions, derandomization is typically applied to obtain the final deterministic decisions. Researchers have developed more and more powerful test-time derandomization schemes to enhance the empirical performance and the theoretical guarantee of UCO methods. However, we notice a misalignment between training and testing in the existing UCO methods. Consequently, lower training losses do not necessarily entail better post-derandomization performance, even for the training instances without any data distribution shift. Empirically, we indeed observe such undesirable cases. We explore a preliminary idea to better align training and testing in UCO by including a differentiable version of derandomization into training. Our empirical exploration shows that such an idea indeed improves training-test alignment, but also introduces nontrivial challenges into training.


EFormer: An Effective Edge-based Transformer for Vehicle Routing Problems

arXiv.org Artificial Intelligence

Recent neural heuristics for the Vehicle Routing Problem (VRP) primarily rely on node coordinates as input, which may be less effective in practical scenarios where real cost metrics-such as edge-based distances-are more relevant. To address this limitation, we introduce EFormer, an Edge-based Transformer model that uses edge as the sole input for VRPs. Our approach employs a precoder module with a mixed-score attention mechanism to convert edge information into temporary node embeddings. We also present a parallel encoding strategy characterized by a graph encoder and a node encoder, each responsible for processing graph and node embeddings in distinct feature spaces, respectively. This design yields a more comprehensive representation of the global relationships among edges. In the decoding phase, parallel context embedding and multi-query integration are used to compute separate attention mechanisms over the two encoded embeddings, facilitating efficient path construction. We train EFormer using reinforcement learning in an autoregressive manner. Extensive experiments on the Traveling Salesman Problem (TSP) and Capacitated Vehicle Routing Problem (CVRP) reveal that EFormer outperforms established baselines on synthetic datasets, including large-scale and diverse distributions. Moreover, EFormer demonstrates strong generalization on real-world instances from TSPLib and CVRPLib. These findings confirm the effectiveness of EFormer's core design in solving VRPs.


Insights on Adversarial Attacks for Tabular Machine Learning via a Systematic Literature Review

arXiv.org Artificial Intelligence

Adversarial attacks in machine learning have been extensively reviewed in areas like computer vision and NLP, but research on tabular data remains scattered. This paper provides the first systematic literature review focused on adversarial attacks targeting tabular machine learning models. We highlight key trends, categorize attack strategies and analyze how they address practical considerations for real-world applicability. Additionally, we outline current challenges and open research questions. By offering a clear and structured overview, this review aims to guide future efforts in understanding and addressing adversarial vulnerabilities in tabular machine learning.


Learn to Vaccinate: Combining Structure Learning and Effective Vaccination for Epidemic and Outbreak Control

arXiv.org Artificial Intelligence

The Susceptible-Infected-Susceptible (SIS) model is a widely used model for the spread of information and infectious diseases, particularly non-immunizing ones, on a graph. Given a highly contagious disease, a natural question is how to best vaccinate individuals to minimize the disease's extinction time. While previous works showed that the problem of optimal vaccination is closely linked to the NP-hard Spectral Radius Minimization (SRM) problem, they assumed that the graph is known, which is often not the case in practice. In this work, we consider the problem of minimizing the extinction time of an outbreak modeled by an SIS model where the graph on which the disease spreads is unknown and only the infection states of the vertices are observed. To this end, we split the problem into two: learning the graph and determining effective vaccination strategies. We propose a novel inclusion-exclusion-based learning algorithm and, unlike previous approaches, establish its sample complexity for graph recovery. We then detail an optimal algorithm for the SRM problem and prove that its running time is polynomial in the number of vertices for graphs with bounded treewidth. This is complemented by an efficient and effective polynomial-time greedy heuristic for any graph. Finally, we present experiments on synthetic and real-world data that numerically validate our learning and vaccination algorithms.


Comparison of Innovative Strategies for the Coverage Problem: Path Planning, Search Optimization, and Applications in Underwater Robotics

arXiv.org Artificial Intelligence

In many applications, including underwater robotics, the coverage problem requires an autonomous vehicle to systematically explore a defined area while minimizing redundancy and avoiding obstacles. This paper investigates coverage path planning strategies to enhance the efficiency of underwater gliders, particularly in maximizing the probability of detecting a radioactive source while ensuring safe navigation. We evaluate three path-planning approaches: the Traveling Salesman Problem (TSP), Minimum Spanning Tree (MST), and Optimal Control Problem (OCP). Simulations were conducted in MATLAB, comparing processing time, uncovered areas, path length, and traversal time. Results indicate that OCP is preferable when traversal time is constrained, although it incurs significantly higher computational costs. Conversely, MST-based approaches provide faster but less optimal solutions. These findings offer insights into selecting appropriate algorithms based on mission priorities, balancing efficiency and computational feasibility.


Analysis and Optimization of Probabilities of Beneficial Mutation and Crossover Recombination in a Hamming Space

arXiv.org Artificial Intelligence

Inspired by Fisher's geometric approach to study beneficial mutations, we analyse probabilities of beneficial mutation and crossover recombination of strings in a general Hamming space with arbitrary finite alphabet. Mutations and recombinations that reduce the distance to an optimum are considered as beneficial. Geometric and combinatorial analysis is used to derive closed-form expressions for transition probabilities between spheres around an optimum giving a complete description of Markov evolution of distances from an optimum over multiple generations. This paves the way for optimization of parameters of mutation and recombination operators. Here we derive optimality conditions for mutation and recombination radii maximizing the probabilities of mutation and crossover into the optimum. The analysis highlights important differences between these evolutionary operators. While mutation can potentially reach any part of the search space, the probability of beneficial mutation decreases with distance to an optimum, and the optimal mutation radius or rate should also decrease resulting in a slow-down of evolution near the optimum. Crossover recombination, on the other hand, acts in a subspace of the search space defined by the current population of strings. However, probabilities of beneficial and deleterious crossover are balanced, and their characteristics, such as variance, are translation invariant in a Hamming space, suggesting that recombination may complement mutation and boost the rate of evolution near the optimum.


Solving the Job Shop Scheduling Problem with Graph Neural Networks: A Customizable Reinforcement Learning Environment

arXiv.org Artificial Intelligence

The job shop scheduling problem is an NP-hard combinatorial optimization problem relevant to manufacturing and timetabling. Traditional approaches use priority dispatching rules based on simple heuristics. Recent work has attempted to replace these with deep learning models, particularly graph neural networks (GNNs), that learn to assign priorities from data. However, training such models requires customizing numerous factors: graph representation, node features, action space, and reward functions. The lack of modular libraries for experimentation makes this research time-consuming. This work introduces JobShopLib, a modular library that allows customizing these factors and creating new components with its reinforcement learning environment. We trained several dispatchers through imitation learning to demonstrate the environment's utility. One model outperformed various graph-based dispatchers using only individual operation features, highlighting the importance of feature customization. Our GNN model achieved near state-of-the-art results on large-scale problems. These results suggest significant room for improvement in developing such models. JobShopLib provides the necessary tools for future experimentation.