Goto

Collaborating Authors

 Search


Resource Constrained Pathfinding with Enhanced Bidirectional A* Search

arXiv.org Artificial Intelligence

The classic Resource Constrained Shortest Path (RCSP) problem aims to find a cost optimal path between a pair of nodes in a network such that the resources used in the path are within a given limit. Having been studied for over a decade, RCSP has seen recent solutions that utilize heuristic-guided search to solve the constrained problem faster. Building upon the bidirectional A* search paradigm, this research introduces a novel constrained search framework that uses efficient pruning strategies to allow for accelerated and effective RCSP search in large-scale networks. Results show that, compared to the state of the art, our enhanced framework can significantly reduce the constrained search time, achieving speed-ups of over to two orders of magnitude.


Indirect Query Bayesian Optimization with Integrated Feedback

arXiv.org Artificial Intelligence

We develop the framework of Indirect Query Bayesian Optimization (IQBO), a new class of Bayesian optimization problems where the integrated feedback is given via a conditional expectation of the unknown function $f$ to be optimized. The underlying conditional distribution can be unknown and learned from data. The goal is to find the global optimum of $f$ by adaptively querying and observing in the space transformed by the conditional distribution. This is motivated by real-world applications where one cannot access direct feedback due to privacy, hardware or computational constraints. We propose the Conditional Max-Value Entropy Search (CMES) acquisition function to address this novel setting, and propose a hierarchical search algorithm to address the multi-resolution setting and improve the computational efficiency. We show regret bounds for our proposed methods and demonstrate the effectiveness of our approaches on simulated optimization tasks.


Threshold UCT: Cost-Constrained Monte Carlo Tree Search with Pareto Curves

arXiv.org Artificial Intelligence

Constrained Markov decision processes (CMDPs), in which the agent optimizes expected payoffs while keeping the expected cost below a given threshold, are the leading framework for safe sequential decision making under stochastic uncertainty. Among algorithms for planning and learning in CMDPs, methods based on Monte Carlo tree search (MCTS) have particular importance due to their efficiency and extendibility to more complex frameworks (such as partially observable settings and games). However, current MCTS-based methods for CMDPs either struggle with finding safe (i.e., constraint-satisfying) policies, or are too conservative and do not find valuable policies. We introduce Threshold UCT (T-UCT), an online MCTS-based algorithm for CMDP planning. Unlike previous MCTS-based CMDP planners, T-UCT explicitly estimates Pareto curves of cost-utility trade-offs throughout the search tree, using these together with a novel action selection and threshold update rules to seek safe and valuable policies. Our experiments demonstrate that our approach significantly outperforms state-of-the-art methods from the literature.


IDEQ: an improved diffusion model for the TSP

arXiv.org Artificial Intelligence

Recent years have seen a surge in machine learning models to solve combinatorial optimization (CO) problems. The field of combinatorial optimization is a historical field of research and application in computer science. After decades of progress, very efficient algorithms exist to provide exact solutions or approximate solutions to many CO problems. The Traveling Salesman Problem (TSP) stands as a prominent example of this fact: the TSP is very appealing as it is very simple to understand, it has a wide range of applications, and we are able to solve exactly rather large instances of this problem on a mere laptop (an instance defined over a few thousands cities can be solved within one hour), and we have approximate algorithms that are able to find tours that are very close to optimality (LKH3 [4] can solve instances of 40,000 cities in about one hour on a laptop but we have no guarantee that the result is optimal). These facts can not be forgotten when we try to propose alternate approaches to solve the TSP. Because of its appeal, the TSP has also drawn the attention of researchers in deep neural networks in the recent years. If the first attempts had difficulties solving even small TSP instances of a dozen cities, progress has been made. In this paper, we build on these previous works and go a step further.


Python Agent in Ludii

arXiv.org Artificial Intelligence

Ludii is a Java general game system with a considerable number of board games, with an API for developing new agents and a game description language to create new games. To improve versatility and ease development, we provide Python interfaces for agent programming. This allows the use of Python modules to implement general game playing agents. As a means of enabling Python for creating Ludii agents, the interfaces are implemented using different Java libraries: jpy and Py4J. The main goal of this work is to determine which version is faster. To do so, we conducted a performance analysis of two different GGP algorithms, Minimax adapted to GGP and MCTS. The analysis was performed across several combinatorial games with varying depth, branching factor, and ply time. For reproducibility, we provide tutorials and repositories. Our analysis includes predictive models using regression, which suggest that jpy is faster than Py4J, however slower than a native Java Ludii agent, as expected.


Neural Combinatorial Optimization for Stochastic Flexible Job Shop Scheduling Problems

arXiv.org Artificial Intelligence

Neural combinatorial optimization (NCO) has gained significant attention due to the potential of deep learning to efficiently solve combinatorial optimization problems. NCO has been widely applied to job shop scheduling problems (JSPs) with the current focus predominantly on deterministic problems. In this paper, we propose a novel attention-based scenario processing module (SPM) to extend NCO methods for solving stochastic JSPs. Our approach explicitly incorporates stochastic information by an attention mechanism that captures the embedding of sampled scenarios (i.e., an approximation of stochasticity). Fed with the embedding, the base neural network is intervened by the attended scenarios, which accordingly learns an effective policy under stochasticity. We also propose a training paradigm that works harmoniously with either the expected makespan or Value-at-Risk objective. Results demonstrate that our approach outperforms existing learning and non-learning methods for the flexible JSP problem with stochastic processing times on a variety of instances. In addition, our approach holds significant generalizability to varied numbers of scenarios and disparate distributions.


AI-Powered Algorithm-Centric Quantum Processor Topology Design

arXiv.org Artificial Intelligence

Quantum computing promises to revolutionize various fields, yet the execution of quantum programs necessitates an effective compilation process. This involves strategically mapping quantum circuits onto the physical qubits of a quantum processor. The qubits' arrangement, or topology, is pivotal to the circuit's performance, a factor that often defies traditional heuristic or manual optimization methods due to its complexity. In this study, we introduce a novel approach leveraging reinforcement learning to dynamically tailor qubit topologies to the unique specifications of individual quantum circuits, guiding algorithm-driven quantum processor topology design for reducing the depth of mapped circuit, which is particularly critical for the output accuracy on noisy quantum processors. Our method marks a significant departure from previous methods that have been constrained to mapping circuits onto a fixed processor topology. Experiments demonstrate that we have achieved notable enhancements in circuit performance, with a minimum of 20\% reduction in circuit depth in 60\% of the cases examined, and a maximum enhancement of up to 46\%. Furthermore, the pronounced benefits of our approach in reducing circuit depth become increasingly evident as the scale of the quantum circuits increases, exhibiting the scalability of our method in terms of problem size. This work advances the co-design of quantum processor architecture and algorithm mapping, offering a promising avenue for future research and development in the field.


Mastering AI: Big Data, Deep Learning, and the Evolution of Large Language Models -- AutoML from Basics to State-of-the-Art Techniques

arXiv.org Artificial Intelligence

In recent years, Artificial Intelligence (AI) and Machine Learning (ML) have grown tremendously in popularity across various industries. From healthcare and finance to retail and automotive, adopting machine learning models has led to significant advancements[1]. However, building machine learning models traditionally requires deep knowledge in multiple areas, such as data preprocessing, feature engineering, model selection, hyperparameter tuning, and evaluation[2]. For many beginners and even experienced practitioners, this process can be time-consuming and technically challenging. This is where AutoML (Automated Machine Learning) comes in. AutoML simplifies the process of building machine learning models by automating many of the steps that would otherwise require manual intervention [3]. AutoML tools can automatically preprocess data, select the most suitable algorithms, and fine-tune hyperparameters to produce highly accurate models [4]. This automation not only speeds up the model development cycle but also allows users without deep knowledge of machine learning to create models with comparable performance to those made by experienced data scientists.


Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

arXiv.org Artificial Intelligence

OpenAI o1 represents a significant milestone in Artificial Inteiligence, which achieves expert-level performances on many challanging tasks that require strong reasoning ability.OpenAI has claimed that the main techinique behinds o1 is the reinforcement learining. Recent works use alternative approaches like knowledge distillation to imitate o1's reasoning style, but their effectiveness is limited by the capability ceiling of the teacher model. Therefore, this paper analyzes the roadmap to achieving o1 from the perspective of reinforcement learning, focusing on four key components: policy initialization, reward design, search, and learning. Policy initialization enables models to develop human-like reasoning behaviors, equipping them with the ability to effectively explore solution spaces for complex problems. Reward design provides dense and effective signals via reward shaping or reward modeling, which is the guidance for both search and learning. Search plays a crucial role in generating high-quality solutions during both training and testing phases, which can produce better solutions with more computation. Learning utilizes the data generated by search for improving policy, which can achieve the better performance with more parameters and more searched data. Existing open-source projects that attempt to reproduce o1 can be seem as a part or a variant of our roadmap. Collectively, these components underscore how learning and search drive o1's advancement, making meaningful contributions to the development of LLM.


Multi-task Representation Learning for Mixed Integer Linear Programming

arXiv.org Artificial Intelligence

Mixed Integer Linear Programs (MILPs) are highly flexible and powerful tools for modeling and solving complex real-world combinatorial optimization problems. Recently, machine learning (ML)-guided approaches have demonstrated significant potential in improving MILPsolving efficiency. However, these methods typically rely on separate offline data collection and training processes, which limits their scalability and adaptability. This paper introduces the first multi-task learning framework for ML-guided MILP solving. The proposed framework provides MILP embeddings helpful in guiding MILP solving across solvers (e.g., Gurobi and SCIP) and across tasks (e.g., Branching and Solver configuration). Through extensive experiments on three widely used MILP benchmarks, we demonstrate that our multi-task learning model performs similarly to specialized models within the same distribution. Moreover, it significantly outperforms them in generalization across problem sizes and tasks. Keywords: Deep Learning Mixed Integer Linear Programming Multitask Learning Graph Neural Networks.