tardiness
Discovering Heuristics with Large Language Models (LLMs) for Mixed-Integer Programs: Single-Machine Scheduling
Çetinkaya, İbrahim Oğuz, Büyüktahtakın, İ. Esra, Shojaee, Parshin, Reddy, Chandan K.
Our study contributes to the scheduling and combinatorial optimization literature with new heuristics discovered by leveraging the power of Large Language Models (LLMs). We focus on the single-machine total tardiness (SMTT) problem, which aims to minimize total tardiness by sequencing n jobs on a single processor without preemption, given processing times and due dates. We develop and benchmark two novel LLM-discovered heuristics, the EDD Challenger (EDDC) and MDD Challenger (MDDC), inspired by the well-known Earliest Due Date (EDD) and Modified Due Date (MDD) rules. In contrast to prior studies that employed simpler rule-based heuristics, we evaluate our LLM-discovered algorithms using rigorous criteria, including optimality gaps and solution time derived from a mixed-integer programming (MIP) formulation of SMTT. We compare their performance against state-of-the-art heuristics and exact methods across various job sizes (20, 100, 200, and 500 jobs). For instances with more than 100 jobs, exact methods such as MIP and dynamic programming become computationally intractable. Up to 500 jobs, EDDC improves upon the classic EDD rule and another widely used algorithm in the literature. MDDC consistently outperforms traditional heuristics and remains competitive with exact approaches, particularly on larger and more complex instances. This study shows that human-LLM collaboration can produce scalable, high-performing heuristics for NP-hard constrained combinatorial optimization, even under limited resources when effectively configured.
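For reference, the two classic rules that inspired EDDC and MDDC can be sketched in a few lines. This is a minimal illustration of the textbook EDD and MDD rules only, not the LLM-discovered heuristics from the paper. The job representation (a `(processing_time, due_date)` tuple), the function names, and the tie-breaking by job index are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Assumed representation: each job is (processing_time, due_date).
Job = Tuple[int, int]


def total_tardiness(jobs: Sequence[Job], order: Sequence[int]) -> int:
    """Sum of max(0, completion_time - due_date) over the given sequence."""
    t = 0
    total = 0
    for j in order:
        p, d = jobs[j]
        t += p  # no preemption, no idle time: the job completes at t
        total += max(0, t - d)
    return total


def edd_order(jobs: Sequence[Job]) -> List[int]:
    """Earliest Due Date: sort jobs by due date (ties broken by index, an assumption)."""
    return sorted(range(len(jobs)), key=lambda j: (jobs[j][1], j))


def mdd_order(jobs: Sequence[Job]) -> List[int]:
    """Modified Due Date: at time t, pick the job minimising max(d_j, t + p_j)."""
    remaining = set(range(len(jobs)))
    t = 0
    order: List[int] = []
    while remaining:
        # Ties broken by job index (an assumption for determinism).
        j = min(remaining, key=lambda k: (max(jobs[k][1], t + jobs[k][0]), k))
        order.append(j)
        remaining.remove(j)
        t += jobs[j][0]
    return order


if __name__ == "__main__":
    jobs = [(4, 5), (1, 6), (3, 4)]
    for name, rule in (("EDD", edd_order), ("MDD", mdd_order)):
        order = rule(jobs)
        print(name, order, total_tardiness(jobs, order))
```

On this three-job instance, EDD yields the sequence [2, 0, 1] with total tardiness 4, while MDD yields [2, 1, 0] with total tardiness 3. The example shows how MDD's time-dependent priority can improve on a static due-date sort.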
Algorithms for dynamic scheduling in manufacturing, towards digital factories: Improving Deadline Feasibility and Responsiveness via Temporal Networks
Modern manufacturing systems must meet hard delivery deadlines while coping with stochastic task durations caused by process noise, equipment variability, and human intervention. Traditional deterministic schedules break down when reality deviates from nominal plans, triggering costly last-minute repairs. This thesis combines offline constraint-programming (CP) optimisation with online temporal-network execution to create schedules that remain feasible under worst-case uncertainty. First, we build a CP model of the flexible job-shop with per-job deadline tasks and insert an optimal buffer $\Delta^*$ to obtain a fully proactive baseline. We then translate the resulting plan into a Simple Temporal Network with Uncertainty (STNU) and verify dynamic controllability, which guarantees that a real-time dispatcher can retime activities for every bounded duration realisation without violating resource or deadline constraints. Extensive Monte-Carlo simulations on the open Kacem 1–4 benchmark suite show that our hybrid approach eliminates 100% of deadline violations observed in state-of-the-art meta-heuristic schedules, while adding only 3–5% makespan overhead. Scalability experiments confirm that CP solve-times and STNU checks remain sub-second on medium-size instances. The work demonstrates how temporal-network reasoning can bridge the gap between proactive buffering and dynamic robustness, moving industry a step closer to truly digital, self-correcting factories.
EvoSpeak: Large Language Models for Interpretable Genetic Programming-Evolved Heuristics
Xu, Meng, Liu, Jiao, Ong, Yew Soon
Abstract: Genetic programming (GP) has demonstrated strong effectiveness in evolving tree-structured heuristics for complex optimization problems. Yet, in dynamic and large-scale scenarios, the most effective heuristics are often highly complex, hindering interpretability, slowing convergence, and limiting transferability across tasks. To address these challenges, we present EvoSpeak, a novel framework that integrates GP with large language models (LLMs) to enhance the efficiency, transparency, and adaptability of heuristic evolution. EvoSpeak learns from high-quality GP heuristics, extracts knowledge, and leverages this knowledge to (i) generate warm-start populations that accelerate convergence, (ii) translate opaque GP trees into concise natural-language explanations that foster interpretability and trust, and (iii) enable knowledge transfer and preference-aware heuristic generation across related tasks. We verify the effectiveness of EvoSpeak through extensive experiments on dynamic flexible job shop scheduling (DFJSS), under both single- and multi-objective formulations. The results demonstrate that EvoSpeak produces more effective heuristics, improves evolutionary efficiency, and delivers human-readable reports that enhance usability.
Heuristics are indispensable tools for solving complex decision-making and optimization problems, with applications spanning scheduling [1], routing [2], and resource allocation [3]. They are designed to provide adaptive, domain-specific solutions that balance solution quality and computational efficiency, enabling practitioners to make near-optimal decisions in real time. Among the diverse methodologies for heuristic design, Genetic Programming (GP) [4] has emerged as a particularly powerful paradigm, capable of evolving interpretable symbolic rules that adapt to different problem structures [5].
GP-generated heuristics often rival, and sometimes surpass, learning-based methods such as neural combinatorial optimization [6], especially in terms of transparency and adaptability. Despite these advantages, the practical deployment of GP-evolved heuristics faces two persistent challenges: complexity and transferability.
Meng Xu is with the Singapore Institute of Manufacturing Technology, Agency for Science, Technology and Research, Singapore (e-mail: xu_meng@simtech.a-star.edu.sg). Jiao Liu is with the College of Computing & Data Science, Nanyang Technological University, Singapore (e-mail: jiao.liu@ntu.edu.sg). Yew Soon Ong is with the College of Computing and Data Science, Nanyang Technological University, and the Centre for Frontier AI Research, Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore (e-mail: asysong@ntu.edu.sg).
DyRo-MCTS: A Robust Monte Carlo Tree Search Approach to Dynamic Job Shop Scheduling
Chen, Ruiqi, Mei, Yi, Zhang, Fangfang, Zhang, Mengjie
Dynamic job shop scheduling, a fundamental combinatorial optimisation problem in various industrial sectors, poses substantial challenges for effective scheduling due to frequent disruptions caused by the arrival of new jobs. State-of-the-art methods employ machine learning to learn scheduling policies offline, enabling rapid responses to dynamic events. However, these offline policies are often imperfect, necessitating the use of planning techniques such as Monte Carlo Tree Search (MCTS) to improve performance at online decision time. The unpredictability of new job arrivals complicates online planning, as decisions based on incomplete problem information are vulnerable to disturbances. To address this issue, we propose the Dynamic Robust MCTS (DyRo-MCTS) approach, which integrates action robustness estimation into MCTS. DyRo-MCTS guides the production environment toward states that not only yield good scheduling outcomes but are also easily adaptable to future job arrivals. Extensive experiments show that DyRo-MCTS significantly improves the performance of offline-learned policies with negligible additional online planning time. Moreover, DyRo-MCTS consistently outperforms vanilla MCTS across various scheduling scenarios. Further analysis reveals that its ability to make robust scheduling decisions leads to long-term, sustainable performance gains under disturbances.
Scalability of Reinforcement Learning Methods for Dispatching in Semiconductor Frontend Fabs: A Comparison of Open-Source Models with Real Industry Datasets
Stöckermann, Patrick, Südfeld, Henning, Immordino, Alessandro, Altenmüller, Thomas, Wegmann, Marc, Gebser, Martin, Schekotihin, Konstantin, Seidel, Georg, Chan, Chew Wye, Zhang, Fei Fei
Benchmark datasets are crucial for evaluating approaches to scheduling or dispatching in the semiconductor industry during the development and deployment phases. However, commonly used benchmark datasets like the Minifab or SMT2020 lack the complex details and constraints found in real-world scenarios. To mitigate this shortcoming, we compare open-source simulation models with a real industry dataset to evaluate how optimization methods scale with different levels of complexity. Specifically, we focus on Reinforcement Learning methods, performing optimization based on policy-gradient and Evolution Strategies. Our research provides insights into the effectiveness of these optimization methods and their applicability to realistic semiconductor frontend fab simulations. We show that our proposed Evolution Strategies-based method scales much better than a comparable policy-gradient-based approach. Moreover, we identify the selection and combination of relevant bottleneck tools for the agent to control as crucial for efficient optimization. For generalization across different loading scenarios and stochastic tool failure patterns, we achieve advantages when utilizing a diverse training dataset. While the overall approach is computationally expensive, it scales well with the number of CPU cores used for training. For the real industry dataset, we achieve an improvement of up to 4% in tardiness and up to 1% in throughput. For the less complex open-source models Minifab and SMT2020, we observe double-digit percentage improvements in tardiness and single-digit percentage improvements in throughput using Evolution Strategies.
Bottleneck Identification in Resource-Constrained Project Scheduling via Constraint Relaxation
Nedbálek, Lukáš, Novák, Antonín
Keywords: scheduling, RCPSP, bottlenecks, constraint relaxation
Abstract: In realistic production scenarios, Advanced Planning and Scheduling (APS) tools often require manual intervention by production planners, as the system works with incomplete information, resulting in suboptimal schedules. Often, the preferable solution is not found just because of the too-restrictive constraints specifying the optimization problem, representing bottlenecks in the schedule. To provide computer-assisted support for decision-making, we aim to automatically identify bottlenecks in the given schedule while linking them to the particular constraints to be relaxed. In this work, we address the problem of reducing the tardiness of a particular project in an obtained schedule in the resource-constrained project scheduling problem by relaxing constraints related to identified bottlenecks. We develop two methods for this purpose. The second method identifies potential improvements in relaxed versions of the problem and proposes targeted relaxations. Surprisingly, the untargeted relaxations result in improvements comparable to the targeted relaxations.
1 INTRODUCTION
In the modern manufacturing industry, Advanced Planning and Scheduling (APS) tools are used to schedule production automatically. However, not all parameters and information are available to the APS systems in practice.
Multi-Agent Deep Q-Network with Layer-based Communication Channel for Autonomous Internal Logistics Vehicle Scheduling in Smart Manufacturing
Feizabadi, Mohammad, Hosseini, Arman, Yahouni, Zakaria
In smart manufacturing, scheduling autonomous internal logistic vehicles is crucial for optimizing operational efficiency. This paper proposes a multi-agent deep Q-network (MADQN) with a layer-based communication channel (LBCC) to address this challenge. The main goals are to minimize total job tardiness, reduce the number of tardy jobs, and lower vehicle energy consumption. The method is evaluated against nine well-known scheduling heuristics, demonstrating its effectiveness in handling dynamic job shop behaviors like job arrivals and workstation unavailabilities. The approach also proves scalable, maintaining performance across different layouts and larger problem instances, highlighting the robustness and adaptability of MADQN with LBCC in smart manufacturing.
Reinforcement Learning as an Improvement Heuristic for Real-World Production Scheduling
Müller, Arthur, Vollenkemper, Lukas
The integration of Reinforcement Learning (RL) with heuristic methods is an emerging trend for solving optimization problems, which leverages RL's ability to learn from the data generated during the search process. One promising approach is to train an RL agent as an improvement heuristic, starting with a suboptimal solution that is iteratively improved by applying small changes. We apply this approach to a real-world multiobjective production scheduling problem. Our approach utilizes a network architecture that includes Transformer encoding to learn the relationships between jobs. Afterwards, a probability matrix is generated from which pairs of jobs are sampled and then swapped to improve the solution. We benchmarked our approach against other heuristics using real data from our industry partner, demonstrating its superior performance.
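The sample-and-swap loop described above can be sketched as follows. This is only an illustration of the general improvement-heuristic pattern. A fixed probability matrix stands in for the output of the learned Transformer-based policy, the cost function is supplied by the caller, and the rule of keeping a swap only when it lowers the cost is an assumption the abstract does not state. All names (`sample_pair`, `improve_by_swaps`, `prob`) are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def sample_pair(prob: Sequence[Sequence[float]], rng: random.Random) -> Tuple[int, int]:
    """Sample a pair of positions (i < j) with probability proportional to prob[i][j]."""
    n = len(prob)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    weights = [prob[i][j] for i, j in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


def improve_by_swaps(
    order: Sequence[int],
    cost: Callable[[Sequence[int]], float],
    prob: Sequence[Sequence[float]],
    steps: int,
    seed: int = 0,
) -> List[int]:
    """Iteratively swap sampled position pairs, keeping only improving swaps (assumed rule)."""
    best = list(order)
    if len(best) < 2:
        return best  # nothing to swap
    rng = random.Random(seed)
    best_cost = cost(best)
    for _ in range(steps):
        i, j = sample_pair(prob, rng)
        candidate = list(best)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best


if __name__ == "__main__":
    # Toy single-objective stand-in: total tardiness of (processing_time, due_date) jobs.
    jobs = [(4, 5), (1, 6), (3, 4), (2, 3)]

    def tardiness(seq: Sequence[int]) -> int:
        t, total = 0, 0
        for k in seq:
            t += jobs[k][0]
            total += max(0, t - jobs[k][1])
        return total

    n = len(jobs)
    uniform = [[1.0] * n for _ in range(n)]  # placeholder for a learned matrix
    start = list(range(n))
    result = improve_by_swaps(start, tardiness, uniform, steps=50)
    print(start, tardiness(start), "->", result, tardiness(result))
```

In a learned setting, the matrix would be recomputed from the current solution at every step. The uniform matrix here reduces the loop to a plain randomized pairwise-swap local search.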
Online Multi-Agent Pickup and Delivery with Task Deadlines
Managing delivery deadlines in automated warehouses and factories is crucial for maintaining customer satisfaction and ensuring seamless production. This study introduces the problem of online multi-agent pickup and delivery with task deadlines (MAPD-D), which is an advanced variant of the online MAPD problem incorporating delivery deadlines. MAPD-D presents a dynamic deadline-driven approach that includes task deadlines, with tasks being added at any time (online), thus challenging conventional MAPD frameworks. To tackle MAPD-D, we propose a novel algorithm named deadline-aware token passing (D-TP). The D-TP algorithm is designed to calculate pickup deadlines and assign tasks while balancing execution cost and deadline proximity. Additionally, we introduce the D-TP with task swaps (D-TPTS) method to further reduce task tardiness, enhancing flexibility and efficiency via task-swapping strategies. Numerical experiments were conducted in simulated warehouse environments to showcase the effectiveness of the proposed methods. Both D-TP and D-TPTS demonstrate significant reductions in task tardiness compared to existing methods, thereby contributing to efficient operations in automated warehouses and factories with delivery deadlines.