Mining-Gym: A Configurable RL Benchmarking Environment for Truck Dispatch Scheduling

Banerjee, Chayan, Nguyen, Kien, Fookes, Clinton

arXiv.org Artificial Intelligence 

Mining process optimization, particularly truck dispatch scheduling, is a critical factor in enhancing the efficiency of open-pit mining operations. However, the dynamic and stochastic nature of mining environments, characterized by uncertainties such as equipment failures, truck maintenance, and variable haul cycle times, poses significant challenges for traditional optimization methods. While Reinforcement Learning (RL) has demonstrated promise in adaptive decision-making for mining logistics, its practical deployment requires rigorous evaluation in realistic and customizable simulation environments. To address this challenge, we introduce Mining-Gym, a configurable, open-source benchmarking environment designed for training, testing, and comparing RL algorithms in mining process optimization. Built on Discrete Event Simulation (DES) and integrated with the OpenAI Gym interface, Mining-Gym offers a structured testbed that enables the direct application of advanced RL algorithms from Stable Baselines. The framework models key mining-specific uncertainties, such as equipment failures, queue congestion, and the stochasticity of mining processes, providing a realistic learning environment. Additionally, a graphical user interface (GUI) for selecting mine-site configuration parameters, a comprehensive data logging system, a built-in KPI dashboard, and a real-time representative visualization of the mine site enable in-depth performance analysis, facilitating standardized, reproducible evaluation across multiple RL strategies and baseline heuristics.

Mining process optimization aims to enhance efficiency and productivity by improving resource allocation, equipment scheduling, and material handling. However, these operations are highly complex, influenced by dynamic factors such as equipment failures, fluctuating ore quality, and unpredictable environmental conditions.
Traditional optimization methods, such as linear programming and heuristics, struggle to adapt in real time, leading to inefficiencies and increased costs.
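
To make the setting concrete, the following is a minimal sketch of how a truck dispatch problem can be framed as a discrete-event simulation behind a Gym-style `reset`/`step` interface. It is an illustration only and is not Mining-Gym's actual API: the class `ToyDispatchEnv`, the helper names, the observation and reward definitions, the 5% delay probability, and all time distributions are assumptions made up for this example. The sketch uses only the Python standard library and the classic four-value `step` return.

```python
import heapq
import random


class ToyDispatchEnv:
    """Hypothetical Gym-style truck dispatch toy driven by an event queue.

    Not Mining-Gym's API. All parameters and distributions are invented.
    """

    def __init__(self, n_trucks=4, n_shovels=2, shift_minutes=480.0, seed=0):
        self.n_trucks = n_trucks
        self.n_shovels = n_shovels
        self.shift_minutes = shift_minutes
        self.rng = random.Random(seed)

    def reset(self):
        self.delivered = 0
        self.shovel_free_at = [0.0] * self.n_shovels
        # Events are (time, truck_id, completes_cycle): the moment a truck
        # needs its next dispatch decision.
        self.events = [(0.0, t, False) for t in range(self.n_trucks)]
        heapq.heapify(self.events)
        self._next_decision()
        return self._obs()

    def _next_decision(self):
        # Advance the simulation clock to the earliest pending event.
        self.now, self.truck, completes_cycle = heapq.heappop(self.events)
        if completes_cycle and self.now <= self.shift_minutes:
            self.delivered += 1

    def _obs(self):
        # Observation: minutes until each shovel is expected to be free.
        return [max(0.0, t - self.now) for t in self.shovel_free_at]

    def step(self, action):
        # action: index of the shovel the current truck is sent to.
        arrive = self.now + self.rng.uniform(4.0, 8.0)      # empty travel
        start = max(arrive, self.shovel_free_at[action])    # wait in queue
        load = self.rng.uniform(2.0, 4.0)
        if self.rng.random() < 0.05:                        # assumed failure
            load += self.rng.uniform(10.0, 30.0)
        self.shovel_free_at[action] = start + load
        haul = self.rng.uniform(6.0, 12.0)                  # loaded haul
        heapq.heappush(self.events, (start + load + haul, self.truck, True))
        reward = -(start - arrive)                          # penalize queuing
        self._next_decision()
        done = self.now >= self.shift_minutes
        return self._obs(), reward, done, {"delivered": self.delivered}


def shortest_wait(obs):
    """Baseline heuristic: pick the shovel that frees up soonest."""
    return min(range(len(obs)), key=obs.__getitem__)


def run_episode(env, policy):
    """Roll out one shift and return (total reward, loads delivered)."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total_reward += reward
    return total_reward, info["delivered"]


total, loads = run_episode(ToyDispatchEnv(seed=1), shortest_wait)
print(f"total queue penalty: {total:.1f}, loads delivered: {loads}")
```

In this sketch a decision is requested each time a truck becomes available, so the agent acts at irregular simulation times rather than on a fixed clock. This is one common way to couple an event-driven simulator to a step-based RL interface. A learned policy would replace `shortest_wait` with a function that maps the observation to a shovel index.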