Energy
A Machine Learning-Based Study on the Synergistic Optimization of Supply Chain Management and Financial Supply Chains from an Economic Perspective
Wang, Hang, Tang, Huijie, Leng, Ningai, Yu, Zhoufan
Based on economic theories and integrated with machine learning technology, this study explores the collaborative model of Supply Chain Management and Financial Supply Chain Management (SCM-FSCM), aiming to solve supply chain issues (efficiency loss, financing constraints, risk transmission) caused by the disconnection of the "three flows" (capital flow, logistics flow, information flow) and to further improve overall economic benefits. First, the study combines Transaction Cost Theory and Information Asymmetry Theory, adopts algorithms such as random forests to process multi-dimensional supply chain data, identifies obstacles to the collaboration of the "three flows", and constructs a data-driven three-dimensional (cost-efficiency-risk) analysis framework. Second, it designs a Financial Supply Chain Management model of "core enterprise credit empowerment + dynamic pledge financing". Based on inventory and order data in Supply Chain Management, it applies Long Short-Term Memory (LSTM) networks to realize demand forecasting, and at the same time uses clustering and regression algorithms to quantify benefit distribution, so as to achieve reasonable allocation of financing costs. In addition, the study combines Game Theory and reinforcement learning to optimize the supply chain inventory-procurement mechanism, adjusting strategies through scenario simulation to solve problems caused by the "bullwhip effect", and integrates accounts receivable financing in Financial Supply Chain Management with credit assessment based on eXtreme Gradient Boosting (XGBoost) to realize rapid monetization of inventory.
AutoGrid AI: Deep Reinforcement Learning Framework for Autonomous Microgrid Management
Guo, Kenny, Eckhert, Nicholas, Chhajer, Krish, Abeykoon, Luthira, Schell, Lorne
We present a deep reinforcement learning-based framework for autonomous microgrid management. Using deep reinforcement learning and time-series forecasting models, we optimize microgrid energy dispatch strategies to minimize costs and maximize the utilization of renewable energy sources such as solar and wind. Our approach integrates the transformer architecture for forecasting of renewable generation and a proximal policy optimization (PPO) agent to make decisions in a simulated environment. Our experimental results demonstrate significant improvements in both energy efficiency and operational resilience when compared to traditional rule-based methods. This work contributes to advancing smart-grid technologies in pursuit of zero-carbon energy systems. Finally, we provide an open-source framework for simulating several microgrid environments.
treeX: Unsupervised Tree Instance Segmentation in Dense Forest Point Clouds
Burmeister, Josafat-Mattias, Tockner, Andreas, Reder, Stefan, Engel, Markus, Richter, Rico, Mund, Jan-Peter, Döllner, Jürgen
Close-range laser scanning provides detailed 3D captures of forest stands but requires efficient software for processing 3D point cloud data and extracting individual trees. Although recent studies have introduced deep learning methods for tree instance segmentation, these approaches require large annotated datasets and substantial computational resources. As a resource-efficient alternative, we present a revised version of the treeX algorithm, an unsupervised method that combines clustering-based stem detection with region growing for crown delineation. While the original treeX algorithm was developed for personal laser scanning (PLS) data, we provide two parameter presets, one for ground-based laser scanning (stationary terrestrial - TLS and PLS), and one for UAV-borne laser scanning (ULS). We evaluated the method on six public datasets (FOR-instance, ForestSemantic, LAUTx, NIBIO MLS, TreeLearn, Wytham Woods) and compared it to six open-source methods (original treeX, treeiso, RayCloudTools, ForAINet, SegmentAnyTree, TreeLearn). Compared to the original treeX algorithm, our revision reduces runtime and improves accuracy, with instance detection F$_1$-score gains of +0.11 to +0.49 for ground-based data. For ULS data, our preset achieves an F$_1$-score of 0.58, whereas the original algorithm fails to segment any correct instances. For TLS and PLS data, our algorithm achieves accuracy similar to recent open-source methods, including deep learning. Given its algorithmic design, we see two main applications for our method: (1) as a resource-efficient alternative to deep learning approaches in scenarios where the data characteristics align with the method design (sufficient stem visibility and point density), and (2) for the semi-automatic generation of labels for deep learning models. To enable broader adoption, we provide an open-source Python implementation in the pointtree package.
Invariant Features for Global Crop Type Classification
Accurately obtaining crop type and its spatial distribution at a global scale is critical for food security, agricultural policy-making, and sustainable development. Remote sensing offers an efficient solution for large-scale crop classification, but the limited availability of reliable ground samples in many regions constrains applicability across geographic areas. To address performance declines under geospatial shifts, this study identifies remote sensing features that are invariant to geographic variation and proposes strategies to enhance cross-regional generalization. We construct CropGlobe, a global crop type dataset with 300,000 pixel-level samples from eight countries across five continents, covering six major food and industrial crops (corn, soybeans, rice, wheat, sugarcane, cotton). With broad geographic coverage, CropGlobe enables a systematic evaluation under cross-country, cross-continent, and cross-hemisphere transfer. We compare the transferability of temporal multi-spectral features (Sentinel-2-based 1D/2D median features and harmonic coefficients) and hyperspectral features (from EMIT). To improve generalization under spectral and phenological shifts, we design CropNet, a lightweight and robust CNN tailored for pixel-level crop classification, coupled with temporal data augmentation (time shift, time scale, and magnitude warping) that simulates realistic cross-regional phenology. Experiments show that 2D median temporal features from Sentinel-2 consistently exhibit the strongest invariance across all transfer scenarios, and augmentation further improves robustness, particularly when training data diversity is limited. Overall, the work identifies more invariant feature representations that enhance geographic transferability and suggests a promising path toward scalable, low-cost crop type applications across globally diverse regions.
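The three temporal augmentations named in the abstract above (time shift, time scale, magnitude warping) can be illustrated on a single vegetation-index time series. This is a minimal sketch under stated assumptions, not the CropNet implementation: the helper names, the linear interpolation, the linear gain ramp used for warping, and the toy `ndvi` values are all hypothetical choices made for illustration.

```python
def resample(series, positions):
    """Linearly interpolate `series` at fractional positions, clamping at the ends."""
    last = len(series) - 1
    out = []
    for p in positions:
        p = min(max(p, 0.0), float(last))
        lo = int(p)
        hi = min(lo + 1, last)
        frac = p - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


def time_shift(series, shift):
    """Delay the curve by `shift` time steps (positive means a later season)."""
    return resample(series, [t - shift for t in range(len(series))])


def time_scale(series, factor):
    """Stretch (factor > 1) or compress (factor < 1) the curve around its start."""
    return resample(series, [t / factor for t in range(len(series))])


def magnitude_warp(series, start_gain, end_gain):
    """Multiply by a gain that ramps linearly over time (assumes len(series) > 1)."""
    n = len(series)
    return [
        v * (start_gain + (end_gain - start_gain) * t / (n - 1))
        for t, v in enumerate(series)
    ]


# Hypothetical six-step growing-season profile for one pixel.
ndvi = [0.2, 0.3, 0.6, 0.8, 0.5, 0.2]

later_season = time_shift(ndvi, 1)
longer_season = time_scale(ndvi, 2)
brighter_late = magnitude_warp(ndvi, 1.0, 1.2)
```

Each function returns a series of the same length as its input, so augmented samples can be fed to a classifier without changing its input shape. In practice the shift, factor, and gains would be drawn at random per training sample.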
Beacon: Post-Training Quantization with Integrated Grid Selection
Quantization is a widely used compression technique for reducing the memory and computation costs of large pre-trained models. A key challenge in per-channel post-training quantization (PTQ) is selecting appropriate scaling factors to replace weight values with values from a scaled integer grid. Existing methods typically fix the scale at the outset via heuristic tuning or grid search. We propose Beacon, a simple and effective algorithm that eliminates the need for such manual tuning. Beacon performs per-channel PTQ directly using an unscaled grid and automatically determines the optimal scaling factors by exploiting the geometry of scalar quantization. It does not rely on back-propagation or large calibration sets. Despite its simplicity and tuning-free nature, Beacon achieves competitive performance compared to state-of-the-art methods, making it a practical solution for efficient model deployment.
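For context, the baseline that the abstract above contrasts against (fixing a per-channel scale by grid search before rounding) can be sketched as follows. This is not the Beacon algorithm, whose details are not given here; the five-point integer grid, the candidate schedule, and the function names are assumptions chosen only to make the idea concrete.

```python
# Hypothetical symmetric integer grid; real grids depend on the target bit width.
GRID = [-2, -1, 0, 1, 2]


def quantize(channel, scale):
    """Round each weight to the nearest point of the scaled grid."""
    return [scale * min(GRID, key=lambda g: abs(w - scale * g)) for w in channel]


def sq_error(channel, scale):
    """Squared reconstruction error of a channel for a given scale."""
    return sum((w - q) ** 2 for w, q in zip(channel, quantize(channel, scale)))


def search_scale(channel, num_candidates=50):
    """Grid-search a per-channel scale between 0 and the max-abs scale."""
    max_abs = max(abs(w) for w in channel)
    if max_abs == 0.0:
        return 1.0  # any scale reproduces an all-zero channel exactly
    top = max_abs / max(GRID)
    candidates = [top * k / num_candidates for k in range(1, num_candidates + 1)]
    return min(candidates, key=lambda s: sq_error(channel, s))


# Two toy weight channels (rows of a weight matrix).
weights = [[0.9, -0.4, 0.1, 0.5], [2.0, -2.0, 1.0, 0.0]]
scales = [search_scale(ch) for ch in weights]
quantized = [quantize(ch, s) for ch, s in zip(weights, scales)]
```

Each channel gets its own scale, which is what "per-channel" refers to. The search costs one full quantization pass per candidate, which is the kind of tuning overhead the abstract says Beacon avoids.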
Vectorized Attention with Learnable Encoding for Quantum Transformer
Guo, Ziqing, Pan, Ziwen, Khan, Alex, Balewski, Jan
Vectorized quantum block encoding provides a way to embed classical data into Hilbert space, offering a pathway for quantum models, such as Quantum Transformers (QT), that replace classical self-attention with quantum circuit simulations to operate more efficiently. Current QTs rely on deep parameterized quantum circuits (PQCs), rendering them vulnerable to QPU noise and thus hindering their practical performance. In this paper, we propose the Vectorized Quantum Transformer (VQT), a model that supports ideal masked attention matrix computation through quantum approximation simulation and efficient training via a vectorized nonlinear quantum encoder, yielding shot-efficient and gradient-free quantum circuit simulation (QCS) and reduced classical sampling overhead. In addition, we demonstrate an accuracy comparison for IBM and IonQ in quantum circuit simulation and competitive results in benchmarking natural language processing tasks on IBM's state-of-the-art, high-fidelity Kingston QPU. Our noisy intermediate-scale quantum (NISQ)-friendly VQT approach unlocks a novel architecture for end-to-end machine learning in quantum computing.
Recursive Reward Aggregation
Tang, Yuting, Zhang, Yivan, Ackermann, Johannes, Zhang, Yu-Jie, Nishimori, Soichiro, Sugiyama, Masashi
In reinforcement learning (RL), aligning agent behavior with specific objectives typically requires careful design of the reward function, which can be challenging when the desired objectives are complex. In this work, we propose an alternative approach for flexible behavior alignment that eliminates the need to modify the reward function by selecting appropriate reward aggregation functions. By introducing an algebraic perspective on Markov decision processes (MDPs), we show that the Bellman equations naturally emerge from the recursive generation and aggregation of rewards, allowing for the generalization of the standard discounted sum to other recursive aggregations, such as discounted max and Sharpe ratio. Our approach applies to both deterministic and stochastic settings and integrates seamlessly with value-based and actor-critic algorithms. Experimental results demonstrate that our approach effectively optimizes diverse objectives, highlighting its versatility and potential for real-world applications.
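The recursive view described above can be sketched on a fixed reward sequence: a return is built by folding rewards from the end of the trajectory backwards, and swapping the update step changes the objective. This is a minimal sketch assuming a deterministic trajectory; the function names, the discount of 0.5, and the specific form `max(r, gamma * v)` for the discounted max are illustrative assumptions rather than the paper's definitions.

```python
GAMMA = 0.5  # hypothetical discount factor


def aggregate(rewards, init, step):
    """Fold rewards right to left: value_t = step(r_t, value_{t+1})."""
    value = init
    for r in reversed(rewards):
        value = step(r, value)
    return value


def discounted_sum(rewards):
    """Standard return: r_t + gamma * value_{t+1}."""
    return aggregate(rewards, 0.0, lambda r, v: r + GAMMA * v)


def discounted_max(rewards):
    """Alternative aggregation: max(r_t, gamma * value_{t+1})."""
    return aggregate(rewards, float("-inf"), lambda r, v: max(r, GAMMA * v))


rewards = [1.0, 4.0, 2.0]
print(discounted_sum(rewards))  # 1 + 0.5*4 + 0.25*2 = 3.5
print(discounted_max(rewards))  # max(1, 0.5*4, 0.25*2) = 2.0
```

Both objectives share the same recursion and differ only in the `step` function and its initial value, which is why a Bellman-style update carries over when the sum is replaced by another recursive aggregation.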
ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes
Chen, Zeyuan, Yan, Qiyang, Chen, Yuanpei, Wu, Tianhao, Zhang, Jiyao, Ding, Zihan, Li, Jinzhou, Yang, Yaodong, Dong, Hao
Dexterous grasping in cluttered scenes presents significant challenges due to diverse object geometries, occlusions, and potential collisions. Existing methods primarily focus on single-object grasping or grasp-pose prediction without interaction, which are insufficient for complex, cluttered scenes. Recent vision-language-action models offer a potential solution but require extensive real-world demonstrations, making them costly and difficult to scale. To address these limitations, we revisit the sim-to-real transfer pipeline and develop key techniques that enable zero-shot deployment in reality while maintaining robust generalization. We propose ClutterDexGrasp, a two-stage teacher-student framework for closed-loop target-oriented dexterous grasping in cluttered scenes. The framework features a teacher policy trained in simulation using clutter density curriculum learning, incorporating both a geometry and spatially-embedded scene representation and a novel comprehensive safety curriculum, enabling general, dynamic, and safe grasping behaviors. Through imitation learning, we distill the teacher's knowledge into a student 3D diffusion policy (DP3) that operates on partial point cloud observations. To the best of our knowledge, this represents the first zero-shot sim-to-real closed-loop system for target-oriented dexterous grasping in cluttered scenes, demonstrating robust performance across diverse objects and layouts. More details and videos are available at https://clutterdexgrasp.github.io/.
MiniCPM4: Ultra-Efficient LLMs on End Devices
MiniCPM Team, Xiao, Chaojun, Li, Yuxuan, Han, Xu, Bai, Yuzhuo, Cai, Jie, Chen, Haotian, Chen, Wentong, Cong, Xin, Cui, Ganqu, Ding, Ning, Fan, Shengda, Fang, Yewei, Fu, Zixuan, Guan, Wenyu, Guan, Yitong, Guo, Junshao, Han, Yufeng, He, Bingxiang, Huang, Yuxiang, Ji, Baoxi, Kong, Cunliang, Li, Qiuzuo, Li, Siyuan, Li, Wenhao, Li, Xin, Li, Yanghao, Li, Yishan, Li, Zhen, Liu, Dan, Lin, Biyuan, Lin, Yankai, Long, Xiang, Lu, Quanyu, Lu, Yaxi, Luo, Peiyan, Lyu, Hongya, Ou, Litu, Pan, Yinxu, Pu, Lushi, Qu, Zekai, Shi, Qundong, Song, Zijun, Su, Jiayuan, Su, Zhou, Sun, Ao, Sun, Xianghui, Tang, Peijun, Wang, Fangzheng, Wang, Feng, Wang, Shuo, Wang, Yudong, Wang, Zheng, Wu, Yesai, Xiao, Zhenyu, Xie, Jie, Xie, Zihao, Xu, Xiaoyue, Yan, Yukun, Yuan, Jiarui, Zhang, Jinqian, Zhang, Kaihuo, Zhang, Lei, Zhang, Linyue, Zhang, Xueren, Zhang, Yudi, Zhao, Hengyu, Zhao, Weilin, Zhao, Weilun, Zhao, Yuanqian, Zheng, Zhi, Zhou, Chuyue, Zhou, Ge, Zhou, Jie, Zhou, Wei, Zhou, Yanghao, Zhou, Zihan, Zhou, Zixuan, Liu, Zhiyuan, Zeng, Guoyang, Jia, Chao, Li, Dahai, Sun, Maosong
This paper introduces MiniCPM4, a highly efficient large language model (LLM) designed explicitly for end-side devices. We achieve this efficiency through systematic innovation in four key dimensions: model architecture, training data, training algorithms, and inference systems. Specifically, in terms of model architecture, we propose InfLLM v2, a trainable sparse attention mechanism that accelerates both prefilling and decoding phases for long-context processing. Regarding training data, we propose UltraClean, an efficient and accurate pre-training data filtering and generation strategy, and UltraChat v2, a comprehensive supervised fine-tuning dataset. These datasets enable satisfactory model performance to be achieved using just 8 trillion training tokens. Regarding training algorithms, we propose ModelTunnel v2 for efficient pre-training strategy search, and improve existing post-training methods by introducing chunk-wise rollout for load-balanced reinforcement learning and a data-efficient ternary LLM, BitCPM. Regarding inference systems, we propose CPM.cu, which integrates sparse attention, model quantization, and speculative sampling to achieve efficient prefilling and decoding. To meet diverse on-device requirements, MiniCPM4 is available in two versions, with 0.5B and 8B parameters, respectively. Furthermore, we construct a hybrid reasoning model, MiniCPM4.1, which can be used in both deep reasoning mode and non-reasoning mode. Evaluation results demonstrate that MiniCPM4 and MiniCPM4.1 outperform similar-sized open-source models across benchmarks, with the 8B variants showing significant speed improvements on long sequence understanding and generation.
Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling
Jadhav, Prachi, Jin, Hongwei, Deelman, Ewa, Balaprakash, Prasanna
High-Performance Computing (HPC) job scheduling involves balancing conflicting objectives such as minimizing makespan, reducing wait times, optimizing resource use, and ensuring fairness. Traditional methods, including heuristics such as First-Come-First-Served (FCFS) and Shortest Job First (SJF) as well as intensive optimization techniques, often lack adaptability to dynamic workloads and, more importantly, cannot simultaneously optimize multiple objectives in HPC systems. To address this, we propose a novel Large Language Model (LLM)-based scheduler using a ReAct-style framework (Reason + Act), enabling iterative, interpretable decision-making. The system incorporates a scratchpad memory to track scheduling history and refine decisions via natural language feedback, while a constraint enforcement module ensures feasibility and safety. We evaluate our approach using OpenAI's O4-Mini and Anthropic's Claude 3.7 across seven real-world HPC workload scenarios, including heterogeneous mixes, bursty patterns, and adversarial cases. Comparisons against FCFS, SJF, and Google OR-Tools (on 10 to 100 jobs) reveal that LLM-based scheduling effectively balances multiple objectives while offering transparent reasoning through natural language traces. The method excels in constraint satisfaction and adapts to diverse workloads without domain-specific training. However, a trade-off between reasoning quality and computational overhead challenges real-time deployment. This work presents the first comprehensive study of reasoning-capable LLMs for HPC scheduling, demonstrating their potential to handle multiobjective optimization while highlighting limitations in computational efficiency. The findings provide insights into leveraging advanced language models for complex scheduling problems in dynamic HPC environments.
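The two heuristic baselines named above, First-Come-First-Served and Shortest Job First, can be compared on a toy queue to show why a single heuristic tends to favor one objective. This is a minimal sketch under simplifying assumptions that are not taken from the paper: all jobs arrive at time zero, they run one at a time on a single node, and the `Job` class and runtimes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    runtime: int  # hypothetical runtime in minutes


def fcfs(jobs):
    """First-Come-First-Served: keep submission order."""
    return list(jobs)


def sjf(jobs):
    """Shortest Job First: run the shortest jobs first."""
    return sorted(jobs, key=lambda job: job.runtime)


def average_wait(schedule):
    """Mean time each job spends waiting before it starts."""
    elapsed = 0
    total_wait = 0
    for job in schedule:
        total_wait += elapsed
        elapsed += job.runtime
    return total_wait / len(schedule)


def makespan(schedule):
    """Time at which the last job finishes."""
    return sum(job.runtime for job in schedule)


queue = [Job("a", 8), Job("b", 2), Job("c", 4)]
print(average_wait(fcfs(queue)), makespan(fcfs(queue)))
print(average_wait(sjf(queue)), makespan(sjf(queue)))
```

In this setting SJF lowers the average wait while the makespan is identical for both orderings, and the long job "a" is pushed to the back of the queue. That pattern hints at the tension between wait time and fairness that a multiobjective scheduler has to balance.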