 downtime


A Control-Theoretic Approach to Dynamic Payment Routing for Success Rate Optimization

Agrawal, Aniket, Patil, Harsharanga

arXiv.org Artificial Intelligence

This paper introduces a control-theoretic framework for dynamic payment routing, implemented within JUSPAY's Payment Orchestrator to maximize transaction success rate. The routing system is modeled as a closed-loop feedback controller that continuously senses gateway performance [3], computes corrective actions, and dynamically routes transactions across gateways to ensure operational resilience. The system leverages concepts from control theory, reinforcement learning, and multi-armed bandit optimization to achieve both short-term responsiveness and long-term stability. Rather than relying on explicit PID regulation, the framework applies generalized feedback-based adaptation, ensuring that corrective actions remain proportional to observed performance deviations and that the computed gateway score gradually converges toward the success rate [2]. This hybrid approach unifies control theory and adaptive decision systems, enabling self-regulating transaction routing that dampens instability and improves reliability. Live production results show an improvement of up to 1.15% in success rate over traditional rule-based routing, demonstrating the effectiveness of feedback-based control in payment systems.
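
The feedback idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the update rule (a step proportional to the error between the observed outcome and the current score), the epsilon-style exploration, and all names (`update_score`, `choose_gateway`, `gain`, `explore`) are assumptions chosen for illustration.

```python
import random

def update_score(score, success, gain=0.1):
    """Proportional feedback step (assumed rule, not taken from the paper).

    The correction is proportional to the error between the observed
    outcome (1.0 for success, 0.0 for failure) and the current score, so
    repeated updates pull the score toward the gateway's success rate.
    """
    error = (1.0 if success else 0.0) - score
    return score + gain * error

def choose_gateway(scores, rng, explore=0.1):
    """Route to the highest-scoring gateway, exploring occasionally.

    The exploration probability is a hypothetical bandit-style knob that
    keeps the scores of rarely used gateways from going stale.
    """
    if rng.random() < explore:
        return rng.choice(sorted(scores))
    return max(scores, key=scores.get)

def simulate(true_rates, steps=5000, seed=0):
    """Run the closed loop against simulated gateways with fixed success rates."""
    rng = random.Random(seed)
    scores = {name: 0.5 for name in true_rates}  # neutral starting score
    for _ in range(steps):
        gateway = choose_gateway(scores, rng)
        success = rng.random() < true_rates[gateway]
        scores[gateway] = update_score(scores[gateway], success)
    return scores
```

Calling `simulate({"gateway_a": 0.95, "gateway_b": 0.60})` returns one score per gateway. In this sketch a smaller `gain` gives smoother but slower-moving scores, while a larger `gain` reacts faster to outages at the cost of more noise.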


Energy firms snap up weather services for trading edge in Japan

The Japan Times

Power traders are fueling a boom in weather data, which helps them to anticipate sudden price swings. Weather forecasters are finding a lucrative niche in Japan's power-trading boom, selling hyper-specialized data to firms seeking an edge in one of the world's most volatile electricity markets. Weathernews is among a handful of companies cashing in on demand for meteorological data. The Tokyo-listed company's shares have surged 50% in the last year as investors bet on its expanded use of artificial intelligence, among other factors. The firm says it's supplying -- or is in talks to provide -- data to several dozen power traders, about a third of which are based outside Japan.


R-ConstraintBench: Evaluating LLMs on NP-Complete Scheduling

Jain, Raj, Wetter, Marc

arXiv.org Artificial Intelligence

However, the reliability of large language models (LLMs) when reasoning under high-constraint regimes is insufficiently characterized. To address this gap, we present R-ConstraintBench, a scalable framework that evaluates models on Resource-Constrained Project Scheduling Problems (RCPSP), an NP-Complete feasibility class, while difficulty increases via linear growth in constraints. R-ConstraintBench incrementally increases non-redundant precedence constraints in Directed Acyclic Graphs (DAGs) and then introduces downtime, temporal windows, and disjunctive constraints. As an illustrative example, we instantiate the benchmark in a data center migration setting and evaluate multiple LLMs using feasibility and error analysis, identifying degradation thresholds and constraint types most associated with failure. Empirically, strong models are near-ceiling on precedence-only DAGs, but feasibility performance collapses when downtime, temporal windows, and disjunctive constraints interact--implicating constraint interaction, not graph depth, as the principal bottleneck. Performance on clean synthetic ramps also does not guarantee transfer to domain-grounded scenarios, underscoring limited generalization.
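
To make the constraint types named in this abstract concrete, here is a minimal feasibility checker for a proposed schedule. It is only a sketch under simplifying assumptions, not the benchmark's evaluator: it assumes integer time, a single shared resource, and downtime modeled as windows during which no task may run. The function name and its parameters are hypothetical.

```python
def is_feasible(schedule, durations, precedence, demand, capacity, downtime):
    """Check a schedule against precedence, downtime, and capacity constraints.

    schedule:   task -> integer start time
    durations:  task -> integer duration
    precedence: list of (before, after) pairs; `before` must finish first
    demand:     task -> units of the single shared resource it uses
    capacity:   total units of the resource available at any time
    downtime:   list of half-open (start, end) windows when nothing may run
    """
    end = {task: schedule[task] + durations[task] for task in schedule}

    # Precedence: a predecessor must finish before its successor starts.
    for before, after in precedence:
        if end[before] > schedule[after]:
            return False

    # Downtime: no task interval may overlap a downtime window.
    for task in schedule:
        for lo, hi in downtime:
            if schedule[task] < hi and lo < end[task]:
                return False

    # Capacity: at every time step, total demand must fit the resource.
    horizon = max(end.values(), default=0)
    for tick in range(horizon):
        load = sum(demand[t] for t in schedule if schedule[t] <= tick < end[t])
        if load > capacity:
            return False
    return True
```

Checking a candidate schedule is cheap. The hardness that the abstract refers to lies in finding a schedule that passes all checks at once.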


Robot-run store VenHub is changing the future of shopping

FOX News

You walk up to a robot-run convenience store, place your order on an app, and robotic arms quickly grab your items and deliver them to a secure window, all without any human employees. That's exactly what's happening at VenHub, a fully autonomous, AI-powered smart store that just opened at the LAX/Metro Transit Center in Los Angeles. Even if you haven't seen one in person yet, VenHub's cutting-edge tech is set to shake up how people shop all across the country.


TrainMover: Efficient ML Training Live Migration with No Memory Overhead

Lao, ChonLam, Yu, Minlan, Akella, Aditya, Cao, Jiamin, Guan, Yu, Zhang, Pengcheng, Zheng, Zhilong, Xu, Yichi, Zhai, Ennan, Cai, Dennis, Gao, Jiaqi

arXiv.org Artificial Intelligence

Machine learning training has emerged as one of the most prominent workloads in modern data centers. These training jobs are large-scale, long-lasting, and tightly coupled, and are often disrupted by various events in the cluster such as failures, maintenance, and job scheduling. To handle these events, we rely on cold migration, where we first checkpoint the entire cluster, replace the related machines, and then restart the training. This approach leads to disruptions to the training jobs, resulting in significant downtime. In this paper, we present TrainMover, a live migration system that enables machine replacement during machine learning training. TrainMover minimizes downtime by leveraging member replacement of collective communication groups and sandbox lazy initialization. Our evaluation demonstrates that TrainMover achieves 16x less downtime compared to all baselines, effectively handling data center events like straggler rebalancing, maintenance, and unexpected failures.


Llumnix: Dynamic Scheduling for Large Language Model Serving

Sun, Biao, Huang, Ziming, Zhao, Hanyu, Xiao, Wencong, Zhang, Xinyi, Li, Yong, Lin, Wei

arXiv.org Artificial Intelligence

Inference serving for large language models (LLMs) is the key to unleashing their potential in people's daily lives. However, efficient LLM serving remains challenging today because the requests are inherently heterogeneous and unpredictable in terms of resource and latency requirements, as a result of the diverse applications and the dynamic execution nature of LLMs. Existing systems are fundamentally limited in handling these characteristics and cause problems such as severe queuing delays, poor tail latencies, and SLO violations. We introduce Llumnix, an LLM serving system that reacts to such heterogeneous and unpredictable requests by runtime rescheduling across multiple model instances. Similar to context switching across CPU cores in modern operating systems, Llumnix reschedules requests to improve load balancing and isolation, mitigate resource fragmentation, and differentiate request priorities and SLOs. Llumnix implements the rescheduling with an efficient and scalable live migration mechanism for requests and their in-memory states, and exploits it in a dynamic scheduling policy that unifies the multiple rescheduling scenarios elegantly. Our evaluations show that Llumnix improves tail latencies by an order of magnitude, accelerates high-priority requests by up to 1.5x, and delivers up to 36% cost savings while achieving similar tail latencies, compared against state-of-the-art LLM serving systems. Llumnix is publicly available at https://github.com/AlibabaPAI/llumnix.


Diagnostic Digital Twin for Anomaly Detection in Floating Offshore Wind Energy

Stadtmann, Florian, Rasheed, Adil

arXiv.org Artificial Intelligence

The demand for condition-based and predictive maintenance is rising across industries, especially for remote, high-value, and high-risk assets. In this article, the diagnostic digital twin concept is introduced, discussed, and implemented for a floating offshore turbine. A diagnostic digital twin is a virtual representation of an asset that combines real-time data and models to monitor damage, detect anomalies, and diagnose failures, thereby enabling condition-based and predictive maintenance. By applying diagnostic digital twins to offshore assets, unexpected failures can be alleviated, but the implementation can prove challenging. Here, a diagnostic digital twin is implemented for an operational floating offshore wind turbine. The asset is monitored through measurements. Unsupervised learning methods are employed to build a normal operation model, detect anomalies, and provide a fault diagnosis. Warnings and diagnoses are sent through text messages, and a more detailed diagnosis can be accessed in a virtual reality interface. The diagnostic digital twin successfully detected an anomaly with high confidence hours before a failure occurred. The paper concludes by discussing diagnostic digital twins in the broader context of offshore engineering. The presented approach can be generalized to other offshore assets to improve maintenance and increase the lifetime, efficiency, and sustainability of offshore assets.


Automated Anomaly Detection on European XFEL Klystrons

Sulc, Antonin, Eichler, Annika, Wilksen, Tim

arXiv.org Artificial Intelligence

High-power multi-beam klystrons represent a key component to amplify RF to generate the accelerating field of the superconducting radio frequency (SRF) cavities at European XFEL. Exchanging these high-power components takes time and effort, thus it is necessary to minimize maintenance and downtime and at the same time maximize the device's operation. In an attempt to explore the behavior of klystrons using machine learning, we completed a series of experiments on our klystrons to determine various operational modes and conduct feature extraction and dimensionality reduction to extract the most valuable information about a normal operation. To analyze recorded data we used state-of-the-art data-driven learning techniques and recognized the most promising components that might help us better understand klystron operational states and identify early on possible faults or anomalies.


Beverly Hills plastic surgeon says striking actors using downtime to get new faces

FOX News

Dr. Ben Talei, who was recently publicly thanked by Sia for her facelift, told Fox News Digital he did 'a ton' of facelifts during the height of the actors' strike. During the strike, actors have found themselves with a lot of downtime. In addition to picketing, another popular option, according to Dr. Ben Talei, is getting a cosmetic refresh. The Beverly Hills-based plastic surgeon, who was recently praised by "Chandelier" singer Sia for giving her an "amazing" facelift, explained the mini plastic-surgery boom he has seen in his office. "Before the strike, as rumors were kind of going around that a strike was going to start… I began getting consults and I started getting lots of text messages from friends and friends of friends in Hollywood," Talei told Fox News Digital.


How Artificial Intelligence Is Revolutionizing the Packaging Industry? - The Data Scientist

#artificialintelligence

Artificial Intelligence is shaping how businesses work and enhancing their capacity to thrive. In recent years we have seen many developments that are both awe-inspiring and highly useful. AI is at work in almost every industry, including food, cosmetics, wood, and medicine. Every business requires packaging for its products, which is what gives the packaging manufacturing industry its value. With this in mind, AI is playing an impressive role in the advancement of the packaging industry too. Artificial intelligence is transforming the way the packaging industry works.