Goto

Collaborating Authors

 production system





Model Selection for Production System via Automated Online Experiments

Neural Information Processing Systems

A challenge that machine learning practitioners in the industry face is the task of selecting the best model to deploy in production. As a model is often an intermediate component of a production system, online controlled experiments such as A/B tests yield the most reliable estimation of the effectiveness of the whole system, but can only compare two or a few models due to budget constraints. We propose an automated online experimentation mechanism that can efficiently perform model selection from a large pool of models with a small number of online experiments. We derive the probability distribution of the metric of interest that contains the model uncertainty from our Bayesian surrogate model trained using historical logs. Our method efficiently identifies the best model by sequentially selecting and deploying a list of models from the candidate set that balance exploration-exploitation. Using simulations based on real data, we demonstrate the effectiveness of our method on two different tasks.


CORGI: Efficient Pattern Matching With Quadratic Guarantees

Weitekamp, Daniel

arXiv.org Artificial Intelligence

Rule-based systems must solve complex matching problems within tight time constraints to be effective in real-time applications, such as planning and reactive control for AI agents, as well as low-latency relational database querying. Pattern-matching systems can encounter issues where exponential time and space are required to find matches for rules with many underconstrained variables, or which produce combinatorial intermediate partial matches (but are otherwise well-constrained). When online AI systems automatically generate rules from example-driven induction or code synthesis, they can easily produce worst-case matching patterns that slow or halt program execution by exceeding available memory. In our own work with cognitive systems that learn from example, we've found that aggressive forms of anti-unification-based generalization can easily produce these circumstances. To make these systems practical without hand-engineering constraints or succumbing to unpredictable failure modes, we introduce a new matching algorithm called CORGI (Collection-Oriented Relational Graph Iteration). Unlike RETE-based approaches, CORGI offers quadratic time and space guarantees for finding single satisficing matches, and the ability to iteratively stream subsequent matches without committing entire conflict sets to memory. CORGI differs from RETE in that it does not have a traditional $β$-memory for collecting partial matches. Instead, CORGI takes a two-step approach: a graph of grounded relations is built/maintained in a forward pass, and an iterator generates matches as needed by working backward through the graph. This approach eliminates the high-latency delays and memory overflows that can result from populating full conflict sets. In a performance evaluation, we demonstrate that CORGI significantly outperforms RETE implementations from SOAR and OPS5 on a simple combinatorial matching task.





Outbound Modeling for Inventory Management

Savorgnan, Riccardo, Ghai, Udaya, Eisenach, Carson, Foster, Dean

arXiv.org Artificial Intelligence

We study the problem of forecasting the number of units fulfilled (or ``drained'') from each inventory warehouse to meet customer demand, along with the associated outbound shipping costs. The actual drain and shipping costs are determined by complex production systems that manage the planning and execution of customers' orders fulfillment, i.e. from where and how to ship a unit to be delivered to a customer. Accurately modeling these processes is critical for regional inventory planning, especially when using Reinforcement Learning (RL) to develop control policies. For the RL usecase, a drain model is incorporated into a simulator to produce long rollouts, which we desire to be differentiable. While simulating the calls to the internal software systems can be used to recover this transition, they are non-differentiable and too slow and costly to run within an RL training environment. Accordingly, we frame this as a probabilistic forecasting problem, modeling the joint distribution of outbound drain and shipping costs across all warehouses at each time period, conditioned on inventory positions and exogenous customer demand. To ensure robustness in an RL environment, the model must handle out-of-distribution scenarios that arise from off-policy trajectories. We propose a validation scheme that leverages production systems to evaluate the drain model on counterfactual inventory states induced by RL policies. Preliminary results demonstrate the model's accuracy within the in-distribution setting.


Maturity Framework for Enhancing Machine Learning Quality

Castelli, Angelantonio, Chouliaras, Georgios Christos, Goldenberg, Dmitri

arXiv.org Artificial Intelligence

With the rapid integration of Machine Learning (ML) in business applications and processes, it is crucial to ensure the quality, reliability and reproducibility of such systems. We suggest a methodical approach towards ML system quality assessment and introduce a structured Maturity framework for governance of ML. We emphasize the importance of quality in ML and the need for rigorous assessment, driven by issues in ML governance and gaps in existing frameworks. Our primary contribution is a comprehensive open-sourced quality assessment method, validated with empirical evidence, accompanied by a systematic maturity framework tailored to ML systems. Drawing from applied experience at Booking.com, we discuss challenges and lessons learned during large-scale adoption within organizations. The study presents empirical findings, highlighting quality improvement trends and showcasing business outcomes. The maturity framework for ML systems, aims to become a valuable resource to reshape industry standards and enable a structural approach to improve ML maturity in any organization.