oce
Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides
An, Kaikai, Yang, Fangkai, Lu, Junting, Li, Liqun, Ren, Zhixing, Huang, Hao, Wang, Lu, Zhao, Pu, Kang, Yu, Ding, Hua, Lin, Qingwei, Rajmohan, Saravan, Zhang, Dongmei, Zhang, Qi
Effective incident management is pivotal for the smooth operation of Microsoft cloud services. In order to expedite incident mitigation, service teams gather troubleshooting knowledge into Troubleshooting Guides (TSGs) accessible to On-Call Engineers (OCEs). While automated pipelines are enabled to resolve the most frequent and easy incidents, there still exist complex incidents that require OCEs' intervention. In addition, TSGs are often unstructured and incomplete, which requires manual interpretation by OCEs, leading to on-call fatigue and decreased productivity, especially among new-hire OCEs. In this work, we propose Nissist which leverages unstructured TSGs and incident mitigation history to provide proactive incident mitigation suggestions, reducing human intervention.
To investigate the effect of TSGs on incident mitigation, we analyze around 1000 high-severity incidents in the recent twelve months that demand immediate intervention from OCEs. Consistent with findings from prior studies [8, 18, 9], which demonstrate the efficacy of TSGs in incident mitigation, we found that incidents paired with TSGs exhibit a 60% shorter average time-to-mitigate (TTM) compared to those without TSGs, emphasizing the pivotal role played by TSGs. This trend is consistent across various companies, as evidenced by research [14, 10], even among those employing different forms of TSGs. However, despite their utility, as highlighted by [18, 2], the unstructured format, varying quantity, and propensity for ...
Exploring LLM-based Agents for Root Cause Analysis
Roy, Devjeet, Zhang, Xuchao, Bhave, Rashi, Bansal, Chetan, Las-Casas, Pedro, Fonseca, Rodrigo, Rajmohan, Saravan
The growing complexity of cloud-based software systems has resulted in incident management becoming an integral part of the software development lifecycle. Root cause analysis (RCA), a critical part of the incident management process, is a demanding task for on-call engineers, requiring deep domain knowledge and extensive experience with a team's specific services. Automation of RCA can result in significant savings of time, and ease the burden of incident management on on-call engineers. Recently, researchers have utilized Large Language Models (LLMs) to perform RCA, and have demonstrated promising results. However, these approaches are not able to dynamically collect additional diagnostic information such as incident-related logs, metrics or databases, severely restricting their ability to diagnose root causes. In this work, we explore the use of LLM-based agents for RCA to address this limitation. We present a thorough empirical evaluation of a ReAct agent equipped with retrieval tools, on an out-of-distribution dataset of production incidents collected at Microsoft. Results show that ReAct performs competitively with strong retrieval and reasoning baselines, but with highly increased factual accuracy. We then extend this evaluation by incorporating discussions associated with incident reports as additional inputs for the models, which surprisingly does not yield significant performance improvements. Lastly, we conduct a case study with a team at Microsoft to equip the ReAct agent with tools that give it access to external diagnostic services that are used by the team for manual RCA. Our results show how agents can overcome the limitations of prior work, and highlight practical considerations for implementing such a system.
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Washington > King County > Redmond (0.04)
Unsupervised Learning of Object-Centric Embeddings for Cell Instance Segmentation in Microscopy Images
Wolf, Steffen, Lalit, Manan, Westmacott, Henry, McDole, Katie, Funke, Jan
Segmentation of objects in microscopy images is required for many biomedical applications. We introduce object-centric embeddings (OCEs), which embed image patches such that the spatial offsets between patches cropped from the same object are preserved. Those learnt embeddings can be used to delineate individual objects and thus obtain instance segmentations. Here, we show theoretically that, under assumptions commonly found in microscopy images, OCEs can be learnt through a self-supervised task that predicts the spatial offset between image patches. Together, this forms an unsupervised cell instance segmentation method which we evaluate on nine diverse large-scale microscopy datasets. Segmentations obtained with our method lead to substantially improved results, compared to state-of-the-art baselines on six out of nine datasets, and perform on par on the remaining three datasets. If ground-truth annotations are available, our method serves as an excellent starting point for supervised training, reducing the required amount of ground-truth needed by one order of magnitude, thus substantially increasing the practical applicability of our method. Source code is available at https://github.com/funkelab/cellulus.
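The offset-preservation idea in this abstract can be illustrated with a small sketch. It assumes a plain squared-error penalty between embedding offsets and spatial offsets; the function name `offset_preservation_loss` and this particular loss form are illustrative assumptions, not the loss used in the paper or in the cellulus code.

```python
def offset_preservation_loss(positions, embeddings):
    """Mean squared mismatch between embedding offsets and spatial offsets.

    positions:  list of (row, col) patch centres, assumed to lie on one object
    embeddings: list of 2-D embedding vectors, one per patch
    Hypothetical helper for illustration only.
    """
    n = len(positions)
    total, pairs = 0.0, 0
    for i in range(n):
        for j in range(i + 1, n):
            # Compare (e_i - e_j) with (p_i - p_j), coordinate by coordinate.
            for p_i, p_j, e_i, e_j in zip(positions[i], positions[j],
                                          embeddings[i], embeddings[j]):
                total += ((e_i - e_j) - (p_i - p_j)) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


positions = [(0, 0), (3, 4), (1, 2)]
# Embeddings that equal the positions up to a constant shift preserve all offsets.
shifted = [(r + 10, c - 5) for r, c in positions]
# Embeddings collapsed to a single point preserve none of them.
collapsed = [(0, 0)] * 3

print(offset_preservation_loss(positions, shifted))
print(offset_preservation_loss(positions, collapsed))
```

Under this sketch, an embedding is only determined up to a constant shift per object, which is what lets patches from the same object be grouped together.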
Regret Bounds for Markov Decision Processes with Recursive Optimized Certainty Equivalents
Xu, Wenhao, Gao, Xuefeng, He, Xuedong
Reinforcement learning (RL) studies the problem of sequential decision making in an unknown environment by carefully balancing between exploration and exploitation (Sutton and Barto 2018). In the classical setting, it describes how an agent takes actions to maximize expected cumulative rewards in an environment typically modeled by a Markov decision process (MDP, Puterman (2014)). However, optimizing the expected cumulative rewards alone is often not sufficient in many practical applications such as finance, healthcare and robotics. Hence, it may be necessary to take into account the risk preferences of the agent in the dynamic decision process. Indeed, a rich body of literature has studied risk-sensitive (and safe) RL, incorporating risk measures such as the entropic risk measure and conditional value-at-risk (CVaR) in the decision criterion, see, e.g., Shen et al. (2014), García and Fernández (2015), Tamar et al. (2016), Chow et al. (2017), Prashanth L and Fu (2018), Fei et al. (2020) and the references therein. In this paper we study risk-sensitive RL for tabular MDPs with unknown transition probabilities in the finite-horizon, episodic setting, where an agent interacts with the MDP in episodes of a fixed length with finite state and action spaces. To incorporate risk sensitivity, we consider a broad and important class of risk measures known as Optimized Certainty Equivalent (OCE, (Ben-Tal and Teboulle 1986, 2007)). The OCE is a (nonlinear) risk function which maps a random variable X to a real value, and it depends on a concave utility function, see Equation (1) for the definition.
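For readers without the paper at hand, the standard Ben-Tal and Teboulle form of the OCE is OCE_u(X) = sup over λ of { λ + E[u(X − λ)] }; the sketch below assumes the paper's Equation (1) follows that form. It estimates the OCE from samples with a ternary search over λ, which works because the objective is concave in λ when u is concave. The names `empirical_oce`, `cvar_utility`, and `entropic_utility` are illustrative, not taken from the paper.

```python
import math


def empirical_oce(samples, utility, iters=200):
    """Sample estimate of sup_lam { lam + mean(utility(x - lam)) }.

    Assumes `utility` is concave and nondecreasing with utility(0) == 0 and
    slope 1 at 0, so the maximiser lies between min(samples) and max(samples).
    """
    xs = [float(x) for x in samples]
    n = len(xs)

    def objective(lam):
        return lam + sum(utility(x - lam) for x in xs) / n

    lo, hi = min(xs), max(xs)
    for _ in range(iters):  # ternary search on a concave objective
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


def cvar_utility(alpha):
    # u(t) = min(t, 0) / alpha gives CVaR at level alpha (mean of the worst rewards).
    return lambda t: min(t, 0.0) / alpha


def entropic_utility(gamma):
    # u(t) = (1 - exp(-gamma * t)) / gamma gives the entropic risk measure.
    return lambda t: (1.0 - math.exp(-gamma * t)) / gamma


rewards = [1.0, 2.0, 3.0, 4.0]
print(empirical_oce(rewards, lambda t: t))          # risk-neutral: the sample mean
print(empirical_oce(rewards, cvar_utility(0.5)))    # mean of the two lowest rewards
print(empirical_oce(rewards, entropic_utility(1.0)))
```

The three utilities show how one definition covers the expectation, CVaR, and the entropic risk measure mentioned in the abstract.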
GitHub - Oloren-AI/olorenchemengine: OCE is the first infinitely composable library for reproducibly implementing SOTA molecular property prediction/QSAR techniques.
This abstraction system is provided free of charge by Oloren AI in the internals. In a fresh Python 3.8 environment, you can install the package with the provided install script; feel free to check out install.sh to see what is happening under the hood. This will work fine in both a conda environment and a pip environment. A fresh environment is preferred because PyTorch Geometric, PyTorch, and CUDA are very particular about versioning, which is often more muddled in an existing environment. Alternatively, you can also run OCE from one of our Docker images.
Supervised Learning with General Risk Functionals
Leqi, Liu, Huang, Audrey, Lipton, Zachary C., Azizzadenesheli, Kamyar
Standard uniform convergence results bound the generalization gap of the expected loss over a hypothesis class. The emergence of risk-sensitive learning requires generalization guarantees for functionals of the loss distribution beyond the expectation. While prior works specialize in uniform convergence of particular functionals, our work provides uniform convergence for a general class of Hölder risk functionals for which the closeness in the Cumulative Distribution Function (CDF) entails closeness in risk. We establish the first uniform convergence results for estimating the CDF of the loss distribution, yielding guarantees that hold simultaneously both over all Hölder risk functionals and over all hypotheses. Thus licensed to perform empirical risk minimization, we develop practical gradient-based methods for minimizing distortion risks (a widely studied subset of Hölder risks that subsumes the spectral risks, including the mean, conditional value at risk, cumulative prospect theory risks, and others) and provide convergence guarantees. In experiments, we demonstrate the efficacy of our learning procedure, both in settings where uniform convergence results hold and in high-dimensional settings with deep networks.
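As a rough illustration of how a distortion risk can be estimated from the empirical CDF, the sketch below uses the common textbook form in which a nondecreasing distortion function g with g(0) = 0 and g(1) = 1 reweights the sorted losses. The function names and the choice of this particular estimator are assumptions made for illustration; the paper's own estimator and optimization method may differ.

```python
def empirical_distortion_risk(losses, distortion):
    """Plug-in distortion risk: sum_i z_(i) * [g((n-i+1)/n) - g((n-i)/n)].

    `losses` are sorted ascending as z_(1) <= ... <= z_(n); `distortion` is g.
    The weights come from applying g to the empirical survival function.
    """
    z = sorted(losses)
    n = len(z)
    risk = 0.0
    for i, z_i in enumerate(z, start=1):
        weight = distortion((n - i + 1) / n) - distortion((n - i) / n)
        risk += weight * z_i
    return risk


def cvar_distortion(alpha):
    # g(u) = min(u / alpha, 1) puts all weight on the worst alpha-fraction of losses.
    return lambda u: min(u / alpha, 1.0)


losses = [4.0, 1.0, 3.0, 2.0]
print(empirical_distortion_risk(losses, lambda u: u))            # the mean
print(empirical_distortion_risk(losses, cvar_distortion(0.5)))   # mean of the two largest losses
```

Swapping the distortion function changes the risk functional without touching the estimator, which mirrors the abstract's point that closeness in CDF controls a whole family of risks at once.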
- North America > United States > Virginia (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
Learning Bounds for Risk-sensitive Learning
Lee, Jaeho, Park, Sejun, Shin, Jinwoo
The systematic minimization of the quantifiable uncertainty, or risk [22], is one of the core objectives in all disciplines involving decision-making, e.g., economics and finance. Within machine learning contexts, strategies for risk-aversion have been most actively studied under sequential decision-making and reinforcement learning frameworks [21, 8], giving birth to a number of algorithms based on Markov decision processes (MDPs) and multi-armed bandits. In those works, various risk-averse measures of loss have been used as a minimization objective, instead of the risk-neutral expected loss; popular risk measures include entropic risk [21, 6, 7], mean-variance [39, 13, 28], and a slightly more modern alternative known as conditional value-at-risk (CVaR [15, 10, 42]). Yet, with growing interest in the societal impacts of machine intelligence, the importance of risk-aversion under non-sequential scenarios has also been spotlighted recently. For instance, Williamson and Menon [45] give an axiomatic characterization of the fairness risk measures, and propose a convex fairness-aware objective based on CVaR.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Actionable Interpretability through Optimizable Counterfactual Explanations for Tree Ensembles
Lucic, Ana, Oosterhuis, Harrie, Haned, Hinda, de Rijke, Maarten
Counterfactual explanations help users understand why machine learned models make certain decisions, and more specifically, how these decisions can be changed. In this work, we frame the problem of finding counterfactual explanations -- the minimal perturbation to an input such that the prediction changes -- as an optimization task. Previously, optimization techniques for generating counterfactual examples could only be applied to differentiable models, or alternatively via query access to the model by estimating gradients from randomly sampled perturbations. In order to accommodate non-differentiable models such as tree ensembles, we propose using probabilistic model approximations in the optimization framework. We introduce a novel approximation technique that is effective for finding counterfactual explanations while also closely approximating the original model. Our results show that our method is able to produce counterfactual examples that are closer to the original instance in terms of Euclidean, Cosine, and Manhattan distance compared to other methods specifically designed for tree ensembles.
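The idea of making a tree ensemble optimizable by smoothing its hard splits can be sketched on a toy model. The code below assumes an ensemble of one-split stumps that vote by majority, replaces each split indicator with a sigmoid, and runs gradient descent on a smoothed score plus a squared-distance penalty. All names, the stump-only model, and the hyperparameters (`sharpness`, `beta`, `lr`) are illustrative assumptions; this is not the authors' algorithm or its loss.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def exact_score(stumps, x):
    # Fraction of stumps (feature index, threshold) voting for the positive class.
    return sum(1.0 for j, thr in stumps if x[j] > thr) / len(stumps)


def smooth_score_grad(stumps, x, sharpness):
    # Gradient of the sigmoid-smoothed score with respect to x.
    grad = [0.0] * len(x)
    for j, thr in stumps:
        s = sigmoid(sharpness * (x[j] - thr))
        grad[j] += sharpness * s * (1.0 - s) / len(stumps)
    return grad


def find_counterfactual(stumps, x0, sharpness=5.0, beta=0.1, lr=0.1, steps=500):
    original = exact_score(stumps, x0) > 0.5
    sign = 1.0 if original else -1.0  # lower the score if positive, raise it otherwise
    x = list(x0)
    best, best_dist = None, float("inf")
    for _ in range(steps):
        g = smooth_score_grad(stumps, x, sharpness)
        # Descend on: sign * smooth_score(x) + beta * ||x - x0||^2
        x = [xi - lr * (sign * gi + 2.0 * beta * (xi - oi))
             for xi, gi, oi in zip(x, g, x0)]
        # Validity is always checked against the exact (non-smoothed) model.
        if (exact_score(stumps, x) > 0.5) != original:
            dist = math.dist(x, x0)
            if dist < best_dist:
                best, best_dist = list(x), dist
    return best


stumps = [(0, 2.0)]      # one stump: positive iff feature 0 exceeds 2.0
x0 = [1.0, 7.0]          # predicted negative; feature 1 is irrelevant to the model
counterfactual = find_counterfactual(stumps, x0)
print(counterfactual)
```

Only the feature the model actually uses moves, and the search keeps the closest point at which the exact model's prediction flips, which is the "minimal perturbation" notion the abstract describes.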
- Europe > Netherlands > North Holland > Amsterdam (0.05)
- North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Security & Privacy (0.68)
- Law (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Estimation of Individualized Decision Rules Based on an Optimized Covariate-Dependent Equivalent of Random Outcomes
Qi, Zhengling, Cui, Ying, Liu, Yufeng, Pang, Jong-Shi
Recent exploration of optimal individualized decision rules (IDRs) for patients in precision medicine has attracted a lot of attention due to the heterogeneous responses of patients to different treatments. In the existing literature of precision medicine, an optimal IDR is defined as a decision function mapping from the patients' covariate space into the treatment space that maximizes the expected outcome of each individual. Motivated by the concept of Optimized Certainty Equivalent (OCE), introduced originally by Ben-Tal and Teboulle (1986), which includes the popular conditional value-at-risk (CVaR) (Rockafellar and Uryasev, 2000), we propose a decision-rule based optimized covariate-dependent equivalent (CDE) for individualized decision making problems. Our proposed IDR-CDE broadens the existing expected-mean outcome framework in precision medicine and enriches the previous concept of the OCE. Numerical experiments demonstrate that our overall approach outperforms existing methods in estimating optimal IDRs under heavy-tail distributions of the data.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > North Carolina > Orange County > Chapel Hill (0.14)
- North America > United States > District of Columbia > Washington (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.67)