tsg
Agentic Troubleshooting Guide Automation for Incident Management
Mao, Jiayi, Li, Liqun, Gao, Yanjie, Peng, Zegang, He, Shilin, Zhang, Chaoyun, Qin, Si, Khalid, Samia, Lin, Qingwei, Rajmohan, Saravan, Lanka, Sitaram, Zhang, Dongmei
Effective incident management in large-scale IT systems relies on troubleshooting guides (TSGs), but their manual execution is slow and error-prone. While recent advances in LLMs offer promise for automating incident management tasks, existing LLM-based solutions lack specialized support for several key challenges, including managing TSG quality issues, interpreting complex control flow, handling data-intensive queries, and exploiting execution parallelism. We first conducted an empirical study on 92 real-world TSGs, and, guided by our findings, we present StepFly, a novel end-to-end agentic framework for troubleshooting guide automation. Our approach features a three-stage workflow: the first stage provides a comprehensive guide together with a tool, TSG Mentor, to assist SREs in improving TSG quality; the second stage performs offline preprocessing using LLMs to extract structured execution DAGs from unstructured TSGs and to create dedicated Query Preparation Plugins (QPPs); and the third stage executes online using a DAG-guided scheduler-executor framework with a memory system to guarantee correct workflow and support parallel execution of independent steps. Our empirical evaluation on a collection of real-world TSGs and incidents demonstrates that StepFly achieves a ~94% success rate on GPT-4.1, outperforming baselines with less time and token consumption. Furthermore, it achieves a remarkable execution time reduction of 32.9% to 70.4% for parallelizable TSGs.
- Asia > China (0.77)
- North America > United States > New York > New York County > New York City (0.05)
- Workflow (1.00)
- Research Report > New Finding (1.00)
Automated Root Cause Analysis System for Complex Data Products
Demarne, Mathieu, Cilimdzic, Miso, Falkowski, Tom, Johnson, Timothy, Gramling, Jim, Kuang, Wei, Hou, Hoobie, Aryan, Amjad, Subramaniam, Gayatri, Lee, Kenny, Mejia, Manuel, Liu, Lisa, Vermareddy, Divya
We present ARCAS (Automated Root Cause Analysis System), a diagnostic platform based on a Domain Specific Language (DSL) built for fast diagnostic implementation and low learning curve. Arcas is composed of a constellation of automated troubleshooting guides (Auto-TSGs) that can execute in parallel to detect issues using product telemetry and apply mitigation in near-real-time. The DSL is tailored specifically to ensure that subject matter experts can deliver highly curated and relevant Auto-TSGs in a short time without having to understand how they will interact with the rest of the diagnostic platform, thus reducing time-to-mitigate and saving crucial engineering cycles when they matter most. This contrasts with platforms like Datadog and New Relic, which primarily focus on monitoring and require manual intervention for mitigation. ARCAS uses a Large Language Model (LLM) to prioritize Auto-TSGs outputs and take appropriate actions, thus suppressing the costly requirement of understanding the general behavior of the system. We explain the key concepts behind ARCAS and demonstrate how it has been successfully used for multiple products across Azure Synapse Analytics and Microsoft Fabric Synapse Data Warehouse.
CIKAN: Constraint Informed Kolmogorov-Arnold Networks for Autonomous Spacecraft Rendezvous using Time Shift Governor
Kim, Taehyeun, Girard, Anouck, Kolmanovsky, Ilya
The paper considers a Constrained-Informed Neural Network (CINN) approximation for the Time Shift Governor (TSG), which is an add-on scheme to the nominal closed-loop system used to enforce constraints by time-shifting the reference trajectory in spacecraft rendezvous applications. We incorporate Kolmogorov-Arnold Networks (KANs), an emerging architecture in the AI community, as a fundamental component of CINN and propose a Constrained-Informed Kolmogorov-Arnold Network (CIKAN)-based approximation for TSG. We demonstrate the effectiveness of the CIKAN-based TSG through simulations of constrained spacecraft rendezvous missions on highly elliptic orbits and present comparisons between CIKANs, MLP-based CINNs, and the conventional TSG.
Time Step Generating: A Universal Synthesized Deepfake Image Detector
Zeng, Ziyue, Liu, Haoyuan, Peng, Dingjie, Jing, Luoxu, Watanabe, Hiroshi
Currently, high-fidelity text-to-image models are developed in an accelerating pace. Among them, Diffusion Models have led to a remarkable improvement in the quality of image generation, making it vary challenging to distinguish between real and synthesized images. It simultaneously raises serious concerns regarding privacy and security. Some methods are proposed to distinguish the diffusion model generated images through reconstructing. However, the inversion and denoising processes are time-consuming and heavily reliant on the pre-trained generative model. Consequently, if the pre-trained generative model meet the problem of out-of-domain, the detection performance declines. To address this issue, we propose a universal synthetic image detector Time Step Generating (TSG), which does not rely on pre-trained models' reconstructing ability, specific datasets, or sampling algorithms. Our method utilizes a pre-trained diffusion model's network as a feature extractor to capture fine-grained details, focusing on the subtle differences between real and synthetic images. By controlling the time step t of the network input, we can effectively extract these distinguishing detail features. Then, those features can be passed through a classifier (i.e. Resnet), which efficiently detects whether an image is synthetic or real. We test the proposed TSG on the large-scale GenImage benchmark and it achieves significant improvements in both accuracy and generalizability.
Time Shift Governor for Constrained Control of Spacecraft Orbit and Attitude Relative Motion in Bicircular Restricted Four-Body Problem
Kim, Taehyeun, Kolmanovsky, Ilya, Girard, Anouck
This paper considers constrained spacecraft rendezvous and docking (RVD) in the setting of the Bicircular Restricted Four-Body Problem (BCR4BP), while accounting for attitude dynamics. We consider Line of Sight (LoS) cone constraints, thrust limits, thrust direction limits, and approach velocity constraints during RVD missions in a near rectilinear halo orbit (NRHO) in the Sun-Earth-Moon system. To enforce the constraints, the Time Shift Governor (TSG), which uses a time-shifted Chief spacecraft trajectory as a target reference for the Deputy spacecraft, is employed. The time shift is gradually reduced to zero so that the virtual target gradually evolves towards the Chief spacecraft as time goes by, and the RVD mission objective can be achieved. Numerical simulation results are reported to validate the proposed control method.
- South America > Peru > Cusco Department > Cusco Province > Cusco (0.04)
- North America > United States > Texas > Bexar County > San Antonio (0.04)
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Sadat, Seyedmorteza, Kansy, Manuel, Hilliges, Otmar, Weber, Romann M.
Classifier-free guidance (CFG) has become the standard method for enhancing the quality of conditional diffusion models. However, employing CFG requires either training an unconditional model alongside the main diffusion model or modifying the training procedure by periodically inserting a null condition. There is also no clear extension of CFG to unconditional models. In this paper, we revisit the core principles of CFG and introduce a new method, independent condition guidance (ICG), which provides the benefits of CFG without the need for any special training procedures. Our approach streamlines the training process of conditional diffusion models and can also be applied during inference on any pre-trained conditional model. Additionally, by leveraging the time-step information encoded in all diffusion networks, we propose an extension of CFG, called time-step guidance (TSG), which can be applied to any diffusion model, including unconditional ones. Our guidance techniques are easy to implement and have the same sampling cost as CFG. Through extensive experiments, we demonstrate that ICG matches the performance of standard CFG across various conditional diffusion models. Moreover, we show that TSG improves generation quality in a manner similar to CFG, without relying on any conditional information.
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Austria (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (5 more...)
Enhancing Video-Language Representations with Structural Spatio-Temporal Alignment
Fei, Hao, Wu, Shengqiong, Zhang, Meishan, Zhang, Min, Chua, Tat-Seng, Yan, Shuicheng
While pre-training large-scale video-language models (VLMs) has shown remarkable potential for various downstream video-language tasks, existing VLMs can still suffer from certain commonly seen limitations, e.g., coarse-grained cross-modal aligning , under-modeling of temporal dynamics, detached video-language view. In this work, we target enhancing VLMs with a fine-grained structural spatio-temporal alignment learning method (namely Finsta). First of all, we represent the input texts and videos with fine-grained scene graph (SG) structures, both of which are further unified into a holistic SG (HSG) for bridging two modalities. Then, an SG-based framework is built, where the textual SG (TSG) is encoded with a graph Transformer, while the video dynamic SG (DSG) and the HSG are modeled with a novel recurrent graph Transformer for spatial and temporal feature propagation. A spatial-temporal Gaussian differential graph Transformer is further devised to strengthen the sense of the changes in objects across spatial and temporal dimensions. Next, based on the fine-grained structural features of TSG and DSG, we perform object-centered spatial alignment and predicate-centered temporal alignment respectively, enhancing the video-language grounding in both the spatiality and temporality. We design our method as a plug&play system, which can be integrated into existing well-trained VLMs for further representation augmentation, without training from scratch or relying on SG annotations in downstream applications. On 6 representative VL modeling tasks over 12 datasets in both standard and long-form video scenarios, Finsta consistently improves the existing 13 strong-performing VLMs persistently, and refreshes the current state-of-the-art end task performance significantly in both the fine-tuning and zero-shot settings.
- Asia > Singapore > Central Region > Singapore (0.04)
- Asia > China > Heilongjiang Province > Harbin (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides
An, Kaikai, Yang, Fangkai, Lu, Junting, Li, Liqun, Ren, Zhixing, Huang, Hao, Wang, Lu, Zhao, Pu, Kang, Yu, Ding, Hua, Lin, Qingwei, Rajmohan, Saravan, Zhang, Dongmei, Zhang, Qi
Effective incident management is pivotal for the smooth To investigate the effect of TSGs on incident mitigation, we analyze operation of Microsoft cloud services. In order to expedite incident around 1000 high-severity incidents in the recent twelve months mitigation, service teams gather troubleshooting knowledge into that demand immediate intervention from OCEs. Consistent with Troubleshooting Guides (TSGs) accessible to On-Call Engineers findings from prior studies [8, 18, 9], which demonstrate the efficacy (OCEs). While automated pipelines are enabled to resolve the most of TSGs in incident mitigation. We found that incidents paired with frequent and easy incidents, there still exist complex incidents that TSGs exhibit a 60% shorter average time-to-mitigate (TTM) compared require OCEs' intervention. In addition, TSGs are often unstructured to those without TSGs, emphasizing the pivotal role played and incomplete, which requires manual interpretation by OCEs, leading by TSGs. This trend is consistent across various companies, as evidenced to on-call fatigue and decreased productivity, especially among by research [14, 10], even among those employing different new-hire OCEs. In this work, we propose Nissist which leverages forms of TSGs. However, despite their utility, as highlighted by unstructured TSGs and incident mitigation history to provide proactive [18, 2], the unstructured format, varying quantity, and propensity for incident mitigation suggestions, reducing human intervention.
Grounding-Prompter: Prompting LLM with Multimodal Information for Temporal Sentence Grounding in Long Videos
Chen, Houlun, Wang, Xin, Chen, Hong, Song, Zihan, Jia, Jia, Zhu, Wenwu
Temporal Sentence Grounding (TSG), which aims to localize moments from videos based on the given natural language queries, has attracted widespread attention. Existing works are mainly designed for short videos, failing to handle TSG in long videos, which poses two challenges: i) complicated contexts in long videos require temporal reasoning over longer moment sequences, and ii) multiple modalities including textual speech with rich information require special designs for content understanding in long videos. To tackle these challenges, in this work we propose a Grounding-Prompter method, which is capable of conducting TSG in long videos through prompting LLM with multimodal information. In detail, we first transform the TSG task and its multimodal inputs including speech and visual, into compressed task textualization. Furthermore, to enhance temporal reasoning under complicated contexts, a Boundary-Perceptive Prompting strategy is proposed, which contains three folds: i) we design a novel Multiscale Denoising Chain-of-Thought (CoT) to combine global and local semantics with noise filtering step by step, ii) we set up validity principles capable of constraining LLM to generate reasonable predictions following specific formats, and iii) we introduce one-shot In-Context-Learning (ICL) to boost reasoning through imitation, enhancing LLM in TSG task understanding. Experiments demonstrate the state-of-the-art performance of our Grounding-Prompter method, revealing the benefits of prompting LLM with multimodal information for TSG in long videos.
System-Initiated Transitions from Chit-Chat to Task-Oriented Dialogues with Transition Info Extractor and Transition Sentence Generator
Liu, Ye, Ultes, Stefan, Minker, Wolfgang, Maier, Wolfgang
In this work, we study dialogue scenarios that start from chit-chat but eventually switch to task-related services, and investigate how a unified dialogue model, which can engage in both chit-chat and task-oriented dialogues, takes the initiative during the dialogue mode transition from chit-chat to task-oriented in a coherent and cooperative manner. We firstly build a {transition info extractor} (TIE) that keeps track of the preceding chit-chat interaction and detects the potential user intention to switch to a task-oriented service. Meanwhile, in the unified model, a {transition sentence generator} (TSG) is extended through efficient Adapter tuning and transition prompt learning. When the TIE successfully finds task-related information from the preceding chit-chat, such as a transition domain, then the TSG is activated automatically in the unified model to initiate this transition by generating a transition sentence under the guidance of transition information extracted by TIE. The experimental results show promising performance regarding the proactive transitions. We achieve an additional large improvement on TIE model by utilizing Conditional Random Fields (CRF). The TSG can flexibly generate transition sentences while maintaining the unified capabilities of normal chit-chat and task-oriented response generation.
- Europe > Germany (0.04)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)