AITopics | Dang, Yingnong

Collaborating Authors

Dang, Yingnong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Enabling Autonomic Microservice Management through Self-Learning Agents

Yu, Fenglin, Yang, Fangkai, Qin, Xiaoting, Zhang, Zhiyang, Zhang, Jue, Lin, Qingwei, Zhang, Hongyu, Dang, Yingnong, Rajmohan, Saravan, Zhang, Dongmei, Zhang, Qi

arXiv.org Artificial IntelligenceJan-31-2025

The increasing complexity of modern software systems necessitates robust autonomic self-management capabilities. While Large Language Models (LLMs) demonstrate potential in this domain, they often face challenges in adapting their general knowledge to specific service contexts. To address this limitation, we propose ServiceOdyssey, a self-learning agent system that autonomously manages microservices without requiring prior knowledge of service-specific configurations. By leveraging curriculum learning principles and iterative exploration, ServiceOdyssey progressively develops a deep understanding of operational environments, reducing dependence on human input or static documentation. A prototype built with the Sock Shop microservice demonstrates the potential of this approach for autonomic microservice management.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.19056

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Workflow (0.98)
Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Skeleton-Guided-Translation: A Benchmarking Framework for Code Repository Translation with Fine-Grained Quality Evaluation

Zhang, Xing, Wen, Jiaheng, Yang, Fangkai, Zhao, Pu, Kang, Yu, Wang, Junhao, Wang, Maoquan, Huang, Yufan, Nallipogu, Elsie, Lin, Qingwei, Dang, Yingnong, Rajmohan, Saravan, Zhang, Dongmei, Zhang, Qi

arXiv.org Artificial IntelligenceJan-27-2025

The advancement of large language models has intensified the need to modernize enterprise applications and migrate legacy systems to secure, versatile languages. However, existing code translation benchmarks primarily focus on individual functions, overlooking the complexities involved in translating entire repositories, such as maintaining inter-module coherence and managing dependencies. While some recent repository-level translation benchmarks attempt to address these challenges, they still face limitations, including poor maintainability and overly coarse evaluation granularity, which make them less developer-friendly. We introduce Skeleton-Guided-Translation, a framework for repository-level Java to C# code translation with fine-grained quality evaluation. It uses a two-step process: first translating the repository's structural "skeletons", then translating the full repository guided by these skeletons. Building on this, we present TRANSREPO-BENCH, a benchmark of high quality open-source Java repositories and their corresponding C# skeletons, including matching unit tests and build configurations. Our unit tests are fixed and can be applied across multiple or incremental translations without manual adjustments, enhancing automation and scalability in evaluations. Additionally, we develop fine-grained evaluation metrics that assess translation quality at the individual test case level, addressing traditional binary metrics' inability to distinguish when build failures cause all tests to fail. Evaluations using TRANSREPO-BENCH highlight key challenges and advance more accurate repository level code translation.

large language model, machine learning, translation, (20 more...)

arXiv.org Artificial Intelligence

2501.1605

Country: North America > Canada (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale

Zhang, Linghao, Wang, Junhao, He, Shilin, Zhang, Chaoyun, Kang, Yu, Li, Bowen, Wen, Jiaheng, Xie, Chengxing, Wang, Maoquan, Huang, Yufan, Nallipogu, Elsie, Lin, Qingwei, Dang, Yingnong, Rajmohan, Saravan, Zhang, Dongmei, Zhang, Qi

arXiv.org Artificial IntelligenceJan-23-2025

Large Language Models have advanced automated software development, however, it remains a challenge to correctly infer dependencies, namely, identifying the internal components and external packages required for a repository to successfully run. Existing studies highlight that dependency-related issues cause over 40\% of observed runtime errors on the generated repository. To address this, we introduce DI-BENCH, a large-scale benchmark and evaluation framework specifically designed to assess LLMs' capability on dependency inference. The benchmark features 581 repositories with testing environments across Python, C#, Rust, and JavaScript. Extensive experiments with textual and execution-based metrics reveal that the current best-performing model achieves only a 42.9% execution pass rate, indicating significant room for improvement. DI-BENCH establishes a new viewpoint for evaluating LLM performance on repositories, paving the way for more robust end-to-end software synthesis.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.13699

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

Add feedback

Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction

Li, Haozhe, Ma, Minghua, Liu, Yudong, Zhao, Pu, Zheng, Lingling, Li, Ze, Dang, Yingnong, Chintalapati, Murali, Rajmohan, Saravan, Lin, Qingwei, Zhang, Dongmei

arXiv.org Artificial IntelligenceJan-7-2024

With the rapid growth of cloud computing, a variety of software services have been deployed in the cloud. To ensure the reliability of cloud services, prior studies focus on failure instance (disk, node, and switch, etc.) prediction. Once the output of prediction is positive, mitigation actions are taken to rapidly resolve the underlying failure. According to our real-world practice in Microsoft Azure, we find that the prediction accuracy may decrease by about 9% after retraining the models. Considering that the mitigation actions may result in uncertain positive instances since they cannot be verified after mitigation, which may introduce more noise while updating the prediction model. To the best of our knowledge, we are the first to identify this Uncertain Positive Learning (UPLearning) issue in the real-world cloud failure prediction scenario. To tackle this problem, we design an Uncertain Positive Learning Risk Estimator (Uptake) approach. Using two real-world datasets of disk failure prediction and conducting node prediction experiments in Microsoft Azure, which is a top-tier cloud provider that serves millions of users, we demonstrate Uptake can significantly improve the failure prediction accuracy by 5% on average.

cloud computing, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2402.00034

Country:

North America > United States > Virginia (0.14)
Europe > Spain > Catalonia (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Xpert: Empowering Incident Management with Query Recommendations via Large Language Models

Jiang, Yuxuan, Zhang, Chaoyun, He, Shilin, Yang, Zhihao, Ma, Minghua, Qin, Si, Kang, Yu, Dang, Yingnong, Rajmohan, Saravan, Lin, Qingwei, Zhang, Dongmei

arXiv.org Artificial IntelligenceDec-19-2023

Large-scale cloud systems play a pivotal role in modern IT infrastructure. However, incidents occurring within these systems can lead to service disruptions and adversely affect user experience. To swiftly resolve such incidents, on-call engineers depend on crafting domain-specific language (DSL) queries to analyze telemetry data. However, writing these queries can be challenging and time-consuming. This paper presents a thorough empirical study on the utilization of queries of KQL, a DSL employed for incident management in a large-scale cloud management system at Microsoft. The findings obtained underscore the importance and viability of KQL queries recommendation to enhance incident management. Building upon these valuable insights, we introduce Xpert, an end-to-end machine learning framework that automates KQL recommendation process. By leveraging historical incident data and large language models, Xpert generates customized KQL queries tailored to new incidents. Furthermore, Xpert incorporates a novel performance metric called Xcore, enabling a thorough evaluation of query quality from three comprehensive perspectives. We conduct extensive evaluations of Xpert, demonstrating its effectiveness in offline settings. Notably, we deploy Xpert in the real production environment of a large-scale incident management system in Microsoft, validating its efficiency in supporting incident management. To the best of our knowledge, this paper represents the first empirical study of its kind, and Xpert stands as a pioneering DSL query recommendation framework designed for incident management.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2312.11988

Country:

Asia (0.68)
North America > United States > Michigan (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Services (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback