Collaborating Authors

 Yan, Meng


CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching

arXiv.org Artificial Intelligence

Large language models (LLMs) have significantly advanced autonomous software engineering, leading to a growing number of software engineering agents that assist developers in automatic program repair. Issue localization forms the basis for accurate patch generation. However, because LLM context windows are limited, existing issue localization methods struggle to balance concise yet effective contexts against adequately comprehensive search spaces. In this paper, we introduce CoSIL, a simple yet powerful LLM-driven, function-level issue localization method that requires no training or indexing. CoSIL reduces the search space through module call graphs, iteratively searches the function call graph to obtain relevant contexts, and uses context pruning to control the search direction and manage contexts effectively. Importantly, the call graph is dynamically constructed by the LLM during search, eliminating the need for pre-parsing. Experimental results demonstrate that CoSIL achieves Top-1 localization success rates of 43 percent and 44.6 percent on SWE-bench Lite and SWE-bench Verified, respectively, using Qwen2.5-Coder-32B, outperforming existing methods by 8.6 to 98.2 percent. When CoSIL is applied to guide the patch generation stage, the resolved rate further improves by 9.3 to 31.5 percent.
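The expand-then-prune loop described in this abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from CoSIL: the toy call graph is given up front (the paper has the LLM build it during search), `keyword_score` is a keyword-overlap stand-in for the LLM's relevance judgment, and the names `localize`, `beam`, and `max_depth` are hypothetical.

```python
from typing import Callable, Dict, List, Set


def localize(
    call_graph: Dict[str, List[str]],
    entry_points: List[str],
    score: Callable[[str], float],
    beam: int = 2,
    max_depth: int = 3,
) -> List[str]:
    """Walk a call graph level by level, keeping only the `beam` best callees per step."""
    seen: Set[str] = set(entry_points)
    kept: Set[str] = set(entry_points)
    frontier = list(entry_points)
    for _ in range(max_depth):
        # Expand: gather callees of the current frontier that have not been seen yet.
        candidates: List[str] = []
        for fn in frontier:
            for callee in call_graph.get(fn, []):
                if callee not in seen:
                    seen.add(callee)
                    candidates.append(callee)
        if not candidates:
            break
        # Prune: keep only the highest-scoring candidates so the context stays small.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam]
        kept.update(frontier)
    # Rank the retained functions by relevance; ties are broken by name.
    return sorted(kept, key=lambda fn: (-score(fn), fn))


# Hypothetical toy repository and issue keywords.
graph = {
    "views.render": ["forms.clean", "utils.log"],
    "forms.clean": ["validators.check_email", "utils.strip"],
    "utils.log": ["io.write"],
}
issue_words = {"email", "validators", "clean"}


def keyword_score(name: str) -> float:
    # Stand-in for an LLM relevance call: count name parts that appear in the issue.
    parts = set(name.replace(".", "_").split("_"))
    return float(len(parts & issue_words))


ranked = localize(graph, ["views.render"], keyword_score)
print(ranked[0])
```

In this toy run the pruning step drops `io.write` at the second level, so it never enters the retained context.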


Gaussian-based Probabilistic Deep Supervision Network for Noise-Resistant QoS Prediction

arXiv.org Artificial Intelligence

Quality of Service (QoS) prediction is an essential task in recommendation systems, where accurately predicting unknown QoS values can improve user satisfaction. However, existing QoS prediction techniques may perform poorly in the presence of noisy data, such as fake location information or virtual gateways. In this paper, we propose the Probabilistic Deep Supervision Network (PDS-Net), a novel framework for QoS prediction that addresses this issue. PDS-Net uses a Gaussian-based probabilistic space to supervise intermediate layers and learns probability spaces for both known features and true labels. Moreover, PDS-Net employs a condition-based multitask loss function to identify objects with noisy data, and applies supervision directly to deep features sampled from the probability space by minimizing the Kullback-Leibler divergence between the probability space of these objects and the real-label probability space. PDS-Net thus reduces errors caused by the propagation of corrupted data, leading to more accurate QoS predictions. Experimental evaluations on two real-world QoS datasets demonstrate that the proposed PDS-Net outperforms state-of-the-art baselines, validating the effectiveness of our approach.
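For readers unfamiliar with the quantity being optimized, the Kullback-Leibler divergence between two Gaussians has a closed form. The sketch below assumes diagonal (per-dimension independent) Gaussians and the direction KL(p || q); the abstract specifies neither, and the function name `gaussian_kl` is hypothetical.

```python
import math
from typing import Sequence


def gaussian_kl(
    mu_p: Sequence[float],
    sigma_p: Sequence[float],
    mu_q: Sequence[float],
    sigma_q: Sequence[float],
) -> float:
    """KL(p || q) for diagonal Gaussians, summed over dimensions.

    Per dimension: log(sigma_q / sigma_p)
                   + (sigma_p^2 + (mu_p - mu_q)^2) / (2 * sigma_q^2) - 1/2
    """
    total = 0.0
    for mp, sp, mq, sq in zip(mu_p, sigma_p, mu_q, sigma_q):
        total += math.log(sq / sp) + (sp**2 + (mp - mq) ** 2) / (2 * sq**2) - 0.5
    return total


# Identical distributions have zero divergence; shifting the mean increases it.
same = gaussian_kl([0.0], [1.0], [0.0], [1.0])
shifted = gaussian_kl([1.0], [1.0], [0.0], [1.0])
print(same, shifted)
```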


A Dual Latent State Learning Approach: Exploiting Regional Network Similarities for QoS Prediction

arXiv.org Artificial Intelligence

Individual objects, whether users or services, within a specific region often exhibit similar network states because they share the same city or autonomous system (AS). Many existing techniques overlook this regional network similarity, resulting in subpar performance under data sparsity and label imbalance. In this paper, we introduce the region-based dual latent state learning network (R2SL), a novel deep learning framework designed to overcome the pitfalls of traditional individual-object-based techniques in Quality of Service (QoS) prediction. Unlike its predecessors, R2SL captures regional network behavior by deriving two distinct regional network latent states: the city-network latent state and the AS-network latent state. These states are constructed from data aggregated over common regions rather than from individual object data. Furthermore, R2SL adopts an enhanced Huber loss function that adjusts its linear loss component to mitigate label imbalance. Finally, a multi-scale perception network interprets the integrated feature map, a fusion of regional network latent features and other pertinent information, to produce the QoS prediction. In tests on real-world QoS datasets, R2SL outperforms prevailing state-of-the-art methods. R2SL thus opens a new avenue for precise QoS prediction by fully exploiting the regional network similarities inherent in objects.
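The abstract does not say how the linear component of the Huber loss is adjusted. As a rough illustration only, the sketch below starts from the standard Huber loss and adds a hypothetical `slope` factor that rescales the linear branch while keeping the loss continuous at the threshold `delta`. With `slope=1.0` it reduces to the standard Huber loss; it should not be read as R2SL's actual formulation.

```python
def scaled_huber(residual: float, delta: float = 1.0, slope: float = 1.0) -> float:
    """Huber-style loss with an adjustable linear branch (hypothetical variant).

    |r| <= delta: 0.5 * r^2                                  (quadratic region)
    |r| >  delta: 0.5 * delta^2 + slope * delta * (|r| - delta)  (linear region)
    """
    abs_r = abs(residual)
    if abs_r <= delta:
        return 0.5 * abs_r**2
    # The constant term keeps the two branches equal at |r| == delta.
    return 0.5 * delta**2 + slope * delta * (abs_r - delta)


# A smaller slope penalizes large residuals (e.g. rare, extreme labels) less steeply.
small = scaled_huber(0.5)
standard = scaled_huber(3.0, delta=1.0, slope=1.0)
softened = scaled_huber(3.0, delta=1.0, slope=0.5)
print(small, standard, softened)
```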


Plot2API: Recommending Graphic API from Plot via Semantic Parsing Guided Neural Network

arXiv.org Artificial Intelligence

Plot-based Graphic API recommendation (Plot2API) is an unstudied but meaningful problem with several important applications in software engineering and data visualization, such as plotting guidance for beginners, graphic API correlation analysis, and code conversion for plotting. Plot2API is a very challenging task, since each plot is often associated with multiple APIs, and the appearance of graphics drawn by the same API can vary widely with different parameter settings. Additionally, the samples of different APIs are extremely imbalanced. Given the lack of techniques for Plot2API, we present a novel deep multi-task learning approach named Semantic Parsing Guided Neural Network (SPGNN), which casts the Plot2API problem as a multi-label image classification task and an image semantic parsing task. In SPGNN, the recent Convolutional Neural Network (CNN) EfficientNet is employed as the backbone network for API recommendation. Meanwhile, a semantic parsing module is added to exploit semantically relevant visual information in feature learning and to suppress appearance-relevant visual information that may confuse visual-information-based API recommendation. Moreover, the recent data augmentation technique random erasing is applied to alleviate the imbalance of API categories. We collect plots, together with the graphic APIs used to draw them, from Stack Overflow, and release three new Plot2API datasets covering the graphic APIs of the R and Python programming languages for evaluating Plot2API techniques. Extensive experimental results not only demonstrate the superiority of our method over recent deep learning baselines but also show its practicality for recommending graphic APIs.
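Random erasing, the augmentation mentioned in this abstract, blanks out a randomly placed rectangle in a training image. The sketch below is a simplified pure-Python version under stated assumptions: the image is a 2D list of floats, the rectangle's side lengths are drawn independently as fractions of the image size, and the region is filled with a constant. The original technique samples an area and aspect ratio and may fill with random values, and the parameter names here are hypothetical.

```python
import random
from typing import List


def random_erase(
    image: List[List[float]],
    rng: random.Random,
    min_frac: float = 0.1,
    max_frac: float = 0.4,
    fill: float = 0.0,
) -> List[List[float]]:
    """Return a copy of `image` with one random rectangle overwritten by `fill`."""
    height, width = len(image), len(image[0])
    # Draw the rectangle size as a fraction of each image dimension (at least 1 pixel).
    erase_h = max(1, int(height * rng.uniform(min_frac, max_frac)))
    erase_w = max(1, int(width * rng.uniform(min_frac, max_frac)))
    # Pick a top-left corner so the rectangle lies fully inside the image.
    top = rng.randint(0, height - erase_h)
    left = rng.randint(0, width - erase_w)
    erased = [row[:] for row in image]  # copy rows so the input is left untouched
    for r in range(top, top + erase_h):
        for c in range(left, left + erase_w):
            erased[r][c] = fill
    return erased


image = [[1.0] * 10 for _ in range(10)]
augmented = random_erase(image, random.Random(0))
erased_pixels = sum(row.count(0.0) for row in augmented)
print(erased_pixels)
```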