Stack Trace-Based Crash Deduplication with Transformer Adaptation

Mamun, Md Afif Al, Uddin, Gias, Xia, Lan, Zhang, Longyu

arXiv.org Artificial Intelligence 

Automated crash reporting systems generate large volumes of duplicate reports, overwhelming issue-tracking systems and increasing developer workload. Traditional stack trace-based deduplication methods, which rely on string similarity, rule-based heuristics, or deep learning (DL) models, often fail to capture the contextual and structural relationships within stack traces. We propose dedupT, a transformer-based approach that models stack traces holistically rather than as isolated frames. Extensive experiments on real-world datasets show that dedupT outperforms existing DL and traditional methods (e.g., sequence alignment and information retrieval techniques) in both duplicate ranking and unique crash detection, significantly reducing manual triage effort. On four public datasets, dedupT often improves Mean Reciprocal Rank (MRR) by over 15% compared to the best DL baseline and by up to 9% over traditional methods, while achieving higher Receiver Operating Characteristic Area Under the Curve (ROC-AUC) in detecting unique crash reports. Our work advances the integration of modern natural language processing (NLP) techniques into software engineering, providing an effective solution for stack trace-based crash deduplication.

Software issues are generally reported through (1) human-submitted reports and (2) automated crash reports. Human-reported issues typically include textual descriptions detailing the issue and its expected and observed behavior, and may include attachments such as images or videos. In contrast, automated crash reports are generated by crash reporting tools (e.g., Sentry). However, these automated systems often overwhelm issue-tracking system (ITS) platforms by generating numerous duplicate crash reports for the same issue, requiring developers to manually review and triage them, which is a time-consuming process.
For instance, Mozilla Firefox received 2.2 million issues in the first week of 2016, the majority being duplicates [1], while 72% of crash reports in the IntelliJ Platform were found to be duplicates [2]. In such scenarios, grouping similar crashes together is essential, a process known as crash deduplication. Unlike human-written reports with detailed descriptions, automated crash reports primarily contain technical data like stack traces and crash dumps.

Figure 1: Example of a Java stack trace.

Figure 1: Example of C++ stack trace.
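To make the idea of crash deduplication concrete, the following is a minimal sketch of a naive string-similarity baseline, the kind of traditional method the abstract contrasts with. It is not dedupT. The function names, the 0.7 threshold, the greedy grouping strategy, and the sample frames are all assumptions chosen for illustration; similarity is computed with Python's standard `difflib.SequenceMatcher` over lists of frame names.

```python
from difflib import SequenceMatcher


def trace_similarity(a, b):
    """Similarity in [0, 1] between two stack traces given as lists of frame names."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def deduplicate(traces, threshold=0.7):
    """Greedily assign each trace to its most similar existing group.

    A trace whose best similarity falls below `threshold` (an assumed value)
    is treated as a unique crash and starts a new group.
    Returns one group id per input trace.
    """
    representatives = []  # the first trace seen in each group
    assignments = []
    for trace in traces:
        scores = [trace_similarity(trace, rep) for rep in representatives]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            assignments.append(best)
        else:
            representatives.append(trace)
            assignments.append(len(representatives) - 1)
    return assignments


# Hypothetical traces, innermost frame first.
traces = [
    ["java.util.ArrayList.get", "app.Cart.total", "app.Checkout.run", "app.Main.main"],
    ["java.util.ArrayList.get", "app.Cart.total", "app.Checkout.retry", "app.Main.main"],
    ["java.io.FileInputStream.open", "app.Config.load", "app.Main.main"],
]
groups = deduplicate(traces)
print(groups)
```

In this sketch the first two traces share three of four frames and land in the same group, while the third starts a new group as a unique crash. A baseline like this compares frames only as exact strings, which illustrates the limitation the abstract attributes to traditional methods: it has no notion of the contextual or structural relationships between frames.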