Wang, Tingting
Weighted Graph Structure Learning with Attention Denoising for Node Classification
Wang, Tingting, Su, Jiaxin, Liu, Haobing, Jiang, Ruobing
--The node classification in graphs aims to predict the categories of unlabeled nodes utilizing a small set of labeled nodes. However, weighted graphs often contain noisy edges and anomalous edge weights, which can distort fine-grained relationships between nodes and hinder accurate classification. We propose the Edge Weight-aware Graph Structure Learning (EWGSL) method, which combines weight learning and graph structure learning to address these issues. EWGSL improves node classification by redefining attention coefficients in graph attention networks to incorporate node features and edge weights. It also applies graph structure learning to sparsify attention coefficients and uses a modified InfoNCE loss function to enhance performance by adapting to denoised graph weights. Extensive experimental results show that EWGSL has an average Micro-F1 improvement of 17.8 % compared to the best baseline.
Incorporating Higher-order Structural Information for Graph Clustering
Li, Qiankun, Liu, Haobing, Jiang, Ruobing, Wang, Tingting
Clustering holds profound significance in data mining. In recent years, graph convolutional network (GCN) has emerged as a powerful tool for deep clustering, integrating both graph structural information and node attributes. However, most existing methods ignore the higher-order structural information of the graph. Evidently, nodes within the same cluster can establish distant connections. Besides, recent deep clustering methods usually apply a self-supervised module to monitor the training process of their model, focusing solely on node attributes without paying attention to graph structure. In this paper, we propose a novel graph clustering network to make full use of graph structural information. To capture the higher-order structural information, we design a graph mutual infomax module, effectively maximizing mutual information between graph-level and node-level representations, and design a trinary self-supervised module that includes modularity as a structural constraint. Our proposed model outperforms many state-of-the-art methods on various datasets, demonstrating its superiority.
KGroot: Enhancing Root Cause Analysis through Knowledge Graphs and Graph Convolutional Neural Networks
Wang, Tingting, Qi, Guilin, Wu, Tianxing
Fault localization is challenging in online micro-service due to the wide variety of monitoring data volume, types, events and complex interdependencies in service and components. Faults events in services are propagative and can trigger a cascade of alerts in a short period of time. In the industry, fault localization is typically conducted manually by experienced personnel. This reliance on experience is unreliable and lacks automation. Different modules present information barriers during manual localization, making it difficult to quickly align during urgent faults. This inefficiency lags stability assurance to minimize fault detection and repair time. Though actionable methods aimed to automatic the process, the accuracy and efficiency are less than satisfactory. The precision of fault localization results is of paramount importance as it underpins engineers trust in the diagnostic conclusions, which are derived from multiple perspectives and offer comprehensive insights. Therefore, a more reliable method is required to automatically identify the associative relationships among fault events and propagation path. To achieve this, KGroot uses event knowledge and the correlation between events to perform root cause reasoning by integrating knowledge graphs and GCNs for RCA. FEKG is built based on historical data, an online graph is constructed in real-time when a failure event occurs, and the similarity between each knowledge graph and online graph is compared using GCNs to pinpoint the fault type through a ranking strategy. Comprehensive experiments demonstrate KGroot can locate the root cause with accuracy of 93.5% top 3 potential causes in second-level. This performance matches the level of real-time fault diagnosis in the industrial environment and significantly surpasses state-of-the-art baselines in RCA in terms of effectiveness and efficiency.
A survey on natural language processing (nlp) and applications in insurance
Ly, Antoine, Uthayasooriyar, Benno, Wang, Tingting
Text is the most widely used means of communication today. This data is abundant but nevertheless complex to exploit within algorithms. For years, scientists have been trying to implement different techniques that enable computers to replicate some mechanisms of human reading. During the past five years, research disrupted the capacity of the algorithms to unleash the value of text data. It brings today, many opportunities for the insurance industry.Understanding those methods and, above all, knowing how to apply them is a major challenge and key to unleash the value of text data that have been stored for many years. Processing language with computer brings many new opportunities especially in the insurance sector where reports are central in the information used by insurers. SCOR's Data Analytics team has been working on the implementation of innovative tools or products that enable the use of the latest research on text analysis. Understanding text mining techniques in insurance enhances the monitoring of the underwritten risks and many processes that finally benefit policyholders.This article proposes to explain opportunities that Natural Language Processing (NLP) are providing to insurance. It details different methods used today in practice traces back the story of them. We also illustrate the implementation of certain methods using open source libraries and python codes that we have developed to facilitate the use of these techniques.After giving a general overview on the evolution of text mining during the past few years,we share about how to conduct a full study with text mining and share some examples to serve those models into insurance products or services. Finally, we explained in more details every step that composes a Natural Language Processing study to ensure the reader can have a deep understanding on the implementation.