

 dgraph






DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection

Huang, Xuanwen, Yang, Yang, Wang, Yang, Wang, Chunping, Zhang, Zhisheng, Xu, Jiarong, Chen, Lei, Vazirgiannis, Michalis

arXiv.org Artificial Intelligence

Graph Anomaly Detection (GAD) has recently become an active research topic because of its practical and theoretical value. Because GAD is application-driven and anomalous samples are rare, enriching the variety of its datasets is fundamental. This paper therefore presents DGraph, a real-world dynamic graph in the finance domain. DGraph overcomes many limitations of current GAD datasets. It contains about 3M nodes, 4M dynamic edges, and 1M ground-truth nodes. We provide a comprehensive observation of DGraph, revealing that anomalous and normal nodes generally differ in structure, neighbor distribution, and temporal dynamics. The observation also suggests that the 2M background nodes are essential for detecting fraudsters. Furthermore, we conduct extensive experiments on DGraph. The observations and experiments demonstrate that DGraph can advance GAD research and enable in-depth exploration of anomalous nodes.


Fraudulent User Detection Via Behavior Information Aggregation Network (BIAN) On Large-Scale Financial Social Network

Hu, Hanyi, Zhang, Long, Li, Shuan, Liu, Zhi, Yang, Yao, Na, Chongning

arXiv.org Artificial Intelligence

Financial fraud causes billions in losses annually, yet efficient approaches that detect fraud by considering user profiles and user behaviors simultaneously in a social network are lacking. A social network forms a graph structure, and graph neural networks (GNNs), a promising research domain in deep learning, can seamlessly process non-Euclidean graph data. In financial fraud detection, the modus operandi of criminals can be identified by analyzing user profiles and behaviors such as transactions and loans, as well as social connectivity. Currently, most GNNs cannot select important neighbors because the neighbors' edge attributes (i.e., behaviors) are ignored. In this paper, we propose a novel behavior information aggregation network (BIAN) that combines user behaviors with other user features. Unlike its close "relatives" such as Graph Attention Networks (GAT) and Graph Transformer Networks (GTN), it aggregates neighbors based on the distribution of neighboring edge attributes, namely user behaviors in a financial social network. Experimental results on a real-world large-scale financial social network dataset, DGraph, show that BIAN obtains a 10.2% gain in AUROC compared with state-of-the-art models.
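
For readers unfamiliar with the metric reported above, AUROC can be read as the probability that a randomly chosen positive (fraudulent) node receives a higher score than a randomly chosen negative one. The sketch below is a minimal pairwise implementation under that definition; the function name `auroc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    This O(P * N) form is for illustration only; it is not meant for graphs
    with millions of nodes.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy example: two normal users (0) and two fraudsters (1).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly -> 0.75
```

Because AUROC depends only on the ranking of scores, it is insensitive to the decision threshold, which is one reason it is commonly reported for heavily imbalanced tasks such as fraud detection.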


Auto-Model: Utilizing Research Papers and HPO Techniques to Deal with the CASH problem

Wang, Chunnan, Wang, Hongzhi, Mu, Tianyu, Li, Jianzhong, Gao, Hong

arXiv.org Artificial Intelligence

Chunnan Wang, Hongzhi Wang, Tianyu Mu, Jianzhong Li, Hong Gao. Department of Computer Science, Harbin Institute of Technology, Harbin, China. {WangChunnan, wangzh, mutianyu, lijzh, honggao}@hit.edu.cn

Abstract: In many fields, a large number of algorithms with completely different hyperparameters have been developed to address the same type of problem. Choosing the algorithm and hyperparameter setting correctly can greatly improve overall performance, but users often fail to do so because they lack the necessary knowledge. How to help users effectively and quickly select a suitable algorithm and hyperparameter setting for a given task instance is an important research topic, known as the CASH problem. In this paper, we design the Auto-Model approach, which makes full use of known information in related research papers and introduces hyperparameter optimization techniques to solve the CASH problem effectively. Auto-Model greatly reduces the cost of algorithm implementation and the size of the hyperparameter configuration space, and is thus capable of dealing with the CASH problem efficiently and easily. To demonstrate the benefit of Auto-Model, we compare it with the classical Auto-Weka approach. The experimental results show that our proposed approach provides superior results and achieves better performance in a short time.

Index Terms: Algorithm selection, Hyperparameter optimization, Combined algorithm selection and hyperparameter optimization problem, Auto-Weka, Classification algorithms

I. INTRODUCTION

In many fields, such as machine learning, data mining, artificial intelligence and constraint satisfaction, a variety of algorithms and heuristics have been developed to address the same type of problem [1], [2].
Each of these algorithms has its own advantages and disadvantages, and they are often complementary in the sense that one algorithm works well when others fail and vice versa [2]. If we can select the algorithm and hyperparameter setting best suited to the task instance, any particular task instance will be solved well, and our ability to deal with the problem will improve considerably [3]. However, achieving this goal is not trivial. Many powerful and different algorithms exist for a given problem, and these algorithms have completely different hyperparameters, which strongly affect their performance. Even domain experts cannot easily and correctly select the appropriate algorithm with its optimal hyperparameters from such a huge and complex choice space.
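
To make the shape of the CASH problem concrete, the sketch below treats it as a joint search over pairs of (algorithm, hyperparameter setting) scored on validation data. It uses plain random search as a stand-in baseline; it is not the Auto-Model method, which the abstract describes as drawing on research papers and HPO techniques. The toy data, the two classifiers, and all names (`SEARCH_SPACE`, `random_search_cash`, and so on) are hypothetical and chosen only for illustration.

```python
import random


def make_data(rng, n):
    """Toy 1-D binary task (assumed for illustration): label is 1 when x > 0.6."""
    return [(x, int(x > 0.6)) for x in (rng.random() for _ in range(n))]


def threshold_classifier(train, params):
    """Algorithm A: predict 1 above a fixed cut-off (ignores the training data)."""
    cutoff = params["threshold"]
    return lambda x: int(x > cutoff)


def knn_classifier(train, params):
    """Algorithm B: majority vote among the k nearest training points."""
    k = params["k"]

    def predict(x):
        nearest = sorted(train, key=lambda xy: abs(xy[0] - x))[:k]
        votes = sum(y for _, y in nearest)
        return int(2 * votes > k)

    return predict


# Each algorithm has its own, structurally different hyperparameter space.
SEARCH_SPACE = {
    "threshold": (threshold_classifier,
                  lambda rng: {"threshold": rng.uniform(0.0, 1.0)}),
    "knn": (knn_classifier,
            lambda rng: {"k": rng.choice([1, 3, 5, 7])}),
}


def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)


def random_search_cash(train, valid, n_trials, rng):
    """Sample (algorithm, hyperparameters) jointly and keep the best on validation."""
    best = None
    for _ in range(n_trials):
        name = rng.choice(sorted(SEARCH_SPACE))
        build, sample_params = SEARCH_SPACE[name]
        params = sample_params(rng)
        score = accuracy(build(train, params), valid)
        if best is None or score > best[0]:
            best = (score, name, params)
    return best


rng = random.Random(0)
train, valid = make_data(rng, 200), make_data(rng, 100)
best_score, best_algo, best_params = random_search_cash(train, valid, 30, rng)
print(best_algo, best_params, round(best_score, 3))
```

The point of the sketch is the structure of the search space: the hyperparameters that exist depend on which algorithm is chosen, so the two choices cannot be optimized independently. Random search ignores any prior knowledge about which algorithms suit a task, which is the kind of cost that approaches such as Auto-Model aim to reduce.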