ego network
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (4 more...)
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > California > Alameda County > Berkeley (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (4 more...)
Deal: Distributed End-to-End GNN Inference for All Nodes
Chen, Shiyang, Song, Xiang, Theodore, Vasiloudis, Liu, Hang
Graph Neural Networks (GNNs) are a new research frontier with various applications and successes. The end-to-end inference for all nodes, is common for GNN embedding models, which are widely adopted in applications like recommendation and advertising. While sharing opportunities arise in GNN tasks (i.e., inference for a few nodes and training), the potential for sharing in full graph end-to-end inference is largely underutilized because traditional efforts fail to fully extract sharing benefits due to overwhelming overheads or excessive memory usage. This paper introduces Deal, a distributed GNN inference system that is dedicated to end-to-end inference for all nodes for graphs with multi-billion edges. First, we unveil and exploit an untapped sharing opportunity during sampling, and maximize the benefits from sharing during subsequent GNN computation. Second, we introduce memory-saving and communication-efficient distributed primitives for lightweight 1-D graph and feature tensor collaborative partitioning-based distributed inference. Third, we introduce partitioned, pipelined communication and fusing feature preparation with the first GNN primitive for end-to-end inference. With Deal, the end-to-end inference time on real-world benchmark datasets is reduced up to 7.70 x and the graph construction time is reduced up to 21.05 x, compared to the state-of-the-art.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Europe > Germany > Bavaria > Regensburg (0.04)
- Africa > Zambia > Southern Province > Choma (0.04)
- Research Report (0.64)
- Workflow (0.46)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
\textsc{Perseus}: Tracing the Masterminds Behind Cryptocurrency Pump-and-Dump Schemes
Fu, Honglin, Feng, Yebo, Wu, Cong, Xu, Jiahua
Masterminds are entities organizing, coordinating, and orchestrating cryptocurrency pump-and-dump schemes, a form of trade-based manipulation undermining market integrity and causing financial losses for unwitting investors. Previous research detects pump-and-dump activities in the market, predicts the target cryptocurrency, and examines investors and \ac{osn} entities. However, these solutions do not address the root cause of the problem. There is a critical gap in identifying and tracing the masterminds involved in these schemes. In this research, we develop a detection system \textsc{Perseus}, which collects real-time data from the \acs{osn} and cryptocurrency markets. \textsc{Perseus} then constructs temporal attributed graphs that preserve the direction of information diffusion and the structure of the community while leveraging \ac{gnn} to identify the masterminds behind pump-and-dump activities. Our design of \textsc{Perseus} leads to higher F1 scores and precision than the \ac{sota} fraud detection method, achieving fast training and inferring speeds. Deployed in the real world from February 16 to October 9 2024, \textsc{Perseus} successfully detects $438$ masterminds who are efficient in the pump-and-dump information diffusion networks. \textsc{Perseus} provides regulators with an explanation of the risks of masterminds and oversight capabilities to mitigate the pump-and-dump schemes of cryptocurrency.
- Europe (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
Learning Exposure Mapping Functions for Inferring Heterogeneous Peer Effects
Adhikari, Shishir, Medya, Sourav, Zheleva, Elena
In causal inference, interference refers to the phenomenon in which the actions of peers in a network can influence an individual's outcome. Peer effect refers to the difference in counterfactual outcomes of an individual for different levels of peer exposure, the extent to which an individual is exposed to the treatments, actions, or behaviors of peers. Estimating peer effects requires deciding how to represent peer exposure. Typically, researchers define an exposure mapping function that aggregates peer treatments and outputs peer exposure. Most existing approaches for defining exposure mapping functions assume peer exposure based on the number or fraction of treated peers. Recent studies have investigated more complex functions of peer exposure which capture that different peers can exert different degrees of influence. However, none of these works have explicitly considered the problem of automatically learning the exposure mapping function. In this work, we focus on learning this function for the purpose of estimating heterogeneous peer effects, where heterogeneity refers to the variation in counterfactual outcomes for the same peer exposure but different individual's contexts. We develop EgoNetGNN, a graph neural network (GNN)-based method, to automatically learn the appropriate exposure mapping function allowing for complex peer influence mechanisms that, in addition to peer treatments, can involve the local neighborhood structure and edge attributes. We show that GNN models that use peer exposure based on the number or fraction of treated peers or learn peer exposure naively face difficulty accounting for such influence mechanisms. Our comprehensive evaluation on synthetic and semi-synthetic network data shows that our method is more robust to different unknown underlying influence mechanisms when estimating heterogeneous peer effects when compared to state-of-the-art baselines.
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Jordan (0.04)
Preordering: A hybrid of correlation clustering and partial ordering
Irmai, Jannik, Moeller, Maximilian, Andres, Bjoern
Motivation stems from the problem of estimating a binary relation on the accoun ts of a social network where the notions that one account follows, likes or reacts to another account need neither be symmetric nor asymmetric. In particular, i following j does not need to imply that j follows i, nor does it necessarily imply that j does not follow i . Clustering alone does not capture asymmetric subsets of the relation on the social network because the equivalence relation that characterizes the clustering is symmetric. Similarly, partial ordering alone does not capture symm etric subsets of the relation on the social network because partial orders are antisymmetric. Like in clustering and partial ordering, we work with the assumption t hat the relation on the social network is transitive, i.e., if i follows j and j follows k then i follows k . This assumption simplifies reality and we quantify the deviation empirically. Unlike in clusterin g and partial ordering, we do not assume symmetry or antisymmetry. To model our assump tion we introduce algorithms that output a preorder on a set, i.e., a binary relation on the set that is reflexive and transitiv e. More specifically, we introduce algorithms for the preordering problem: Definition 1. (Wakabayashi, 1998) Given a finite set V and a value c
- North America > United States (0.46)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- Europe > Germany > Saxony > Dresden (0.04)
Untangle the KNOT: Interweaving Conflicting Knowledge and Reasoning Skills in Large Language Models
Liu, Yantao, Yao, Zijun, Lv, Xin, Fan, Yuchen, Cao, Shulin, Yu, Jifan, Hou, Lei, Li, Juanzi
Providing knowledge documents for large language models (LLMs) has emerged as a promising solution to update the static knowledge inherent in their parameters. However, knowledge in the document may conflict with the memory of LLMs due to outdated or incorrect knowledge in the LLMs' parameters. This leads to the necessity of examining the capability of LLMs to assimilate supplemental external knowledge that conflicts with their memory. While previous studies have explained to what extent LLMs extract conflicting knowledge from the provided text, they neglect the necessity to reason with conflicting knowledge. Furthermore, there lack a detailed analysis on strategies to enable LLMs to resolve conflicting knowledge via prompting, decoding strategy, and supervised fine-tuning. To address these limitations, we construct a new dataset, dubbed KNOT, for knowledge conflict resolution examination in the form of question answering. KNOT facilitates in-depth analysis by dividing reasoning with conflicting knowledge into three levels: (1) Direct Extraction, which directly extracts conflicting knowledge to answer questions. (2) Explicit Reasoning, which reasons with conflicting knowledge when the reasoning path is explicitly provided in the question. (3) Implicit Reasoning, where reasoning with conflicting knowledge requires LLMs to infer the reasoning path independently to answer questions. We also conduct extensive experiments on KNOT to establish empirical guidelines for LLMs to utilize conflicting knowledge in complex circumstances. Dataset and associated codes can be accessed at https://github.com/THU-KEG/KNOT .
- Europe > United Kingdom > England > Greater London > London > Waltham Forest (0.04)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Nevada (0.04)
- (5 more...)
Lumos: Heterogeneity-aware Federated Graph Learning over Decentralized Devices
Pan, Qiying, Zhu, Yifei, Chu, Lingyang
Graph neural networks (GNN) have been widely deployed in real-world networked applications and systems due to their capability to handle graph-structured data. However, the growing awareness of data privacy severely challenges the traditional centralized model training paradigm, where a server holds all the graph information. Federated learning is an emerging collaborative computing paradigm that allows model training without data centralization. Existing federated GNN studies mainly focus on systems where clients hold distinctive graphs or sub-graphs. The practical node-level federated situation, where each client is only aware of its direct neighbors, has yet to be studied. In this paper, we propose the first federated GNN framework called Lumos that supports supervised and unsupervised learning with feature and degree protection on node-level federated graphs. We first design a tree constructor to improve the representation capability given the limited structural information. We further present a Monte Carlo Markov Chain-based algorithm to mitigate the workload imbalance caused by degree heterogeneity with theoretically-guaranteed performance. Based on the constructed tree for each client, a decentralized tree-based GNN trainer is proposed to support versatile training. Extensive experiments demonstrate that Lumos outperforms the baseline with significantly higher accuracy and greatly reduced communication cost and training time.
- North America > Canada > Ontario > Hamilton (0.14)
- Asia > China > Shanghai > Shanghai (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California (0.04)
- Research Report (1.00)
- Overview (0.86)
Adversarial Camouflage for Node Injection Attack on Graphs
Tao, Shuchang, Cao, Qi, Shen, Huawei, Wu, Yunfan, Hou, Liang, Sun, Fei, Cheng, Xueqi
Node injection attacks on Graph Neural Networks (GNNs) have received increasing attention recently, due to their ability to degrade GNN performance with high attack success rates. However, our study indicates that these attacks often fail in practical scenarios, since defense/detection methods can easily identify and remove the injected nodes. To address this, we devote to camouflage node injection attack, making injected nodes appear normal and imperceptible to defense/detection methods. Unfortunately, the non-Euclidean structure of graph data and the lack of intuitive prior present great challenges to the formalization, implementation, and evaluation of camouflage. In this paper, we first propose and define camouflage as distribution similarity between ego networks of injected nodes and normal nodes. Then for implementation, we propose an adversarial CAmouflage framework for Node injection Attack, namely CANA, to improve attack performance under defense/detection methods in practical scenarios. A novel camouflage metric is further designed under the guide of distribution similarity. Extensive experiments demonstrate that CANA can significantly improve the attack performance under defense/detection methods with higher camouflage or imperceptibility. This work urges us to raise awareness of the security vulnerabilities of GNNs in practical applications.
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Florida > Miami-Dade County > Coral Gables (0.04)
On the Validity of Conformal Prediction for Network Data Under Non-Uniform Sampling
We study the properties of conformal prediction for network data under various sampling mechanisms that commonly arise in practice but often result in a non-representative sample of nodes. We interpret these sampling mechanisms as selection rules applied to a superpopulation and study the validity of conformal prediction conditional on an appropriate selection event. We show that the sampled subarray is exchangeable conditional on the selection event if the selection rule satisfies a permutation invariance property and a joint exchangeability condition holds for the superpopulation. Our result implies the finite-sample validity of conformal prediction for certain selection events related to ego networks and snowball sampling. We also show that when data are sampled via a random walk on a graph, a variant of weighted conformal prediction yields asymptotically valid prediction sets for an independently selected node from the population.
- North America > United States > New York (0.04)
- Asia > Singapore (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
- Telecommunications > Networks (0.60)
- Information Technology > Networks (0.60)