AITopics | Zeng, Hanqing

Collaborating Authors

Zeng, Hanqing

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Language Models are Graph Learners

Xu, Zhe, Hassani, Kaveh, Zhang, Si, Zeng, Hanqing, Yasunaga, Michihiro, Wang, Limei, Fu, Dongqi, Yao, Ning, Long, Bo, Tong, Hanghang

arXiv.org Artificial IntelligenceOct-3-2024

Language Models (LMs) are increasingly challenging the dominance of domain-specific models, including Graph Neural Networks (GNNs) and Graph Transformers (GTs), in graph learning tasks. Following this trend, we propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art GNNs on node classification tasks, without requiring any architectural modification. By preserving the LM's original architecture, our approach retains a key benefit of LM instruction tuning: the ability to jointly train on diverse datasets, fostering greater flexibility and efficiency. To achieve this, we introduce two key augmentation strategies: (1) Enriching LMs' input using topological and semantic retrieval methods, which provide richer contextual information, and (2) guiding the LMs' classification process through a lightweight GNN classifier that effectively prunes class candidates. Our experiments on real-world datasets show that backbone Flan-T5 models equipped with these augmentation strategies outperform state-of-the-art text-output node classifiers and are comparable to top-performing vector-output node classifiers. By bridging the gap between specialized task-specific node classifiers and general LMs, this work paves the way for more versatile and widely applicable graph learning models. We will open-source the code upon publication.

large language model, machine learning, node, (16 more...)

arXiv.org Artificial Intelligence

2410.02296

Country:

Europe (1.00)
North America > United States > California (0.46)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.46)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)

Add feedback

TASER: Temporal Adaptive Sampling for Fast and Accurate Dynamic Graph Representation Learning

Deng, Gangda, Zhou, Hongkuan, Zeng, Hanqing, Xia, Yinglong, Leung, Christopher, Li, Jianbo, Kannan, Rajgopal, Prasanna, Viktor

arXiv.org Artificial IntelligenceFeb-7-2024

Recently, Temporal Graph Neural Networks (TGNNs) have demonstrated state-of-the-art performance in various high-impact applications, including fraud detection and content recommendation. Despite the success of TGNNs, they are prone to the prevalent noise found in real-world dynamic graphs like time-deprecated links and skewed interaction distribution. The noise causes two critical issues that significantly compromise the accuracy of TGNNs: (1) models are supervised by inferior interactions, and (2) noisy input induces high variance in the aggregated messages. However, current TGNN denoising techniques do not consider the diverse and dynamic noise pattern of each node. In addition, they also suffer from the excessive mini-batch generation overheads caused by traversing more neighbors. We believe the remedy for fast and accurate TGNNs lies in temporal adaptive sampling. In this work, we propose TASER, the first adaptive sampling method for TGNNs optimized for accuracy, efficiency, and scalability. TASER adapts its mini-batch selection based on training dynamics and temporal neighbor selection based on the contextual, structural, and temporal properties of past interactions. To alleviate the bottleneck in mini-batch generation, TASER implements a pure GPU-based temporal neighbor finder and a dedicated GPU feature cache. We evaluate the performance of TASER using two state-of-the-art backbone TGNNs. On five popular datasets, TASER outperforms the corresponding baselines by an average of 2.3% in Mean Reciprocal Rank (MRR) while achieving an average of 5.1x speedup in training time.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2402.05396

Country: North America > United States > California (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science > Data Mining (0.94)

Add feedback

Mixture of Weak & Strong Experts on Graphs

Zeng, Hanqing, Lyu, Hanjia, Hu, Diyi, Xia, Yinglong, Luo, Jiebo

arXiv.org Artificial IntelligenceNov-9-2023

Realistic graphs contain both rich self-features of nodes and informative structures of neighborhoods, jointly handled by a GNN in the typical setup. We propose to decouple the two modalities by mixture of weak and strong experts (Mowst), where the weak expert is a light-weight Multi-layer Perceptron (MLP), and the strong expert is an off-the-shelf Graph Neural Network (GNN). To adapt the experts' collaboration to different target nodes, we propose a "confidence" mechanism based on the dispersion of the weak expert's prediction logits. The strong expert is conditionally activated when either the node's classification relies on neighborhood information, or the weak expert has low model quality. We reveal interesting training dynamics by analyzing the influence of the confidence function on loss: our training algorithm encourages the specialization of each expert by effectively generating soft splitting of the graph. In addition, our "confidence" design imposes a desirable bias toward the strong expert to benefit from GNN's better generalization capability. Mowst is easy to optimize and achieves strong expressive power, with a computation cost comparable to a single GNN. Empirically, Mowst shows significant accuracy improvement on 6 standard node classification benchmarks (including both homophilous and heterophilous graphs).

artificial intelligence, machine learning, mowst, (18 more...)

arXiv.org Artificial Intelligence

2311.05185

Country: North America > United States > California (0.14)

Genre:

Research Report (0.50)
Overview (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Add feedback

On the Equivalence of Graph Convolution and Mixup

Han, Xiaotian, Zeng, Hanqing, Chen, Yu, Nie, Shaoliang, Liu, Jingzhou, Narang, Kanika, Shakeri, Zahra, Sankararaman, Karthik Abinav, Jiang, Song, Khabsa, Madian, Wang, Qifan, Hu, Xia

arXiv.org Artificial IntelligenceSep-29-2023

This paper investigates the relationship between graph convolution and Mixup techniques. Graph convolution in a graph neural network involves aggregating features from neighboring samples to learn representative features for a specific node or sample. On the other hand, Mixup is a data augmentation technique that generates new examples by averaging features and one-hot labels from multiple samples. One commonality between these techniques is their utilization of information from multiple samples to derive feature representation. This study aims to explore whether a connection exists between these two approaches. Our investigation reveals that, under two mild conditions, graph convolution can be viewed as a specialized form of Mixup that is applied during both the training and testing phases. The two conditions are: 1) \textit{Homophily Relabel} - assigning the target node's label to all its neighbors, and 2) \textit{Test-Time Mixup} - Mixup the feature during the test time. We establish this equivalence mathematically by demonstrating that graph convolution networks (GCN) and simplified graph convolution (SGC) can be expressed as a form of Mixup. We also empirically verify the equivalence by training an MLP using the two conditions to achieve comparable performance.

artificial intelligence, equivalence, graph convolution and mixup

arXiv.org Artificial Intelligence

2310.00183

Genre: Research Report (0.89)

Technology: Information Technology > Artificial Intelligence (0.53)

Add feedback

LLM-Rec: Personalized Recommendation via Prompting Large Language Models

Lyu, Hanjia, Jiang, Song, Zeng, Hanqing, Wang, Qifan, Zhang, Si, Chen, Ren, Leung, Chris, Tang, Jiajie, Xia, Yinglong, Luo, Jiebo

arXiv.org Artificial IntelligenceAug-16-2023

We investigate various prompting strategies for enhancing personalized recommendation performance with large language models (LLMs) through input augmentation. Our proposed approach, termed LLM-Rec, encompasses four distinct prompting strategies: (1) basic prompting, (2) recommendation-driven prompting, (3) engagement-guided prompting, and (4) recommendation-driven + engagement-guided prompting. Our empirical experiments show that incorporating the augmented input text generated by LLM leads to improved recommendation performance. Recommendation-driven and engagement-guided prompting strategies are found to elicit LLM's understanding of global and local item characteristics. This finding highlights the importance of leveraging diverse prompts and input augmentation techniques to enhance the recommendation capabilities with LLMs.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2307.1578

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Decoupling the Depth and Scope of Graph Neural Networks

Zeng, Hanqing, Zhang, Muhan, Xia, Yinglong, Srivastava, Ajitesh, Malevich, Andrey, Kannan, Rajgopal, Prasanna, Viktor, Jin, Long, Chen, Ren

arXiv.org Artificial IntelligenceJan-19-2022

State-of-the-art Graph Neural Networks (GNNs) have limited scalability with respect to the graph and model sizes. On large graphs, increasing the model depth often means exponential expansion of the scope (i.e., receptive field). Beyond just a few layers, two fundamental challenges emerge: 1. degraded expressivity due to oversmoothing, and 2. expensive computation due to neighborhood explosion. We propose a design principle to decouple the depth and scope of GNNs -- to generate representation of a target entity (i.e., a node or an edge), we first extract a localized subgraph as the bounded-size scope, and then apply a GNN of arbitrary depth on top of the subgraph. A properly extracted subgraph consists of a small number of critical neighbors, while excluding irrelevant ones. The GNN, no matter how deep it is, smooths the local neighborhood into informative representation rather than oversmoothing the global graph into "white noise". Theoretically, decoupling improves the GNN expressive power from the perspectives of graph signal processing (GCN), function approximation (GraphSAGE) and topological learning (GIN). Empirically, on seven graphs (with up to 110M nodes) and six backbone GNN architectures, our design achieves significant accuracy improvement with orders of magnitude reduction in computation and hardware cost.

data mining, information technology services, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2201.07858

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

GraphSAINT: Graph Sampling Based Inductive Learning Method

Zeng, Hanqing, Zhou, Hongkuan, Srivastava, Ajitesh, Kannan, Rajgopal, Prasanna, Viktor

arXiv.org Machine LearningJul-10-2019

Graph Convolutional Networks (GCNs) are powerful models for learning representations of attributed graphs.To scale GCNs to large graphs, state-of-the-art methods use various layer sampling techniques to alleviate the "neighbor explosion" problem during minibatch training. Here we proposeGraphSAINT, a graph sampling based inductive learning method that improves training efficiency in a fundamentally different way. By a change of perspective, GraphSAINT constructs minibatches by sampling the training graph, rather than the nodes or edges across GCN layers. Each iteration, a complete GCN is built from the properly sampled subgraph. Thus, we ensure fixed number of well-connected nodes in all layers. We further propose normalization technique to eliminate bias, and sampling algorithms for variance reduction. Importantly, we can decouple the sampling process from the forward and backward propagation of training, and extend GraphSAINT with other graph samplers and GCN variants. Comparing with strong baselines using layer sampling, GraphSAINT demonstrates superior performance in both accuracy and training time on four large graphs.

deep learning, neural network, sampler, (22 more...)

arXiv.org Machine Learning

1907.04931

Country: North America > United States > California (0.29)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)

Add feedback

Accurate, Efficient and Scalable Graph Embedding

Zeng, Hanqing, Zhou, Hongkuan, Srivastava, Ajitesh, Kannan, Rajgopal, Prasanna, Viktor

arXiv.org Machine LearningOct-30-2018

The Graph Convolutional Network (GCN) model and its variants are powerful graph embedding tools for facilitating classification and clustering on graphs. However, a major challenge is to reduce the complexity of layered GCNs and make them parallelizable and scalable on very large graphs --- state-of the art techniques are unable to achieve scalability without losing accuracy and efficiency. In this paper, we propose novel parallelization techniques for graph sampling-based GCNs that achieve superior scalable performance on very large graphs without compromising accuracy. Specifically, our GCN guarantees work-efficient training and produces order of magnitude savings in computation and communication. To scale GCN training on tightly-coupled shared memory systems, we develop parallelization strategies for the key steps in training: For the graph sampling step, we exploit parallelism within and across multiple sampling instances, and devise an efficient data structure for concurrent accesses that provides theoretical guarantee of near-linear speedup with number of processing units. For the feature propagation step within the sampled graph, we improve cache utilization and reduce DRAM communication by data partitioning. We prove that our partitioning strategy is a 2-approximation for minimizing the communication time compared to the optimal strategy. We demonstrate that our parallel graph embedding outperforms state-of-the-art methods in scalability (with respect to number of processors, graph size and GCN model size), efficiency and accuracy on several large datasets. On a 40-core Xeon platform, our parallel training achieves 64$\times$ speedup (with AVX) in the sampling step and 25$\times$ speedup in the feature propagation step, compared to the serial implementation, resulting in a net speedup of 21$\times$.

artificial intelligence, data mining, vertex, (19 more...)

arXiv.org Machine Learning

1810.11899

Country: North America > United States > California (0.46)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Software (0.66)

Add feedback