Improving Graph Attention Networks with Large Margin-based Constraints
Guangtao Wang, Rex Ying, Jing Huang, Jure Leskovec
Graph Attention Networks (GATs) are the state-of-the-art neural architecture for representation learning with graphs. GATs learn attention functions that assign weights to nodes so that different nodes have different influence in the feature aggregation steps. In practice, however, the induced attention functions are prone to over-fitting due to the increasing number of parameters and the lack of direct supervision on the attention weights. GATs also suffer from over-smoothing at the decision boundary between node classes. Here we propose a framework that addresses these weaknesses via margin-based constraints on attention during training. We first theoretically demonstrate the over-smoothing behavior of GATs, and then develop an approach that constrains the attention weights according to the class boundary and the feature aggregation pattern. Furthermore, to alleviate the over-fitting problem, we propose additional constraints based on the graph structure. Extensive experiments and ablation studies on common benchmark datasets demonstrate the effectiveness of our method, which yields significant improvements over previous state-of-the-art graph attention methods on all datasets.

Introduction

Many real-world applications involve graph data, such as social networks (Zhang and Chen 2018), chemical molecules (Gilmer et al. 2017), and recommender systems (Berg, Kipf, and Welling 2017). The complicated structure of these graphs has inspired new machine learning methods (Cai, Zheng, and Chang 2018; Wu et al. 2019b). Much recent attention and progress has focused on graph neural networks, which have been successfully applied to social network analysis (Battaglia et al. 2016), recommender systems (Ying et al. 2018), and machine reading comprehension (Tu et al. 2019; De Cao, Aziz, and Titov 2018). A novel architecture that brings the attention mechanism into Graph Neural Networks (GNNs), called Graph Attention Networks (GATs), was introduced by Veličković et al. (2017). GAT was motivated by attention mechanisms in natural language processing (Vaswani et al. 2017; Devlin et al. 2018).
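The excerpt describes GAT's attention-weighted feature aggregation and the idea of margin-based constraints on attention, but not their exact form. The following PyTorch sketch shows the standard single-head GAT attention computation (Veličković et al. 2017) together with a hypothetical hinge-style margin penalty that pushes each node's attention toward same-class neighbors; the penalty, and the names gat_attention and margin_penalty, are illustrative assumptions, not the authors' exact formulation.

import torch
import torch.nn.functional as F

def gat_attention(h, adj, W, a, negative_slope=0.2):
    # h: [N, F_in] node features; adj: [N, N] 0/1 adjacency with self-loops;
    # W: [F_in, F_out] shared linear map; a: [2 * F_out] attention vector.
    z = h @ W                                    # transformed features, [N, F_out]
    f_out = z.size(1)
    src = z @ a[:f_out]                          # a_1^T z_i, shape [N]
    dst = z @ a[f_out:]                          # a_2^T z_j, shape [N]
    # e_ij = LeakyReLU(a^T [z_i || z_j]); mask out non-edges before the softmax
    e = F.leaky_relu(src.unsqueeze(1) + dst.unsqueeze(0), negative_slope)
    e = e.masked_fill(adj == 0, float("-inf"))
    alpha = torch.softmax(e, dim=1)              # attention over each node's neighbors
    return alpha, alpha @ z                      # attention weights, aggregated features

def margin_penalty(alpha, labels, adj, margin=0.1):
    # Hypothetical constraint: each node's mean attention to same-class neighbors
    # should exceed its mean attention to different-class neighbors by `margin`.
    edge = adj > 0
    same = ((labels.unsqueeze(1) == labels.unsqueeze(0)) & edge).float()
    diff = ((labels.unsqueeze(1) != labels.unsqueeze(0)) & edge).float()
    same_mean = (alpha * same).sum(1) / same.sum(1).clamp(min=1)
    diff_mean = (alpha * diff).sum(1) / diff.sum(1).clamp(min=1)
    return F.relu(margin - (same_mean - diff_mean)).mean()  # hinge-style loss

torch.manual_seed(0)
N, F_in, F_out = 6, 8, 4
h = torch.randn(N, F_in)
adj = (torch.rand(N, N) > 0.5).float()
adj.fill_diagonal_(1.0)                          # self-loops: every node attends to itself
W = torch.randn(F_in, F_out, requires_grad=True)
a = torch.randn(2 * F_out, requires_grad=True)
labels = torch.randint(0, 2, (N,))
alpha, out = gat_attention(h, adj, W, a)
loss = margin_penalty(alpha, labels, adj)
loss.backward()                                  # penalty is differentiable w.r.t. W and a

In an actual training loop, such a penalty would typically be added to the supervised classification loss with a weighting coefficient, applied only where node labels are available.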