Goto

Collaborating Authors

 Fu, Guoji


Constrained Layout Generation with Factor Graphs

arXiv.org Artificial Intelligence

This paper addresses the challenge of object-centric layout generation under spatial constraints, seen in multiple domains including floorplan design process. The design process typically involves specifying a set of spatial constraints that include object attributes like size and inter-object relations such as relative positioning. Existing works, which typically represent objects as single nodes, lack the granularity to accurately model complex interactions between objects. F or instance, often only certain parts of an object, like a room's right wall, interact with adjacent objects. T o address this gap, we introduce a factor graph based approach with four latent variable nodes for each room, and a factor node for each constraint. The factor nodes represent dependencies among the variables to which they are connected, effectively capturing constraints that are potentially of a higher order . W e then develop message-passing on the bipartite graph, forming a factor graph neural network that is trained to produce a floorplan that aligns with the desired requirements. Our approach is simple and generates layouts faithful to the user requirements, demonstrated by a large improvement in IOU scores over existing methods. Additionally, our approach, being inferential and accurate, is well-suited to the practical human-in-the-loop design process where specifications evolve iteratively, offering a practical and powerful tool for AI-guided design.


Implicit Graph Neural Diffusion Based on Constrained Dirichlet Energy Minimization

arXiv.org Artificial Intelligence

Implicit graph neural networks (GNNs) have emerged as a potential approach to enable GNNs to capture long-range dependencies effectively. However, poorly designed implicit GNN layers can experience over-smoothing or may have limited adaptability to learn data geometry, potentially hindering their performance in graph learning problems. To address these issues, we introduce a geometric framework to design implicit graph diffusion layers based on a parameterized graph Laplacian operator. Our framework allows learning the geometry of vertex and edge spaces, as well as the graph gradient operator from data. We further show how implicit GNN layers can be viewed as the fixed-point solution of a Dirichlet energy minimization problem and give conditions under which it may suffer from over-smoothing. To overcome the over-smoothing problem, we design our implicit graph diffusion layer as the solution of a Dirichlet energy minimization problem with constraints on vertex features, enabling it to trade off smoothing with the preservation of node feature information. With an appropriate hyperparameter set to be larger than the largest eigenvalue of the parameterized graph Laplacian, our framework guarantees a unique equilibrium and quick convergence. Our models demonstrate better performance than leading implicit and explicit GNNs on benchmark datasets for node and graph classification tasks, with substantial accuracy improvements observed for some datasets.


Fairness-aware Message Passing for Graph Neural Networks

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have shown great power in various domains. However, their predictions may inherit societal biases on sensitive attributes, limiting their adoption in real-world applications. Although many efforts have been taken for fair GNNs, most existing works just adopt widely used fairness techniques in machine learning to graph domains and ignore or don't have a thorough understanding of the message passing mechanism with fairness constraints, which is a distinctive feature of GNNs. To fill the gap, we propose a novel fairness-aware message passing framework GMMD, which is derived from an optimization problem that considers both graph smoothness and representation fairness. GMMD can be intuitively interpreted as encouraging a node to aggregate representations of other nodes from different sensitive groups while subtracting representations of other nodes from the same sensitive group, resulting in fair representations. We also provide a theoretical analysis to justify that GMMD can guarantee fairness, which leads to a simpler and theory-guided variant GMMD-S. Extensive experiments on graph benchmarks show that our proposed framework can significantly improve the fairness of various backbone GNN models while maintaining high accuracy.


SyNDock: N Rigid Protein Docking via Learnable Group Synchronization

arXiv.org Artificial Intelligence

The regulation of various cellular processes heavily relies on the protein complexes within a living cell, necessitating a comprehensive understanding of their three-dimensional structures to elucidate the underlying mechanisms. While neural docking techniques have exhibited promising outcomes in binary protein docking, the application of advanced neural architectures to multimeric protein docking remains uncertain. This study introduces SyNDock, an automated framework that swiftly assembles precise multimeric complexes within seconds, showcasing performance that can potentially surpass or be on par with recent advanced approaches. SyNDock possesses several appealing advantages not present in previous approaches. Firstly, SyNDock formulates multimeric protein docking as a problem of learning global transformations to holistically depict the placement of chain units of a complex, enabling a learning-centric solution. Secondly, SyNDock proposes a trainable two-step SE(3) algorithm, involving initial pairwise transformation and confidence estimation, followed by global transformation synchronization. This enables effective learning for assembling the complex in a globally consistent manner. Lastly, extensive experiments conducted on our proposed benchmark dataset demonstrate that SyNDock outperforms existing docking software in crucial performance metrics, including accuracy and runtime. For instance, it achieves a 4.5% improvement in performance and a remarkable millionfold acceleration in speed.


Recent Advances in Reliable Deep Graph Learning: Inherent Noise, Distribution Shift, and Adversarial Attack

arXiv.org Artificial Intelligence

Deep graph learning (DGL) has achieved remarkable progress in both business and scientific areas ranging from finance and e-commerce to drug and advanced material discovery. Despite the progress, applying DGL to real-world applications faces a series of reliability threats including inherent noise, distribution shift, and adversarial attacks. This survey aims to provide a comprehensive review of recent advances for improving the reliability of DGL algorithms against the above threats. In contrast to prior related surveys which mainly focus on adversarial attacks and defense, our survey covers more reliability-related aspects of DGL, i.e., inherent noise and distribution shift. Additionally, we discuss the relationships among above aspects and highlight some important issues to be explored in future research.


What Has Been Enhanced in my Knowledge-Enhanced Language Model?

arXiv.org Artificial Intelligence

Pretrained language models (LMs) do not capture factual knowledge very well. This has led to the development of a number of knowledge integration (KI) methods which aim to incorporate external knowledge into pretrained LMs. Even though KI methods show some performance gains over vanilla LMs, the inner-workings of these methods are not well-understood. For instance, it is unclear how and what kind of knowledge is effectively integrated into these models and if such integration may lead to catastrophic forgetting of already learned knowledge. This paper revisits the KI process in these models with an information-theoretic view and shows that KI can be interpreted using a graph convolution operation. We propose a probe model called \textit{Graph Convolution Simulator} (GCS) for interpreting knowledge-enhanced LMs and exposing what kind of knowledge is integrated into these models. We conduct experiments to verify that our GCS can indeed be used to correctly interpret the KI process, and we use it to analyze two well-known knowledge-enhanced LMs: ERNIE and K-Adapter, and find that only a small amount of factual knowledge is integrated in them. We stratify knowledge in terms of various relation types and find that ERNIE and K-Adapter integrate different kinds of knowledge to different extent. Our analysis also shows that simply increasing the size of the KI corpus may not lead to better KI; fundamental advances may be needed.


$p$-Laplacian Based Graph Neural Networks

arXiv.org Machine Learning

Graph neural networks (GNNs) have demonstrated superior performance for semisupervised node classification on graphs, as a result of their ability to exploit node features and topological information simultaneously. However, most GNNs implicitly assume that the labels of nodes and their neighbors in a graph are the same or consistent, which does not hold in heterophilic graphs, where the labels of linked nodes are likely to differ. Hence, when the topology is non-informative for label prediction, ordinary GNNs may work significantly worse than simply applying multi-layer perceptrons (MLPs) on each node. GNN, whose message passing mechanism is derived from a discrete regularization framework and could be theoretically explained as an approximation of a polynomial graph filter defined on the spectral domain of p-Laplacians. GNNs significantly outperform several state-of-the-art GNN architectures on heterophilic benchmarks while achieving competitive performance on homophilic benchmarks. GNNs can adaptively learn aggregation weights and are robust to noisy edges. In this paper, we explore the usage of graph neural networks (GNNs) for semi-supervised node classification on graphs, especially when the graphs admit strong heterophily or noisy edges. Semisupervised learning problems on graphs are ubiquitous in a lot of real-world scenarios, such as user classification in social media (Kipf & Welling, 2017), protein classification in biology (Velickovic et al., 2018), molecular property prediction in chemistry (Duvenaud et al., 2015), and many others (Marcheggiani & Titov, 2017; Satorras & Estrach, 2018). Recently, GNNs are becoming the de facto choice for processing graph structured data.


Understanding Graph Neural Networks from Graph Signal Denoising Perspectives

arXiv.org Machine Learning

Graph neural networks (GNNs) have attracted much attention because of their excellent performance on tasks such as node classification. However, there is inadequate understanding on how and why GNNs work, especially for node representation learning. This paper aims to provide a theoretical framework to understand GNNs, specifically, spectral graph convolutional networks and graph attention networks, from graph signal denoising perspectives. Our framework shows that GNNs are implicitly solving graph signal denoising problems: spectral graph convolutions work as denoising node features, while graph attentions work as denoising edge weights. We also show that a linear self-attention mechanism is able to compete with the state-of-the-art graph attention methods. Our theoretical results further lead to two new models, GSDN-F and GSDN-EF, which work effectively for graphs with noisy node features and/or noisy edges. We validate our theoretical findings and also the effectiveness of our new models by experiments on benchmark datasets. The source code is available at \url{https://github.com/fuguoji/GSDN}.


Learning Topological Representation for Networks via Hierarchical Sampling

arXiv.org Machine Learning

Abstract--The topological information is essential for studying the relationship between nodes in a network. Recently, Network Representation Learning (NRL), which projects a network into a low-dimensional vector space, has been shown their advantages inanalyzing large-scale networks. However, most existing NRL methods are designed to preserve the local topology of a network, they fail to capture the global topology. To tackle this issue, we propose a new NRL framework, named HSRL, to help existing NRL methods capture both the local and global topological information of a network. Then, an existing NRL method is used to learn node embeddings for each compressed network. Finally, the node embeddings of the input network are obtained by concatenating the node embeddings from all compressed networks. Empirical studies for link prediction on five real-world datasets demonstrate the advantages of HSRL over state-of-the-art methods. I. INTRODUCTION The science of networks has been widely used to understand thebehaviours of complex systems.


Representation Learning for Heterogeneous Information Networks via Embedding Events

arXiv.org Machine Learning

Network representation learning (NRL) has been widely used to help analyze large-scale networks through mapping original networks into a low-dimensional vector space. However, existing NRL methods ignore the impact of properties of relations on the object relevance in heterogeneous information networks (HINs). To tackle this issue, this paper proposes a new NRL framework, called Event2vec, for HINs to consider both quantities and properties of relations during the representation learning process. Specifically, an event (i.e., a complete semantic unit) is used to represent the relation among multiple objects, and both event-driven first-order and second-order proximities are defined to measure the object relevance according to the quantities and properties of relations. We theoretically prove how event-driven proximities can be preserved in the embedding space by Event2vec, which utilizes event embeddings to facilitate learning the object embeddings. Experimental studies demonstrate the advantages of Event2vec over state-of-the-art algorithms on four real-world datasets and three network analysis tasks (including network reconstruction, link prediction, and node classification).