representation graph
Building, Reusing, and Generalizing Abstract Representations from Concrete Sequences
Wu, Shuchen, Thalmann, Mirko, Dayan, Peter, Akata, Zeynep, Schulz, Eric
Humans excel at learning abstract patterns across different sequences, filtering out irrelevant details, and transferring these generalized concepts to new sequences. In contrast, many sequence learning models lack the ability to abstract, which leads to memory inefficiency and poor transfer. We introduce a non-parametric hierarchical variable learning model (HVM) that learns chunks from sequences and abstracts contextually similar chunks as variables. HVM efficiently organizes memory while uncovering abstractions, leading to compact sequence representations. When learning on language datasets such as babyLM, HVM learns a more efficient dictionary than standard compression algorithms such as Lempel-Ziv. In a sequence recall task requiring the acquisition and transfer of variables embedded in sequences, we demonstrate HVM's sequence likelihood correlates with human recall times. In contrast, large language models (LLMs) struggle to transfer abstract variables as effectively as humans. From HVM's adjustable layer of abstraction, we demonstrate that the model realizes a precise trade-off between compression and generalization. Our work offers a cognitive model that captures the learning and transfer of abstract representations in human cognition and differentiates itself from the behavior of large language models.
Deep Reinforcement Learning Based on Local GNN for Goal-conditioned Deformable Object Rearranging
Deng, Yuhong, Xia, Chongkun, Wang, Xueqian, Chen, Lipeng
Object rearranging is one of the most common deformable manipulation tasks, where the robot needs to rearrange a deformable object into a goal configuration. Previous studies focus on designing an expert system for each specific task by model-based or data-driven approaches and the application scenarios are therefore limited. Some research has been attempting to design a general framework to obtain more advanced manipulation capabilities for deformable rearranging tasks, with lots of progress achieved in simulation. However, transferring from simulation to reality is difficult due to the limitation of the end-to-end CNN architecture. To address these challenges, we design a local GNN (Graph Neural Network) based learning method, which utilizes two representation graphs to encode keypoints detected from images. Self-attention is applied for graph updating and cross-attention is applied for generating manipulation actions. Extensive experiments have been conducted to demonstrate that our framework is effective in multiple 1-D (rope, rope ring) and 2-D (cloth) rearranging tasks in simulation and can be easily transferred to a real robot by fine-tuning a keypoint detector.
What does Transformer learn about source code?
Zhang, Kechi, Li, Ge, Jin, Zhi
In the field of source code processing, the transformer-based representation models have shown great powerfulness and have achieved state-of-the-art (SOTA) performance in many tasks. Although the transformer models process the sequential source code, pieces of evidence show that they may capture the structural information (\eg, in the syntax tree, data flow, control flow, \etc) as well. We propose the aggregated attention score, a method to investigate the structural information learned by the transformer. We also put forward the aggregated attention graph, a new way to extract program graphs from the pre-trained models automatically. We measure our methods from multiple perspectives. Furthermore, based on our empirical findings, we use the automatically extracted graphs to replace those ingenious manual designed graphs in the Variable Misuse task. Experimental results show that the semantic graphs we extracted automatically are greatly meaningful and effective, which provide a new perspective for us to understand and use the information contained in the model.
On consistency of constrained spectral clustering under representation-aware stochastic block model
Gupta, Shubham, Dukkipati, Ambedkar
Spectral clustering is widely used in practice due to its flexibility, computational efficiency, and well-understood theoretical performance guarantees. Recently, spectral clustering has been studied to find balanced clusters under population-level constraints. These constraints are specified by additional information available in the form of auxiliary categorical node attributes. In this paper, we consider a scenario where these attributes may not be observable, but manifest as latent features of an auxiliary graph. Motivated by this, we study constrained spectral clustering with the aim of finding balanced clusters in a given \textit{similarity graph} $\mathcal{G}$, such that each individual is adequately represented with respect to an auxiliary graph $\mathcal{R}$ (we refer to this as representation graph). We propose an individual-level balancing constraint that formalizes this idea. Our work leads to an interesting stochastic block model that not only plants the given partitions in $\mathcal{G}$ but also plants the auxiliary information encoded in the representation graph $\mathcal{R}$. We develop unnormalized and normalized variants of spectral clustering in this setting. These algorithms use $\mathcal{R}$ to find clusters in $\mathcal{G}$ that approximately satisfy the proposed constraint. We also establish the first statistical consistency result for constrained spectral clustering under individual-level constraints for graphs sampled from the above-mentioned variant of the stochastic block model. Our experimental results corroborate our theoretical findings.
Detecting out-of-context objects using contextual cues
Acharya, Manoj, Roy, Anirban, Koneripalli, Kaushik, Jha, Susmit, Kanan, Christopher, Divakaran, Ajay
This paper presents an approach to detect out-of-context (OOC) objects in an image. Given an image with a set of objects, our goal is to determine if an object is inconsistent with the scene context and detect the OOC object with a bounding box. In this work, we consider commonly explored contextual relations such as co-occurrence relations, the relative size of an object with respect to other objects, and the position of the object in the scene. We posit that contextual cues are useful to determine object labels for in-context objects and inconsistent context cues are detrimental to determining object labels for out-of-context objects. To realize this hypothesis, we propose a graph contextual reasoning network (GCRN) to detect OOC objects. GCRN consists of two separate graphs to predict object labels based on the contextual cues in the image: 1) a representation graph to learn object features based on the neighboring objects and 2) a context graph to explicitly capture contextual cues from the neighboring objects. GCRN explicitly captures the contextual cues to improve the detection of in-context objects and identify objects that violate contextual relations. In order to evaluate our approach, we create a large-scale dataset by adding OOC object instances to the COCO images. We also evaluate on recent OCD benchmark. Our results show that GCRN outperforms competitive baselines in detecting OOC objects and correctly detecting in-context objects.
NeuralLog: Natural Language Inference with Joint Neural and Logical Reasoning
Chen, Zeming, Gao, Qiyue, Moss, Lawrence S.
Deep learning (DL) based language models achieve high performance on various benchmarks for Natural Language Inference (NLI). And at this time, symbolic approaches to NLI are receiving less attention. Both approaches (symbolic and DL) have their advantages and weaknesses. However, currently, no method combines them in a system to solve the task of NLI. To merge symbolic and deep learning methods, we propose an inference framework called NeuralLog, which utilizes both a monotonicity-based logical inference engine and a neural network language model for phrase alignment. Our framework models the NLI task as a classic search problem and uses the beam search algorithm to search for optimal inference paths. Experiments show that our joint logic and neural inference system improves accuracy on the NLI task and can achieve state-of-art accuracy on the SICK and MED datasets.
Protecting Individual Interests across Clusters: Spectral Clustering with Guarantees
Gupta, Shubham, Dukkipati, Ambedkar
Studies related to fairness in machine learning have recently gained traction due to its ever-expanding role in high-stakes decision making. For example, it may be desirable to ensure that all clusters discovered by an algorithm have high gender diversity. Previously, these problems have been studied under a setting where sensitive attributes, with respect to which fairness conditions impose diversity across clusters, are assumed to be observable; hence, protected groups are readily available. Most often, this may not be true, and diversity or individual interests can manifest as an intrinsic or latent feature of a social network. For example, depending on latent sensitive attributes, individuals interact with each other and represent each other's interests, resulting in a network, which we refer to as a representation graph. Motivated by this, we propose an individual fairness criterion for clustering a graph $\mathcal{G}$ that requires each cluster to contain an adequate number of members connected to the individual under a representation graph $\mathcal{R}$. We devise a spectral clustering algorithm to find fair clusters under a given representation graph. We further propose a variant of the stochastic block model and establish our algorithm's weak consistency under this model. Finally, we present experimental results to corroborate our theoretical findings.
Bi-Level Graph Neural Networks for Drug-Drug Interaction Prediction
Bai, Yunsheng, Gu, Ken, Sun, Yizhou, Wang, Wei
We introduce Bi-GNN for modeling biological link prediction tasks such as drug-drug interaction (DDI) and protein-protein interaction (PPI). Taking drug-drug interaction as an example, existing methods using machine learning either only utilize the link structure between drugs without using the graph representation of each drug molecule, or only leverage the individual drug compound structures without using graph structure for the higher-level DDI graph. The key idea of our method is to fundamentally view the data as a bi-level graph, where the highest level graph represents the interaction between biological entities (interaction graph), and each biological entity itself is further expanded to its intrinsic graph representation (representation graphs), where the graph is either flat like a drug compound or hierarchical like a protein with amino acid level graph, secondary structure, tertiary structure, etc. Our model not only allows the usage of information from both the high-level interaction graph and the low-level representation graphs, but also offers a baseline for future research opportunities to address the bi-level nature of the data.