Learning Graph Structure With A Finite-State Automaton Layer
Graph-based neural network models are producing strong results in a number of domains, in part because graphs provide flexibility to encode domain knowledge in the form of relational structure (edges) between nodes in the graph. In practice, edges are used both to represent intrinsic structure (e.g., abstract syntax trees of programs) and more abstract relations that aid reasoning for a downstream task (e.g., results of relevant program analyses). In this work, we study the problem of learning to derive abstract relations from the intrinsic graph structure. Motivated by their power in program analyses, we consider relations defined by paths on the base graph accepted by a finite-state automaton. We show how to learn these relations end-to-end by relaxing the problem into learning finite-state automata policies on a graph-based POMDP and then training these policies using implicit differentiation. The result is a differentiable Graph Finite-State Automaton (GFSA) layer that adds a new edge type (expressed as a weighted adjacency matrix) to a base graph. We demonstrate that this layer can find shortcuts in grid-world graphs and reproduce simple static analyses on Python programs. Additionally, we combine the GFSA layer with a larger graph-based model trained end-to-end on the variable misuse program understanding task, and find that using the GFSA layer leads to better performance than using hand-engineered semantic edges or other baseline methods for adding learned edge types.
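The mechanism the abstract describes can be sketched numerically: an automaton walks over (node, state) pairs of the base graph, and the probability of halting at each node defines the new weighted adjacency matrix, obtained via a linear solve (the form that makes implicit differentiation natural). This is a minimal illustrative sketch, not the paper's implementation; the function name, tensor shapes, and the single-start-state convention are assumptions.

```python
import numpy as np

def gfsa_edges(adjs, move, halt):
    """Toy GFSA-style layer: return an [N, N] weighted adjacency matrix
    for a new learned edge type (illustrative sketch, not the paper's code).

    adjs: [E, N, N] row-stochastic adjacency matrix per base edge type
    move: [S, E, S] policy weight for taking edge type e while moving the
          automaton from state s to state t (sums to 1 over (e, t))
    halt: [S] probability of halting and emitting an edge in state s
    """
    E, N, _ = adjs.shape
    S = move.shape[0]
    # One step of the walk over (node, automaton-state) pairs:
    # T[n, s, m, t] = (1 - halt[s]) * sum_e move[s, e, t] * adjs[e, n, m]
    T = (1.0 - halt)[None, :, None, None] * np.einsum('set,enm->nsmt', move, adjs)
    T = T.reshape(N * S, N * S)
    # Halting in state s emits an edge ending at the current node:
    # H[(n, s), m] = halt[s] * [n == m]
    H = np.einsum('s,nm->nsm', halt, np.eye(N)).reshape(N * S, N)
    # Absorption probabilities satisfy A = H + T @ A, so A = (I - T)^{-1} H.
    # Expressing the layer as a linear solve is what lets gradients flow
    # through it implicitly instead of by unrolling the walk.
    A = np.linalg.solve(np.eye(N * S) - T, H)
    # Assume every walk starts in automaton state 0 at its start node.
    return A.reshape(N, S, N)[:, 0, :]

# Example: a directed 3-cycle with one edge type, one automaton state,
# and halting probability 1/2 at every step.
adjs = np.array([[[0, 1, 0],
                  [0, 0, 1],
                  [1, 0, 0]]], dtype=float)
move = np.array([[[1.0]]])
halt = np.array([0.5])
W = gfsa_edges(adjs, move, halt)
```

Each row of `W` is a probability distribution over halting nodes, i.e. a soft edge from each start node, which is exactly the "new edge type expressed as a weighted adjacency matrix" the abstract refers to.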
Review for NeurIPS paper: Learning Graph Structure With A Finite-State Automaton Layer
Additional Feedback: This is great work and should definitely be accepted. A few thoughts and comments on follow-up work:
* The fact that observations depend on the initial node (L114) introduces a limited form of "variable capturing" in the regular language that the POMDP tries to approximate. The language is still regular, but the observations could be broadened to depend on the agent's whole history, allowing more complex languages to be represented. The POMDP would no longer be theoretically guaranteed to represent such a language exactly, but it might still learn a useful approximation.
* I wonder what would happen if one used the GFSA layer for self-supervised program learning, as in CodeBERT [4], rather than for supervised program analysis tasks.
Review for NeurIPS paper: Learning Graph Structure With A Finite-State Automaton Layer
In graph nets, edges can represent two kinds of relations: ones that follow immediately from the structure of the graph, and ones that are abstract/implicit. The paper proposes to learn the latter. More precisely, it considers relations defined as paths in the base graph accepted by a finite-state automaton, poses the problem of learning these relations as a POMDP problem, and solves a relaxed version of this problem using gradient descent. Overall, the paper was well-received.
Pros:
* Fresh idea
* Clean formulation
* Experiments show clear gains in the domains considered
* The paper is well-written
Cons:
- Some missing related work
- Somewhat narrow application domain
The reviewers appreciated the clarifications provided in the author response, in particular the RL experiment for the "Go for a Walk" domain.