Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving
Schlag, Imanol, Smolensky, Paul, Fernandez, Roland, Jojic, Nebojsa, Schmidhuber, Jürgen, Gao, Jianfeng
ABSTRACT

We incorporate Tensor-Product Representations within the Transformer in order to better support the explicit representation of relation structure. Our Tensor-Product Transformer (TP-Transformer) sets a new state of the art on the recently-introduced Mathematics Dataset containing 56 categories of free-form math word-problems. The essential component of the model is a novel attention mechanism, called TP-Attention, which explicitly encodes the relations between each Transformer cell and the other cells from which values have been retrieved by attention. TP-Attention goes beyond linear combination of retrieved values, strengthening representation-building and resolving ambiguities introduced by multiple layers of standard attention. The TP-Transformer's attention maps give better insights into how it is capable of solving the Mathematics Dataset's challenging problems. Pretrained models and code will be made available after publication.

1 INTRODUCTION

In this paper we propose a variation of the Transformer (Vaswani et al., 2017) that is designed to allow it to better incorporate structure into its representations. We test the proposal on a task where structured representations are expected to be particularly helpful: math word-problem solving, where, among other things, correctly parsing expressions and compositionally evaluating them is crucial.
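To make the binding idea from the abstract concrete, the listing below is a minimal sketch of one way a single TP-Attention head could pair each retrieved value with a learned relation ("role") vector; the class name TPAttentionHead, the dimensions, and the use of element-wise (Hadamard) binding are illustrative assumptions rather than the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TPAttentionHead(nn.Module):
    """Single attention head that binds each attention output (the "filler")
    to a learned relation/role vector, a hedged reading of TP-Attention."""

    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_head)
        self.k = nn.Linear(d_model, d_head)
        self.v = nn.Linear(d_model, d_head)
        # Extra projection producing a relation/role vector per position,
        # beyond the usual query/key/value projections.
        self.r = nn.Linear(d_model, d_head)
        self.scale = d_head ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v, r = self.q(x), self.k(x), self.v(x), self.r(x)
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        z = attn @ v   # standard linear combination of retrieved values
        return z * r   # binding step: compose filler z with role r

if __name__ == "__main__":
    head = TPAttentionHead(d_model=64, d_head=16)
    out = head(torch.randn(2, 10, 64))
    print(out.shape)  # torch.Size([2, 10, 16])

The point of the extra projection is that the head's output is no longer just a weighted sum of values: each position's retrieved content is modulated by a position-specific relation vector, which is how the sketch interprets "going beyond linear combination of retrieved values."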