Can Transformers Reason Logically? A Study in SAT Solving

Pan, Leyan, Ganesh, Vijay, Abernethy, Jacob, Esposo, Chris, Lee, Wenke

arXiv.org Artificial Intelligence 

A PARAT "program" is basically a sequence of array operations over SOps. Throughout this section, we refer to the indices along the first dimension of an SOp as "position" and refer to indices along the second dimension as "dimension". The "inputs" to a program are arbitrary positional encoding and token embedding SOps, represented by the base class names PosEncSOp and TokEmbSOp respectively. For example, the OneHotTokEmb class represents the one-hot embedding of tokens and Indices represents the numerical value of the index of each position. The rest of the program performs various operations that compute new SOps based on existing ones. We provide implementations of basic building block operations including (but not limited to) the following: Mean(q, k, v) Represents the "Averaging Hard Attention" operation.

Duplicate Docs Excel Report

Title
None found

Similar Docs  Excel Report  more

TitleSimilaritySource
None found