Logic and the $2$-Simplicial Transformer

James Clift, Dmitry Doryn, Daniel Murfet, James Wallbridge

arXiv.org, Machine Learning

The most successful examples of learned representations, namely those produced by convolutional neural networks, are structured by the scale and translational symmetries of the underlying space (e.g. a two-dimensional Euclidean space for images). It has been suggested that in humans the ability to make rich inferences based on abstract reasoning is rooted in the same neural mechanisms that underlie relational reasoning in space [16, 19, 6, 7], and more specifically that abstract reasoning is facilitated by the learning of structural representations, which serve to organise other learned representations in the same way that space organises the representations that enable spatial navigation [68, 41]. This raises a natural question: are there ideas from mathematics that might be useful in designing general inductive biases for learning such structural representations? As a motivating example we take the recent progress on natural language tasks based on the Transformer architecture [66], which simultaneously learns to represent both entities (typically words) and relations between entities (for instance the relation between "cat" and "he" in the sentence "There was a cat and he liked to sleep"). These representations of relations take the form of query and key vectors that govern the passing of messages between entities; messages update entity representations over several rounds of computation until the final representations reflect not just the meaning of words but also their context in a sentence.
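To make the query/key message-passing mechanism concrete, here is a minimal NumPy sketch of one attention round in the spirit of the Transformer [66]. The function name, the dimensions, the residual update, and the number of rounds are illustrative assumptions for this sketch, not details taken from the paper.

```python
import numpy as np

def attention_round(X, Wq, Wk, Wv):
    """One round of message passing between entities via attention.

    X          : (n, d)  current entity representations (e.g. word vectors)
    Wq, Wk, Wv : (d, d)  learned projections to queries, keys, values
    """
    Q = X @ Wq                                  # each entity emits a query
    K = X @ Wk                                  # ... and a key
    V = X @ Wv                                  # ... and a value (the message content)
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # query-key compatibility
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over senders
    return weights @ V                          # aggregate weighted messages

# Hypothetical example: 4 entities with 8-dimensional representations,
# refined over 3 rounds of computation.
rng = np.random.default_rng(0)
n, d = 4, 8
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
for _ in range(3):
    X = X + attention_round(X, Wq, Wk, Wv)      # residual update per round
```

In a full Transformer these rounds are interleaved with feed-forward layers and several attention heads run in parallel; the sketch keeps only the single-head message-passing core described above.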
