
Collaborating Authors

 Marsden, Dan


Towards Compositional Distributional Discourse Analysis

arXiv.org Artificial Intelligence

In the last couple of decades, the traditional symbolic approach to AI and cognitive science -- which aims at characterising human intelligence in terms of abstract logical processes -- has been challenged by so-called connectionist AI: the study of the human brain as a complex network of basic processing units [18]. When it comes to human language, the same divide manifests itself as the opposition between two principles, which in turn induce two distinct approaches to Natural Language Processing (NLP). On the one hand, Frege's principle of compositionality asserts that the meaning of a complex expression is a function of its sub-expressions and of the way in which they are composed; on the other hand, distributionality can be summed up in Firth's maxim "You shall know a word by the company it keeps". Once implemented in terms of concrete algorithms, these yield expert systems driven by formal logical rules on one end, and artificial neural networks and machine learning on the other. Categorical Compositional Distributional (DisCoCat) models, first introduced in [4], aim at getting the best of both worlds: the string diagram notation borrowed from category theory allows one to manipulate grammatical reductions as linear maps, and to compute graphically the semantics of a sentence as the composition of the vectors obtained from the distributional semantics of its constituent words. In this paper, we introduce basic anaphoric discourses as mid-level representations between natural language discourse on one end -- formalised in terms of basic discourse representation structures (DRS) [2] -- and knowledge queries over the Semantic Web on the other, given by basic graph patterns in the Resource Description Framework (RDF) [19]. We construct discourses as formal diagrams of real-valued matrices, and we then use these diagrams to give abstract reformulations of two NLP problems: probabilistic anaphora resolution and question answering.
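
To make the compositional recipe concrete, here is a minimal numpy sketch of a DisCoCat-style computation: a transitive verb lives in the tensor space N ⊗ S ⊗ N, and the pregroup reduction becomes a tensor contraction with the subject and object vectors. All dimensions, vectors and names (`alice`, `likes`, `bob`) are toy illustrations, not data from the paper.

```python
import numpy as np

noun_dim, sent_dim = 2, 2

# Toy noun vectors in the noun space N (hypothetical values).
alice = np.array([1.0, 0.0])
bob   = np.array([0.0, 1.0])

# Toy transitive verb as a tensor in N (x) S (x) N.
likes = np.zeros((noun_dim, sent_dim, noun_dim))
likes[0, 0, 1] = 1.0   # "alice likes bob" maps to sentence basis state 0
likes[1, 1, 0] = 1.0   # "bob likes alice" maps to sentence basis state 1

# The 'cups' of the grammatical reduction become tensor contractions:
# the sentence meaning is the verb tensor contracted with subject and object.
sentence = np.einsum('i,isj,j->s', alice, likes, bob)
print(sentence)  # [1. 0.]
```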


Internal Wiring of Cartesian Verbs and Prepositions

arXiv.org Artificial Intelligence

Categorical compositional distributional semantics (CCDS) allows one to compute the meaning of phrases and sentences from the meanings of their constituent words. A type structure carried over from the traditional categorial model of grammar à la Lambek becomes a 'wire structure' that mediates the interaction of word meanings. However, CCDS has a much richer logical structure than plain categorical semantics, in that certain words can also be given an 'internal wiring' that either provides their entire meaning or reduces the size of their meaning space. Previous examples of internal wiring include relative pronouns and intersective adjectives. Here we establish the same for a large class of well-behaved transitive verbs, to which we refer as Cartesian verbs, and reduce their meaning space from a ternary tensor to a unary one. Some experimental evidence is also provided.
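
As a toy illustration of what such a reduction buys computationally, the sketch below builds a ternary verb tensor from a single vector by plugging it into a three-legged Frobenius spider, so that contracting the full tensor agrees with a cheap pointwise computation. This particular wiring is an assumption chosen for illustration; the construction for Cartesian verbs in the paper may differ in its details.

```python
import numpy as np

d = 3
rng = np.random.default_rng(0)

v    = rng.random(d)   # unary verb meaning: a single vector (hypothetical)
subj = rng.random(d)   # hypothetical subject and object vectors
obj  = rng.random(d)

# Spider-style wiring: V[i, s, j] = v[s] when i == s == j, and 0 otherwise.
V = np.zeros((d, d, d))
for k in range(d):
    V[k, k, k] = v[k]

# Contracting the full ternary tensor with subject and object ...
sentence_full = np.einsum('i,isj,j->s', subj, V, obj)

# ... agrees with the unary computation: three pointwise products.
sentence_cheap = subj * v * obj
assert np.allclose(sentence_full, sentence_cheap)
```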


Proceedings of the 2018 Workshop on Compositional Approaches in Physics, NLP, and Social Sciences

arXiv.org Artificial Intelligence

The ability to compose parts to form a more complex whole, and to analyze a whole as a combination of elements, is desirable across disciplines. This workshop brings together researchers applying compositional approaches to physics, NLP, cognitive science, and game theory. Within NLP, a long-standing aim is to represent how words can combine to form phrases and sentences. Within the framework of distributional semantics, words are represented as vectors in vector spaces. The categorical model of Coecke et al. [2010], inspired by quantum protocols, has provided a convincing account of compositionality in vector space models of NLP. There is furthermore a history of vector space models in cognitive science. Theories of categorization such as those developed by Nosofsky [1986] and Smith et al. [1988] utilise notions of distance between feature vectors. More recently Gärdenfors [2004, 2014] has developed a model of concepts in which conceptual spaces provide geometric structures, and information is represented by points, vectors and regions in vector spaces. The same compositional approach has been applied to this formalism, giving conceptual spaces theory a richer model of compositionality than it previously had [Bolt et al., 2018]. Compositional approaches have also been applied to the study of strategic games and Nash equilibria. In contrast to classical game theory, where games are studied monolithically as one global object, compositional game theory works bottom-up, building large and complex games from smaller components. Such an approach is inherently difficult, since the interaction between games has to be considered. Research into categorical compositional methods for this field has recently begun [Ghani et al., 2018]. Moreover, the interaction between the three disciplines of cognitive science, linguistics and game theory is fertile ground for research. Game theory in cognitive science is a well-established area [Camerer, 2011], and game-theoretic approaches have likewise been applied in linguistics [Jäger, 2008]. Lastly, the study of linguistics and cognitive science is intimately intertwined [Smolensky and Legendre, 2006; Jackendoff, 2007]. Physics supplies compositional approaches via vector spaces and categorical quantum theory, allowing the interplay between the three disciplines to be examined.
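
The distance-based theories of categorization mentioned above admit a short sketch. Below is a minimal exemplar-style model in the spirit of Nosofsky's work, where similarity decays exponentially with the distance between feature vectors; the features, exemplars and sensitivity parameter are all hypothetical.

```python
import numpy as np

def similarity(x, y, sensitivity=1.0):
    # Similarity decays exponentially with distance between feature vectors.
    return np.exp(-sensitivity * np.linalg.norm(x - y))

# Hypothetical exemplars with features [can_fly, has_feathers].
exemplars = {
    'bird': [np.array([0.9, 0.8]), np.array([1.0, 0.7])],
    'fish': [np.array([0.0, 0.1]), np.array([0.1, 0.0])],
}

def categorize(x):
    # Summed similarity to each category's exemplars, normalised to a
    # probability of category membership (Luce choice rule).
    scores = {c: sum(similarity(x, e) for e in es) for c, es in exemplars.items()}
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

print(categorize(np.array([0.8, 0.9])))  # assigns most weight to 'bird'
```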