
 compositional model


A Factorized Probabilistic Model of the Semantics of Vague Temporal Adverbials Relative to Different Event Types

arXiv.org Artificial Intelligence

Vague temporal adverbials, such as "recently," "just" and "long time ago," describe the temporal distance between a past event and the utterance time, but leave the exact duration underspecified. In this paper, we introduce a factorized model that captures the semantics of these adverbials as probabilistic distributions. These distributions are composed with event-specific distributions to yield a contextualized meaning for an adverbial applied to a specific event. We fit the model's parameters using existing data capturing judgements of native speakers regarding the applicability of these vague temporal adverbials to events that took place a given time ago. Comparing our approach to a non-factorized model based on a single Gaussian distribution for each pair of event and temporal adverbial, we find that, while both models have similar predictive power, our model is preferable in terms of Occam's razor, as it is simpler and more extensible.


Compositional Models for Estimating Causal Effects

arXiv.org Artificial Intelligence

Many real-world systems can be represented as sets of interacting components. Examples of such systems include computational systems such as query processors, natural systems such as cells, and social systems such as families. Many approaches have been proposed in traditional (associational) machine learning to model such structured systems, including statistical relational models and graph neural networks. Despite this prior work, existing approaches to estimating causal effects typically treat such systems as single units, represent them with a fixed set of variables and assume a homogeneous data-generating process. We study a compositional approach for estimating individual treatment effects (ITE) in structured systems, where each unit is represented by the composition of multiple heterogeneous components. This approach uses a modular architecture to model potential outcomes at each component and aggregates component-level potential outcomes to obtain the unit-level potential outcomes. We discover novel benefits of the compositional approach in causal inference: systematic generalization to estimate counterfactual outcomes of unseen combinations of components, and improved overlap guarantees between treatment and control groups compared to classical methods for causal effect estimation. We also introduce a set of novel environments for empirically evaluating the compositional approach and demonstrate the effectiveness of our approach using both simulated and real-world data.
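The aggregation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the additive aggregation, the component types ("scan", "join"), and the linear outcome functions are all assumptions chosen for illustration. Each component type has its own outcome model, the unit-level potential outcome is the sum over components, and the ITE is the difference between the treated and untreated outcomes.

```python
from typing import Callable, Dict, List, Tuple

# A component is a (type, feature) pair. The types are hypothetical examples.
Component = Tuple[str, float]
# An outcome model maps (feature, treatment in {0, 1}) to a component-level outcome.
OutcomeModel = Callable[[float, int], float]

# Hypothetical per-type outcome models; in practice these would be learned modules.
COMPONENT_MODELS: Dict[str, OutcomeModel] = {
    "scan": lambda x, t: 2.0 * x - 1.0 * t * x,
    "join": lambda x, t: 3.0 * x + 0.5 * t,
}


def unit_potential_outcome(components: List[Component], treatment: int) -> float:
    """Aggregate component-level potential outcomes (assumed additive) for one unit."""
    return sum(COMPONENT_MODELS[kind](x, treatment) for kind, x in components)


def individual_treatment_effect(components: List[Component]) -> float:
    """ITE = Y(1) - Y(0), both computed from the same component models."""
    return unit_potential_outcome(components, 1) - unit_potential_outcome(components, 0)


# A unit made of one "scan" and one "join" component.
unit = [("scan", 2.0), ("join", 1.0)]
ite = individual_treatment_effect(unit)
```

Because the models are indexed by component type rather than by unit, the same functions apply to a unit with a combination of components not seen before, which is the kind of generalization the abstract refers to.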


Deep Neural Networks for Object Detection

Neural Information Processing Systems

Deep Neural Networks (DNNs) have recently shown outstanding performance on image classification tasks [14]. In this paper we go one step further and address the problem of object detection using DNNs, that is not only classifying but also precisely localizing objects of various classes. We present a simple and yet powerful formulation of object detection as a regression problem to object bounding box masks. We define a multi-scale inference procedure which is able to produce high-resolution object detections at a low cost by a few network applications. State-of-the-art performance of the approach is shown on Pascal VOC.


QNLP in Practice: Running Compositional Models of Meaning on a Quantum Computer

Journal of Artificial Intelligence Research

Quantum Natural Language Processing (QNLP) deals with the design and implementation of NLP models intended to be run on quantum hardware. In this paper, we present results on the first NLP experiments conducted on Noisy Intermediate-Scale Quantum (NISQ) computers for datasets of size greater than 100 sentences. Exploiting the formal similarity of the compositional model of meaning by Coecke, Sadrzadeh, and Clark (2010) with quantum theory, we create representations for sentences that have a natural mapping to quantum circuits. We use these representations to implement and successfully train NLP models that solve simple sentence classification tasks on quantum hardware. We conduct quantum simulations that compare the syntax-sensitive model of Coecke et al. with two baselines that use less or no syntax; specifically, we implement the quantum analogues of a "bag-of-words" model, where syntax is not taken into account at all, and of a word-sequence model, where only word order is respected. We demonstrate that all models converge smoothly both in simulations and when run on quantum hardware, and that the results are the expected ones based on the nature of the tasks and the datasets used. Another important goal of this paper is to describe in a way accessible to AI and NLP researchers the main principles, process and challenges of experiments on quantum hardware. Our aim in doing this is to take the first small steps in this unexplored research territory and pave the way for practical Quantum Natural Language Processing.


lambeq: An Efficient High-Level Python Library for Quantum NLP

arXiv.org Artificial Intelligence

We present lambeq, the first high-level Python library for Quantum Natural Language Processing (QNLP). The open-source toolkit offers a detailed hierarchy of modules and classes implementing all stages of a pipeline for converting sentences to string diagrams, tensor networks, and quantum circuits ready to be used on a quantum computer. lambeq supports syntactic parsing, rewriting and simplification of string diagrams, ansatz creation and manipulation, as well as a number of compositional models for preparing quantum-friendly representations of sentences, employing various degrees of syntax sensitivity. We present the generic architecture and describe the most important modules in detail, demonstrating the usage with illustrative examples. Further, we test the toolkit in practice by using it to perform a number of experiments on simple NLP tasks, implementing both classical and quantum pipelines.


An SMT Based Compositional Model to Solve a Conflict-Free Electric Vehicle Routing Problem

arXiv.org Artificial Intelligence

The Vehicle Routing Problem (VRP) is the combinatorial optimization problem of designing routes for vehicles to visit customers in such a fashion that a cost function, typically the number of vehicles or the total travelled distance, is minimized. The problem finds applications in industrial scenarios, for example where Automated Guided Vehicles run through the plant to deliver components from the warehouse. This specific problem, henceforth called the Electric Conflict-Free Vehicle Routing Problem (CF-EVRP), involves constraints such as limited operating range of the vehicles, time windows on the delivery to the customers, and limited capacity on the number of vehicles the road segments can accommodate at the same time. Such a complex system results in a large model that cannot easily be solved to optimality in reasonable time. We therefore developed a compositional model that breaks down the problem into smaller and simpler sub-problems and provides sub-optimal, feasible solutions to the original problem. The algorithm exploits the strengths of SMT solvers, which proved in our previous work to be an efficient approach to deal with scheduling problems. Compared to a monolithic model for the CF-EVRP, written in the SMT standard language and solved using a state-of-the-art SMT solver, the compositional model was found to be significantly faster.


Visual analogy: Deep learning versus compositional models

arXiv.org Artificial Intelligence

Is analogical reasoning a task that must be learned from scratch by applying deep learning models to massive numbers of reasoning problems? Or are analogies solved by computing similarities between structured representations of analogs? We address this question by comparing human performance on visual analogies created using images of familiar three-dimensional objects (cars and their subregions) with the performance of alternative computational models. Human reasoners achieved above-chance accuracy for all problem types, but made more errors in several conditions (e.g., when relevant subregions were occluded). We compared human performance to that of two recent deep learning models (Siamese Network and Relation Network) directly trained to solve these analogy problems, as well as to that of a compositional model that assesses relational similarity between part-based representations. The compositional model based on part representations, but not the deep learning models, generated qualitative performance similar to that of human reasoners.


Projection: A Mechanism for Human-like Reasoning in Artificial Intelligence

arXiv.org Artificial Intelligence

This paper focuses on the first. It encompasses knowledge representation and reasoning, with a focus here on (non-classical) reasoning (a second companion paper will focus on representation). The focus is on the act of reasoning that determines if some data can be seen (or interpreted) as belonging to a particular class, not on long chains of reasoning using diverse knowledge. A significant weakness of Artificial Intelligence (AI) systems relative to humans is the inability to apply existing knowledge to a new problem, or to a situation that varies from what they were programmed for or trained for (also called transfer ability in some contexts). This causes systems to fail to recognise objects or activities in new settings, or to fail to adapt skills to variations (Davis and Marcus, 2015; Ersen et al., 2017).


Neural Composition: Learning to Generate from Multiple Models

arXiv.org Machine Learning

Decomposing models into multiple components is critically important in many applications such as language modeling (LM), as it enables adapting individual components separately and biasing some components to the user's personal preferences. Conventionally, contextual and personalized adaptation of language models is achieved through class-based factorization, which requires class-annotated data, or through biasing to individual phrases, which is limited in scale. In this paper, we propose a system that combines model-defined components by learning when to activate the generation process from each individual component, and how to combine probability distributions from each component, directly from unlabeled text data.
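One common way to combine probability distributions from several components is a gated mixture, sketched below. This is an illustrative assumption, not the method of the paper: the softmax gate, the function names, and the toy "general" and "contacts" components are hypothetical, and in a real system the gate scores would be predicted from context by a learned model.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw gate scores into mixture weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine(distributions: List[Dict[str, float]],
            gate_scores: List[float]) -> Dict[str, float]:
    """Mix per-component next-token distributions using gate weights."""
    weights = softmax(gate_scores)
    vocab = set().union(*distributions)  # union of all component vocabularies
    return {
        tok: sum(w * d.get(tok, 0.0) for w, d in zip(weights, distributions))
        for tok in vocab
    }


# Hypothetical components: a general LM and a personal contact-name model.
general = {"the": 0.6, "call": 0.4}
contacts = {"alice": 0.7, "bob": 0.3}

# Equal gate scores give each component half of the probability mass.
mixed = combine([general, contacts], gate_scores=[0.0, 0.0])
```

Raising the gate score of one component shifts probability mass toward its tokens, which is one simple way to model "when to activate" a component.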


Deep Nets: What have they ever done for Vision?

arXiv.org Artificial Intelligence

This is an opinion paper about the strengths and weaknesses of Deep Nets. They are at the center of recent progress on Artificial Intelligence and are of growing importance in Cognitive Science and Neuroscience since they enable the development of computational models that can deal with a large range of visually realistic stimuli and visual tasks. They have clear limitations but they also have enormous successes. There is also gradual, though incomplete, understanding of their inner workings. It seems unlikely that Deep Nets in their current form will be the best long-term solution either for building general purpose intelligent machines or for understanding the mind/brain, but it is likely that many aspects of them will remain. At present Deep Nets do very well on specific types of visual tasks and on specific benchmarked datasets. But Deep Nets are much less general purpose, flexible, and adaptive than the human visual system. Moreover, methods like Deep Nets may run into fundamental difficulties when faced with the enormous complexity of natural images. To illustrate our main points, while keeping the references small, this paper is slightly biased towards work from our group. We are in the third wave of neural network approaches.