Reasoning is essential for the development of large knowledge graphs, especially for completion, which aims to infer new triples based on existing ones. Both rules and embeddings can be used for knowledge graph reasoning and they have their own advantages and difficulties. Rule-based reasoning is accurate and explainable but rule learning with searching over the graph always suffers from efficiency due to huge search space. Embedding-based reasoning is more scalable and efficient as the reasoning is conducted via computation between embeddings, but it has difficulty learning good representations for sparse entities because a good embedding relies heavily on data richness. Based on this observation, in this paper we explore how embedding and rule learning can be combined together and complement each other's difficulties with their advantages. We propose a novel framework IterE iteratively learning embeddings and rules, in which rules are learned from embeddings with proper pruning strategy and embeddings are learned from existing triples and new triples inferred by rules. Evaluations on embedding qualities of IterE show that rules help improve the quality of sparse entity embeddings and their link prediction results. We also evaluate the efficiency of rule learning and quality of rules from IterE compared with AMIE+, showing that IterE is capable of generating high quality rules more efficiently. Experiments show that iteratively learning embeddings and rules benefit each other during learning and prediction.
In this paper, we present exploitability descent, a new algorithm to compute approximate equilibria in two-player zero-sum extensive-form games with imperfect information, by direct policy optimization against worst-case opponents. We prove that when following this optimization, the exploitability of a player's strategy converges asymptotically to zero, and hence when both players employ this optimization, the joint policies converge to a Nash equilibrium. Unlike fictitious play (XFP) and counterfactual regret minimization (CFR), our convergence result pertains to the policies being optimized rather than the average policies. Our experiments demonstrate convergence rates comparable to XFP and CFR in four benchmark games in the tabular case. Using function approximation, we find that our algorithm outperforms the tabular version in two of the games, which, to the best of our knowledge, is the first such result in imperfect information games among this class of algorithms.
Dialogue systems have become recently essential in our life. Their use is getting more and more fluid and easy throughout the time. This boils down to the improvements made in NLP and AI fields. In this paper, we try to provide an overview to the current state of the art of dialogue systems, their categories and the different approaches to build them. We end up with a discussion that compares all the techniques and analyzes the strengths and weaknesses of each. Finally, we present an opinion piece suggesting to orientate the research towards the standardization of dialogue systems building.
This paper describes our system submitted to SemEval 2019 Task 7: RumourEval 2019: Determining Rumour Veracity and Support for Rumours, Subtask A (Gorrell et al., 2019). The challenge focused on classifying whether posts from Twitter and Reddit support, deny, query, or comment a hidden rumour, truthfulness of which is the topic of an underlying discussion thread. We formulate the problem as a stance classification, determining the rumour stance of a post with respect to the previous thread post and the source thread post. The recent BERT architecture was employed to build an end-to-end system which has reached the F1 score of 61.67% on the provided test data. It finished at the 2nd place in the competition, without any hand-crafted features, only 0.2% behind the winner.
The no-wait flowshop scheduling problem is a variant of the classical permutation flowshop problem, with the additional constraint that jobs have to be processed by the successive machines without waiting time. To efficiently address this NP-hard combinatorial optimization problem we conduct an analysis of the structure of good quality solutions. This analysis shows that the No-Wait specificity gives them a common structure: they share identical sub-sequences of jobs, we call super-jobs. After a discussion on the way to identify these super-jobs, we propose IG-SJ, an algorithm that exploits super-jobs within the state-of-the-art algorithm for the classical permutation flowshop, the well-known Iterated Greedy (IG) algorithm. An iterative approach of IG-SJ is also proposed. Experiments are conducted on Taillard's instances. The experimental results show that exploiting super-jobs is successful since IG-SJ is able to find 64 new best solutions.
We study strategy synthesis for partially observable Markov decision processes (POMDPs). The particular problem is to determine strategies that provably adhere to (probabilistic) temporal logic constraints. This problem is computationally intractable and theoretically hard. We propose a novel method that combines techniques from machine learning and formal verification. First, we train a recurrent neural network (RNN) to encode POMDP strategies. The RNN accounts for memory-based decisions without the need to expand the full belief space of a POMDP. Secondly, we restrict the RNN-based strategy to represent a finite-memory strategy and implement it on a specific POMDP. For the resulting finite Markov chain, efficient formal verification techniques provide provable guarantees against temporal logic specifications. If the specification is not satisfied, counterexamples supply diagnostic information. We use this information to improve the strategy by iteratively training the RNN. Numerical experiments show that the proposed method elevates the state of the art in POMDP solving by up to three orders of magnitude in terms of solving times and model sizes.
Weighted model integration (WMI) extends Weighted model counting (WMC) to the integration of functions over mixed discrete-continuous domains. It has shown tremendous promise for solving inference problems in graphical models and probabilistic programming. Yet, state-of-the-art tools for WMI are limited in terms of performance and ignore the independence structure that is crucial to improving efficiency. To address this limitation, we propose an efficient model integration algorithm for theories with tree primal graphs. We exploit the sparse graph structure by using search to performing integration. Our algorithm greatly improves the computational efficiency on such problems and exploits context-specific independence between variables. Experimental results show dramatic speedups compared to existing WMI solvers on problems with tree-shaped dependencies.
The most common failure algorithms for control, employs three techniques mode is divergence, where the Q-function approximator collectively known as the'deadly triad' in learns to ascribe unrealistically high values to state-action reinforcement learning: bootstrapping, off-policy pairs, in turn destroying the quality of the greedy control learning, and function approximation. Prior work policy derived from Q (van Hasselt et al., 2018). Divergence has demonstrated that together these can lead to in DQL is often attributed to three components common divergence in Q-learning algorithms, but the conditions to all DQL algorithms, which are collectively considered under which divergence occurs are not the'deadly triad' of reinforcement learning (Sutton, 1988; well-understood. In this note, we give a simple Sutton & Barto, 2018): analysis based on a linear approximation to the Q-value updates, which we believe provides insight - function approximation, in this case the use of deep into divergence under the deadly triad. The neural networks, central point in our analysis is to consider when the leading order approximation to the deep-Q - off-policy learning, the use of data collected on one update is or is not a contraction in the sup norm.
This paper proposes using a linear function approximator, rather than a deep neural network (DNN), to bias a Monte Carlo tree search (MCTS) player for general games. This is unlikely to match the potential raw playing strength of DNNs, but has advantages in terms of generality, interpretability and resources (time and hardware) required for training. Features describing local patterns are used as inputs. The features are formulated in such a way that they are easily interpretable and applicable to a wide range of general games, and might encode simple local strategies. We gradually create new features during the same self-play training process used to learn feature weights. We evaluate the playing strength of an MCTS player biased by learnt features against a standard upper confidence bounds for trees (UCT) player in multiple different board games, and demonstrate significantly improved playing strength in the majority of them after a small number of self-play training games.
In the context of vandalism and terrorist activities, video surveillance forms an integral part of any incident investigation and, thus, there is a critical need for developing an "automated video surveillance system" with the capability of detecting complex events to aid the forensic investigators in solving the criminal cases. As an example, in the aftermath of the London riots in August 2011 police had to scour through more than 200,000 hours of CCTV videos to identify suspects. Around 5,000 offenders were found by trawling through the footage, after a process that took more than five months. With the aim to develop an open and expandable video analysis framework equipped with tools for analysing, recognising, extracting and classifying events in video, which can be used for searching during investigations with unpredictable characteristics, or exploring normative (or abnormal) behaviours, several efforts for standardising event representation from surveillance footage have been made [9, 10, 11, 22, 23, 28, 30, 37]. While various approaches have relied on offering foundational support for the domain ontology extension, to the best of our knowledge, a systematic ontology for standardising the event vocabulary for forensic analysis and an application of it has not been presented in the literature so far. In this paper, we present an OWL 2  ontology for the semantic retrieval of complex events to aid video surveillance-based vandalism detection.