Oceania
Learning With Single-Teacher Multi-Student
You, Shan (Peking University) | Xu, Chang (University of Sydney) | Xu, Chao (Peking University) | Tao, Dacheng (University of Sydney)
In this paper we study a new learning problem defined as "Single-Teacher Multi-Student" (STMS) problem, which investigates how to learn a series of student (simple and specific) models from a single teacher (complex and universal) model. Taking the multiclass and binary classification for example, we focus on learning multiple binary classifiers from a single multiclass classifier, where each of binary classifier is responsible for a certain class. This actually derives from some realistic problems, such as identifying the suspect based on a comprehensive face recognition system. By treating the already-trained multiclass classifier as the teacher, and multiple binary classifiers as the students, we propose a gated support vector machine (gSVM) as a solution. A series of gSVMs are learned with the help of single teacher multiclass classifier. The teacher's help is two-fold; first, the teacher's score provides the gated values for students' decision; second, the teacher can guide the students to accommodate training examples with different difficulty degrees. Extensive experiments on real datasets validate its effectiveness.
Hybrid Attentive Answer Selection in CQA With Deep Users Modelling
Wen, Jiahui (The University of Queensland) | Ma, Jingwei (The University of Queensland) | Feng, Yiliu (National University of Defence Technology) | Zhong, Mingyang (The University of Queensland)
In this paper, we propose solutions to advance answer selection in Community Question Answering (CQA). Unlike previous works, we propose a hybrid attention mechanism to model question-answer pairs. Specifically, for each word, we calculate the intra-sentence attention indicating its local importance and the inter-sentence attention implying its importance to the counterpart sentence. The inter-sentence attention is based on the interactions between question-answer pairs, and the combination of these two attention mechanisms enables us to align the most informative parts in question-answer pairs for sentence matching. Additionally, we exploit user information for answer selection due to the fact that users are more likely to provide correct answers in their areas of expertise. We model users from their written answers to alleviate data sparsity problem, and then learn user representations according to the informative parts in sentences that are useful for question-answer matching task. This mean of modelling users can bridge the semantic gap between different users, as similar users may have the same way of wording their answers. The representations of users, questions and answers are learnt in an end-to-end neural network in a mean that best explains the interrelation between question-answer pairs. We validate the proposed model on a public dataset, and demonstrate its advantages over the baselines with thorough experiments.
Nonlocal Patch Based t-SVD for Image Inpainting: Algorithm and Error Analysis
Song, Liangchen (Wuhan University) | Du, Bo (Wuhan University) | Zhang, Lefei (Wuhan University) | Zhang, Liangpei (Wuhan University) | Wu, Jia (Wuhan University) | Li, Xuelong (Wuhan University)
In this paper, we propose a novel image inpainting framework consisting of an interpolation step and a low-rank tensor completion step. More specifically, we first initial the image with triangulation-based linear interpolation, and then we find similar patches for each missing-entry centered patch. Treating a group of patch matrices as a tensor, we employ the recently proposed effective t-SVD tensor completion algorithm with a warm start strategy to inpaint it. We observe that the interpolation step is such a rough initialization that the similar patch we found may not exactly match with the reference, so we name the problem as Patch Mismatch and analyse the error caused by it thoroughly. Our theoretical analysis shows that the error caused by Patch Mismatch can be decomposed into two components, one of which can be bounded by a reasonable assumption named local patch similarity, and another part is lower than that using matrix. Experiments on real images verify our method's superiority to the state-of-the-art inpainting methods.
On the Satisfiability Problem of Patterns in SPARQL 1.1
Zhang, Xiaowang (Tianjin University) | Bussche, Jan Van den (Hasselt University) | Wang, Kewen (Griffith University) | Wang, Zhe (Griffith University)
The pattern satisfiability is a fundamental problem for SPARQL. This paper provides a complete analysis of decidability/undecidability of satisfiability problems for SPARQL 1.1 patterns. A surprising result is the undecidability of satisfiability for SPARQL 1.1 patterns when only AND and MINUS are expressible. Also, it is shown that any fragment of SPARQL 1.1 without expressing both AND and MINUS is decidable. These results provide a guideline for future SPARQL query language design and implementation.
Measuring Conditional Independence by Independent Residuals: Theoretical Results and Application in Causal Discovery
Zhang, Hao (Fudan University) | Zhou, Shuigeng (Fudan University) | Guan, Jihong (Tongji University)
We investigate the relationship between conditional independence (CI) x โฅย y | Z and the independence of two residuals x โ E ( x | Z ) โฅย โE ( y | Z ), where x and y are two random variables, and Z is a set of random variables. We show that if x,ย y and Z are generated by following linear structural equation model and all external influences follow Gaussian distributions, then x โฅย y | Z if and only if x โ E ( x | Z )ย โฅ y โ E ( y | Z ). That is, the test of x โฅย y | Z can be relaxed to a simpler unconditional independence test of x โ E ( x | Z ) โฅย y โย E ( y | Z ). Furthermore, if all these external influences follow non-Gaussian distributions and the model satisfies structural faithfulness condition, then we have x โฅย y | Z โ x โย E ( x | Z ) โฅย y โย E ( y | Z ). We apply the results above to the causal discovery problem, where the causal directions are generally determined by a set of V -structures and their consistent propagations, so CI test-based methods can return a set of Markov equivalence classes. We show that in linear non-Gaussian context, x โย E ( x | Z ) โฅย y โ E ( y | Z ) โ x โ E ( x | Z ) โฅย z or y โ E ( y | Z โฅย z (โ z โย Z ) if Z is a minimal d -separator, which implies z causes x (or y ) if z directly connects to x (or y ). Therefore, we conclude that CIs have useful information for distinguishing Markov equivalence classes. In summary, compared with the existing discretization-based and kernel-based CI testing methods, the proposed method provides a simpler way to measure CI, which needs only one unconditional independence test and two regression operations. When being applied to causal discovery, it can find more causal relationships, which is experimentally validated.
Forgetting and Unfolding for Existential Rules
Wang, Zhe (Griffith University) | Wang, Kewen (Griffith University) | Zhang, Xiaowang (Tianjin University)
Existential rules, a family of expressive ontology languages, inherit desired expressive and reasoning properties from both description logics and logic programming. On the other hand, forgetting is a well studied operation for ontology reuse, obfuscation and analysis. Yet it is challenging to establish a theory of forgetting for existential rules. In this paper, we lay the foundation for a theory of forgetting for existential rules by developing a novel notion of unfolding. In particular, we introduce a definition of forgetting for existential rules in terms of query answering and provide a characterisation of forgetting by the unfolding. A result of forgetting may not be expressible in existential rules, and we then capture the expressibility of forgetting by a variant of boundedness. While the expressibility is undecidable in general, we identify a decidable fragment. Finally, we provide an algorithm for forgetting in this fragment.
A Recursive Scenario Decomposition Algorithm for Combinatorial Multistage Stochastic Optimisation Problems
Hemmi, David (Monash University) | Tack, Guido (Data61) | Wallace, Mark (Monash University)
Stochastic programming is concerned with decision making under uncertainty, seeking an optimal policy with respect to a set of possible future scenarios. This paper looks at multistage decision problems where the uncertainty is revealed over time. First, decisions are made with respect to all possible future scenarios. Secondly, after observing the random variables, a set of scenario specific decisions is taken. Our goal is to develop algorithms that can be used as a back-end solver for high-level modeling languages. In this paper we propose a scenario decomposition method to solve multistage stochastic combinatorial decision problems recursively. Our approach is applicable to general problem structures, utilizes standard solving technology and is highly parallelizable. We provide experimental results to show how it efficiently solves benchmarks with hundreds of scenarios.
Rank Maximal Equal Contribution: A Probabilistic Social Choice Function
Aziz, Haris (Data61, CSIRO and University of New South Wales) | Luo, Pang (Data61, CSIRO and University of New South Wales) | Rizkallah, Christine (University of Pennsylvania)
When aggregating preferences of agents via voting, two desirable goals are to incentivize agents to participate in the voting process and then identify outcomes that are Pareto efficient. We consider participation as formalized by Brandl, Brandt, and Hofbauer (2015) based on the stochastic dominance (SD) relation. We formulate a new rule called RMEC (Rank Maximal Equal Contribution) that is polynomial-time computable, ex post efficient and satisfies the strongest notion of participation. It also satisfies many other desirable fairness properties. The rule suggests a general approach to achieving very strong participation, ex post efficiency and fairness.
Cellular Network Traffic Scheduling With Deep Reinforcement Learning
Chinchali, Sandeep (Stanford University) | Hu, Pan (Stanford University) | Chu, Tianshu (Uhana, Inc. ) | Sharma, Manu (Uhana, Inc.) | Bansal, Manu (Uhana, Inc.) | Misra, Rakesh (Uhana, Inc.) | Pavone, Marco (Stanford University) | Katti, Sachin (Stanford University)
Modern mobile networks are facing unprecedented growth in demand due to a new class of traffic from Internet of Things (IoT) devices such as smart wearables and autonomous cars. Future networks must schedule delay-tolerant software updates, data backup, and other transfers from IoT devices while maintaining strict service guarantees for conventional real-time applications such as voice-calling and video. This problem is extremely challenging because conventional traffic is highly dynamic across space and time, so its performance is significantly impacted if all IoT traffic is scheduled immediately when it originates. In this paper, we present a reinforcement learning (RL) based scheduler that can dynamically adapt to traffic variation, and to various reward functions set by network operators, to optimally schedule IoT traffic. Using 4 weeks of real network data from downtown Melbourne, Australia spanning diverse traffic patterns, we demonstrate that our RL scheduler can enable mobile networks to carry 14.7% more data with minimal impact on existing traffic, and outpeforms heuristic schedulers by more than 2x. Our work is a valuable step towards designing autonomous, "self-driving" networks that learn to manage themselves from past data.
Complex Sequential Question Answering: Towards Learning to Converse Over Linked Question Answer Pairs with a Knowledge Graph
Saha, Amrita (IBM Research AI) | Pahuja, Vardaan (University of Montreal) | Khapra, Mitesh M. (IIT Madras) | Sankaranarayanan, Karthik (IBM Research AI) | Chandar, Sarath (University of Montreal)
While conversing with chatbots, humans typically tend to ask many questions, a significant portion of which can be answered by referring to large-scale knowledge graphs (KG). While Question Answering (QA) and dialog systems have been studied independently, there is a need to study them closely to evaluate such real-world scenarios faced by bots involving both these tasks. Towards this end, we introduce the task of Complex Sequential QA which combines the two tasks of (i) answering factual questions through complex inferencing over a realistic-sized KG of millions of entities, and (ii) learning to converse through a series of coherently linked QA pairs. Through a labor intensive semi-automatic process, involving in-house and crowdsourced workers, we created a dataset containing around 200K dialogs with a total of 1.6M turns. Further, unlike existing large scale QA datasets which contain simple questions that can be answered from a single tuple, the questions in our dialogs require a larger subgraph of the KG. Specifically, our dataset has questions which require logical, quantitative, and comparative reasoning as well as their combinations. This calls for models which can: (i) parse complex natural language questions, (ii) use conversation context to resolve coreferences and ellipsis in utterances, (iii) ask for clarifications for ambiguous queries, and finally (iv) retrieve relevant subgraphs of the KG to answer such questions. However, our experiments with a combination of state of the art dialog and QA models show that they clearly do not achieve the above objectives and are inadequate for dealing with such complex real world settings. We believe that this new dataset coupled with the limitations of existing models as reported in this paper should encourage further research in Complex Sequential QA.