Goto

Collaborating Authors

 IBM Watson Research Center


Designing Fast Absorbing Markov Chains

AAAI Conferences

Markov Chains are a fundamental tool for the analysis of real world phenomena and randomized algorithms. Given a graph with some specified sink nodes and an initial probability distribution,we consider the problem of designing an absorbing Markov Chain that minimizes the time required to reach a sink node, by selecting transition probabilities subject to some natural regularity constraints. By exploiting the Markovian structure, we obtain closed form expressions for the objective function as well as its gradient, which can be thus evaluated efficiently without any simulation of the underlying process and fed to a gradient-based optimization package. For the special case of designing reversible Markov Chains, we show that global optimum can be efficiently computed by exploiting convexity. We demonstrate how our method can be used for the evaluation and design of local search methods tailored for certain domains.


Non-Restarting SAT Solvers with Simple Preprocessing Can Efficiently Simulate Resolution

AAAI Conferences

Propositional satisfiability (SAT) solvers based on conflict directed clause learning (CDCL) implicitly produce resolution refutations of unsatisfiable formulas. The precise class of formulas for which they can produce polynomial size refutations has been the subject of several studies, with special focus on the clause learning aspect of these solvers. The results, however, assume the use of non-standard and non-asserting learning schemes, or rely on polynomially many restarts for simulating individual steps of a resolution refutation, or work with a theoretical model that significantly deviates from certain key aspects of all modern CDCL solvers such as learning only one asserting clause from each conflict and other techniques such as conflict guided backjumping and phase saving. We study non-restarting CDCL solvers that learn only one asserting clause per conflict and show that, with simple preprocessing that depends only on the number of variables of the input formula, such solvers can polynomially simulate resolution. We show, moreover, that this preprocessing allows one to convert any CDCL solver to one that is non-restarting.


MaxSAT by Improved Instance-Specific Algorithm Configuration

AAAI Conferences

We show how both techniques can be combined MaxSAT is the optimization version of the Satisfiability and empirically demonstrate on SAT that our improved (SAT) problem. It can be used effectively to model problems method works notably better than the original method and in several domains, such as scheduling, timetabling, other instance-specific algorithm tuners. We then apply the FPGA routing, design and circuit debugging, software package new technique to MaxSAT. Finally, in extensive experiments installation, bioinformatics, probabilistic reasoning, etc. we show that the developed solvers significantly outperform From the research perspective, MaxSAT is also of particular the current state-of-the-art in every MaxSAT domain.


Resolution and Parallelizability: Barriers to the Efficient Parallelization of SAT Solvers

AAAI Conferences

Recent attempts to create versions of Satisfiability (SAT) solversthat exploit parallel hardware and information sharing have met withlimited success. In fact,the most successful parallel solvers in recent competitions were basedon portfolio approaches with little to no exchange of informationbetween processors. This experience contradicts the apparentparallelizability of exploring a combinatorial search space. Wepresent evidence that this discrepancy can be explained by studyingSAT solvers through a proof complexity lens, as resolution refutationengines. Starting with theobservation that a recently studied measure of resolution proofs,namely depth, provides a (weak) upper bound to the best possiblespeedup achievable by such solvers, we empirically show the existenceof bottlenecks to parallelizability that resolution proofs typicallygenerated by SAT solvers exhibit. Further, we propose a new measureof parallelizability based on the best-case makespan of an offlineresource constrained scheduling problem. This measureexplicitly accounts for a bounded number of parallel processors andappears to empirically correlate with parallel speedups observed inpractice. Our findings suggest that efficient parallelization of SATsolvers is not simply a matter of designing the right clause sharingheuristics; even in the best case, it can be --- and indeed is ---hindered by the structure of the resolution proofs current SAT solverstypically produce.


Many Bills: Visualizing the Anatomy of Congressional Legislation

AAAI Conferences

US Federal Legislation is a common subject of discussion and advocacy on the web. The contents of bills present a significant challenge to both experts and average citizens due to their length and complex legal language. To make bills more accessible to the general public, we present Many Bills: a web-based visualization prototype that reveals the underlying semantics of a bill. We classify the sections of a bill into topics and visualize them using different colors. Further, using information retrieval techniques, we locate sections that don't seem to fit with the overall topic of the bill. To highlight outliers in our `misfit mode', we visualize them in red, which builds a contrast against the remaining gray sections. Both topic and misfit visualizations provide an overview and detail view of bills, enabling users to read individual sections of a bill and compare topic patterns across multiple bills. We obtained initial user feedback and continue collecting label corrections from users through the interface.


Using Part-Of Relations for Discovering Causality

AAAI Conferences

Historically, causal markers, syntactic structures and connectives have been the sole identifying features for automatically extracting causal relations in natural language discourse. However various connectives such as “and,” prepositions such as “as” and other syntactic structures are highly ambiguous in nature, and it is clear that one cannot solely rely on lexico-syntactic markers for detection of causal phenomenon in discourse. This paper introduces the theory of granularity and describes different approaches to identify granularity in natural language. As causality is often granular in nature, we use granularity relations to discover and infer the presence of causal relations in text. We compare this with causal relations identified using just causal markers. We achieve a precision of 0.91 and a recall of 0.79 using granularity for causal relation detection, as compared to a precision of 0.79 and a recall of 0.44 using pure causal markers for causality detection.


Natural Language Aided Visual Query Building for Complex Data Access

AAAI Conferences

Over the past decades, there have been significant efforts on developing robust and easy-to-use query interfaces to databases. So far, the typical query interfaces are GUI-based visual query interfaces. Visual query interfaces however, have limitations especially when they are used for accessing large and complex datasets. Therefore, we are developing a novel query interface where users can use natural language expressions to help author visual queries. Our work enhances the usability of a visual query interface by directly addressing the "knowledge gap" issue in visual query interfaces. We have applied our work in several real-world applications. Our preliminary evaluation demonstrates the effectiveness of our approach.