Thinking in SQL

#artificialintelligence

Thinking in SQL will help you understand what SQL queries do, how to create and change them, and how to create reports from the data you obtain. If you want to write better SQL queries, you have to change the way you think about the code. There are many SQL courses to choose from, but how many really teach you the nitty-gritty of how SQL works under the hood? Learn how to write SQL properly by "thinking like a relational database". No software download is required: you can even do the course on an iPad or your work computer. Besides covering a broad range of topics, my approach to teaching SQL is radically different from many other courses. Instead of taking a cookbook approach, this course teaches you how to "think like a database," meaning you learn the logical order in which SQL statements are processed.
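The logical processing order mentioned in the description can be illustrated with a small sketch. This is not taken from the course; the `orders` table and its data are hypothetical, and Python's built-in `sqlite3` module is used only so the query can run without installing anything. The numbered SQL comments show the logical order in which the clauses are applied, which differs from the order in which they are written.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 5), ("ann", 20), ("ann", 30), ("bob", 50), ("bob", 8), ("cy", 40)],
)

# The clauses are written SELECT-first, but logically processed in the
# numbered order shown in the SQL comments.
query = """
SELECT customer, COUNT(*) AS n, SUM(amount) AS total   -- 5. SELECT
FROM orders                                            -- 1. FROM
WHERE amount > 10                                      -- 2. WHERE  (filters rows)
GROUP BY customer                                      -- 3. GROUP BY
HAVING COUNT(*) >= 2                                   -- 4. HAVING (filters groups)
ORDER BY total DESC                                    -- 6. ORDER BY
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because `WHERE` removes rows before grouping, the small orders of "ann" and "bob" never reach `COUNT(*)`; "bob" is then left with a single row and is dropped by `HAVING`, even though the table holds two orders for him.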


Scalable Computation of High-Order Optimization Queries

Communications of the ACM

Constrained optimization problems are at the heart of significant applications in a broad range of domains, including finance, transportation, manufacturing, and healthcare. Modeling and solving these problems has relied on application-specific solutions, which are often complex, error-prone, and do not generalize. Our goal is to create a domain-independent, declarative approach, supported and powered by the system where the data relevant to these problems typically resides: the database. We present a complete system that supports package queries, a new query model that extends traditional database queries to handle complex constraints and preferences over answer sets, allowing the declarative specification and efficient evaluation of a significant class of constrained optimization problems--integer linear programs (ILP)--within a database. Traditional database queries follow a simple model: they define constraints, in the form of selection predicates, that each tuple in the result must satisfy.
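To make the contrast with per-tuple predicates concrete, here is a minimal sketch of what a package query asks for: a set of tuples that satisfies constraints on the set as a whole and optimizes an objective over it. The `meals` data, the `best_package` helper, and the specific bounds are all hypothetical, and the exhaustive search below is only a stand-in for illustration; it is not the evaluation method of the system described in the abstract, which targets integer linear programs.

```python
from itertools import combinations

# Hypothetical relation: each dict plays the role of a tuple.
meals = [
    {"name": "oatmeal", "kcal": 300, "protein": 10},
    {"name": "salad", "kcal": 250, "protein": 8},
    {"name": "chicken", "kcal": 500, "protein": 45},
    {"name": "pasta", "kcal": 700, "protein": 20},
    {"name": "tofu", "kcal": 400, "protein": 30},
]

def best_package(rows, size, kcal_min, kcal_max):
    """Return the size-`size` set of rows whose total kcal lies in
    [kcal_min, kcal_max] and whose total protein is largest.
    Brute force over all combinations: fine for a toy input only."""
    best, best_protein = None, -1
    for combo in combinations(rows, size):
        kcal = sum(r["kcal"] for r in combo)
        if kcal_min <= kcal <= kcal_max:          # constraint on the whole set
            protein = sum(r["protein"] for r in combo)
            if protein > best_protein:            # objective over the whole set
                best, best_protein = combo, protein
    return best

package = best_package(meals, size=3, kcal_min=1000, kcal_max=1300)
names = sorted(r["name"] for r in package)
print(names)
```

Neither the calorie bound nor the protein objective can be checked one tuple at a time, which is what separates this kind of query from a selection predicate. The number of candidate sets grows combinatorially with the table size, so exhaustive search of this kind does not scale.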


Harnessing Cyc to Answer Clinical Researchers' Ad Hoc Queries

AI Magazine

By extending Cyc's ontology and KB approximately 2%, Cycorp and Cleveland Clinic Foundation (CCF) have built a system to answer clinical researchers' ad hoc queries. But, surprisingly often, after applying various constraints (medical domain knowledge, common sense, discourse pragmatics, syntax), there is only a single way to fit those fragments together, one semantically meaningful formal query P. The system, SRA (for Semantic Research Assistant), dispatches a series of database calls and then combines, logically and arithmetically, their results into answers to P. Seeing the first few answers stream back, the user may realize that they need to abort, modify, and re-ask their query. Besides real-time ad hoc query-answering, queries can be bundled and persist over time. Until full articulation/answering of precise, analytical queries becomes as straightforward and ubiquitous as text search, even partial understanding of a query empowers semantic search over semi-structured data (ontology-tagged text), avoiding many of the false positives and false negatives that standard text searching suffers from.


Crowdsourced Clustering: Querying Edges vs Triangles

Neural Information Processing Systems

We consider the task of clustering items using answers from non-expert crowd workers. In such cases, the workers are often not able to label the items directly; however, it is reasonable to assume that they can compare items and judge whether they are similar or not. An important question is what queries to make, and we compare two types: random edge queries, where a pair of items is revealed, and random triangle queries, where a triple of items is revealed. Since it is far too expensive to query all possible edges and/or triangles, we need to work with partial observations subject to a fixed query budget constraint. When a generative model for the data is available (and we consider a few of these), we determine the cost of a query by its entropy; when such models do not exist, we use the average response time per query of the workers as a surrogate for the cost.
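As a rough illustration of an entropy-based query cost, the sketch below computes the entropy of the answer to an edge query and to a triangle query under one simple assumed model: each item belongs independently and uniformly to one of `k` clusters, and workers answer without error. This model and the function names are assumptions made for the example; they are not necessarily among the generative models the paper considers.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def edge_entropy(k):
    """Edge query: two outcomes, 'same cluster' or 'different clusters'."""
    p_same = 1 / k
    return entropy([p_same, 1 - p_same])

def triangle_entropy(k):
    """Triangle query: five outcomes. All three items together, exactly one
    of the three pairs together (three ways), or all three apart."""
    all_same = 1 / k**2
    one_pair = (k - 1) / k**2
    all_diff = (k - 1) * (k - 2) / k**2
    return entropy([all_same, one_pair, one_pair, one_pair, all_diff])

for k in (2, 3, 5):
    print(k, round(edge_entropy(k), 3), round(triangle_entropy(k), 3))
```

Under this assumed model a triangle query has higher entropy than an edge query, so it costs more, but it also reveals three pairwise relations at once. With a fixed budget, the number of affordable queries of each type would be roughly the budget divided by the per-query cost.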