Lenat, Douglas (Cycorp) | Witbrock, Michael (Cycorp) | Baxter, David (Cycorp) | Blackstone, Eugene (Cleveland Clinic Foundation) | Deaton, Chris (Cycorp) | Schneider, Dave (Cycorp) | Scott, Jerry (Research Intelligence) | Shepard, Blake (Cycorp)
By extending Cyc’s ontology and KB approximately 2%, Cycorp and Cleveland Clinic Foundation (CCF) have built a system to answer clinical researchers’ ad hoc queries. The query may be long and complex, hence only partially understood at first, parsed into a set of CycL (higher-order logic) fragments with open variables. But, surprisingly often, after applying various constraints (medical domain knowledge, common sense, discourse pragmatics, syntax), there is only one single way to fit those fragments together, one semantically meaningful formal query P. The system, SRA (for Semantic Research Assistant), dispatches a series of database calls and then combines, logically and arithmetically, their results into answers to P. Seeing the first few answers stream back, the user may realize that they need to abort, modify, and re-ask their query. Even before they push ASK, just knowing approximately how many answers would be returned can spark such editing. Besides real-time ad hoc query-answering, queries can be bundled and persist over time. One bundle of 275 queries is rerun quarterly by CCF to produce the procedures and outcomes data it needs to report to STS (Society of Thoracic Surgeons, an external hospital accreditation and ranking body); another bundle covers ACC (American College of Cardiology) reporting. Until full articulation/answering of precise, analytical queries becomes as straight-forward and ubiquitous as text search, even partial understanding of a query empowers semantic search over semi-structured data (ontology-tagged text), avoiding many of the false positives and false negatives that standard text searching suffers from.