Searching for Common Sense: Populating Cyc from the Web

AAAI Conferences

The Cyc project is predicated on the idea that effective machine learning depends on having a core of knowledge that provides a context for novel learned information - what is known informally as "common sense." Over the last twenty years, a sufficient core of common sense knowledge has been entered into Cyc to allow it to begin effectively and flexibly supporting its most important task: increasing its own store of world knowledge. In this paper, we present initial work on a method of using a combination of Cyc and the World Wide Web, accessed via Google, to assist in entering knowledge into Cyc. The long-term goal is automating the process of building a consistent, formalized representation of the world in the Cyc knowledge base via machine learning. We present preliminary results of this work and describe how we expect the knowledge acquisition process to become more accurate, faster, and more automated in the future.


Automated Population of Cyc: Extracting Information about Namedentities from the Web†

AAAI Conferences

Populating the Cyc Knowledge Base (KB) has been a manual process until very recently. However, there is currently enough knowledge in Cyc for it to be feasible to attempt to acquire additional knowledge autonomously. This paper describes a system that can collect and validate formally represented, fully-integrated knowledge from the Web or any other electronically available text corpus, about various entities of interest (e.g.


Knowledge Formation and Dialogue Using the KRAKEN Toolset

AAAI Conferences

The KRAKEN toolset is a comprehensive interface for knowledge acquisition that operates in conjunction with the Cyc knowledge base. The KRAKEN system is designed to allow subject-matter experts to make meaningful additions to an existing knowledge base, without the benefit of training in the areas of artificial intelligence, ontology development, or logical representation.


Knowledge Formation and Dialogue Using the KRAKEN Toolset

AAAI Conferences

The KRAKEN toolset is a comprehensive interface for knowledge acquisition that operates in conjunction with the Cyc knowledge base. The KRAKEN system is designed to allow subject-matter experts to make meaningful additions to an existing knowledge base, without the benefit of training in the areas of artificial intelligence, ontology development, or logical representation.


Harnessing Cyc to Answer Clinical Researchers ' Ad Hoc Queries

AI Magazine

By extending Cyc's ontology and knowledge base approximately 2 percent, Cycorp and Cleveland Clinic Foundation (CCF) have built a system to answer clinical researchers' ad hoc queries. The query may be long and complex, hence it is only partially understood at first, parsed into a set of CycL (higher-order logic) fragments with open variables. But, surprisingly often, after applying various constraints (medical domain knowledge, common sense, discourse pragmatics, syntax), there is only one single way to fit those fragments together, one semantically meaningful formal query P. The Semantic Research Assistant (SRA) system dispatches a series of database calls and then combines, logically and arithmetically, their results into answers to P. Seeing the first few answers stream back, users may realize that they need to abort, modify, and re-ask their query. Even before they push ASK, just knowing approximately how many answers would be returned can spark such editing. Besides real-time ad hoc query answering, queries can be bundled and persist over time.