Decision trees are widely used in machine learning and knowledge acquisition systems. However, there is no optimal or even unanimously accepted strategy for obtaining "good" trees, and most generated trees suffer from improprieties, i.e., inadequacies in representing knowledge. The final goal of the research reported here is to formulate a theory for the decision tree domain, that is, a set of heuristics (on which a majority of experts will agree) that describe a good decision tree, together with a set of heuristics specifying how to obtain optimal trees. To achieve this goal, we have designed a learning system with a recursive architecture. It monitors an interactive knowledge acquisition system based on decision trees and driven by explanatory reasoning, and incrementally acquires, from the experts using that system, the knowledge needed to build the decision tree domain theory. This theory is itself represented as a set of decision trees, and may be domain dependent. Our system acquires knowledge to define the notion of good and bad decision trees and to measure their quality, as well as the knowledge needed to guide domain experts in constructing good decision trees. The partial theory acquired at any moment is also used by the basic knowledge acquisition system in its tree generation process, thus continually improving its performance.
We first investigated learning with an emphasis on the acquisition of linguistic knowledge, and then considered a more general theory of "learnability". We found that the critical issues arising in language-learning problems are issues that must also be confronted in many other learning problems. These include devising techniques for representing and conveying knowledge; selecting illustrative examples and processing observed data; and establishing criteria for acceptance of a learned result. Each of these issues may be affected by the domain in which the body of knowledge lies, and by the environment in which the "learner" resides. In fact, through our language-learning research we have concluded that such grounding can define a learning technique, as well as the resultant model with which specified knowledge may be acquired. We now describe the techniques we developed, and the results we obtained, in the specific area of formal language learning. We then discuss the potential applicability of our techniques to all processes of learning.
Capturing domain knowledge can be a time-consuming process that typically requires a Subject Matter Expert and a modeling expert to collaborate on encoding the knowledge. In a number of domains and applications, the situation is further exacerbated because the Subject Matter Expert may find it difficult to articulate the domain knowledge as a procedure or as rules, but may find it easier to classify instance data. To facilitate this type of knowledge elicitation from Subject Matter Experts, we have developed a system that automatically generates formal, executable rules from labeled instance data. We do this by using Inductive Logic Programming (ILP) techniques to generate Horn-clause-based rules that separate positive from negative instance data. We illustrate our approach on a Design For Manufacturability (DFM) platform, where the goal is to design products that are easy to manufacture by providing early manufacturability feedback. Specifically, we show how our approach can be used to generate feature recognition rules from positive and negative instance data supplied by Subject Matter Experts. Our platform is interactive and iterative, and provides visual feedback. The generated feature identification rules can be inspected, manually refined, and vetted.
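To make the idea of separating positive from negative instances concrete, the following is a minimal sketch, not the system described in the abstract. It assumes a propositional simplification of ILP: each instance is a flat attribute-value record, and a rule is a conjunction of literals read as a Horn clause body. The learner is a generic greedy separate-and-conquer loop; the names `learn_rules`, `learn_rule`, `format_rule`, the predicate `through_hole`, and all attribute names and data are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Instance = Dict[str, str]
Literal = Tuple[str, str]   # (attribute, value), read as attribute(X, value)
Rule = FrozenSet[Literal]   # conjunction of literals forming a clause body


def covers(rule: Rule, inst: Instance) -> bool:
    """A rule covers an instance if every literal in its body holds."""
    return all(inst.get(attr) == val for attr, val in rule)


def learn_rule(pos: List[Instance], neg: List[Instance]) -> Rule:
    """Greedily add literals until no negative instance is covered."""
    body: set = set()
    pos_cov, neg_cov = list(pos), list(neg)
    while neg_cov:
        # Candidate literals are drawn from the positives still covered.
        candidates = {lit for p in pos_cov for lit in p.items()} - body
        if not candidates:
            raise ValueError("positives and negatives cannot be separated")

        def score(lit: Literal) -> Tuple[int, int]:
            n = sum(1 for i in neg_cov if i.get(lit[0]) == lit[1])
            p = sum(1 for i in pos_cov if i.get(lit[0]) == lit[1])
            return (n, -p)  # fewest negatives first, then most positives

        best = min(sorted(candidates), key=score)
        remaining = [i for i in neg_cov if i.get(best[0]) == best[1]]
        if len(remaining) == len(neg_cov):
            raise ValueError("positives and negatives cannot be separated")
        body.add(best)
        neg_cov = remaining
        pos_cov = [i for i in pos_cov if i.get(best[0]) == best[1]]
    return frozenset(body)


def learn_rules(pos: List[Instance], neg: List[Instance]) -> List[Rule]:
    """Separate-and-conquer: learn rules until every positive is covered."""
    rules: List[Rule] = []
    uncovered = list(pos)
    while uncovered:
        rule = learn_rule(uncovered, neg)
        rules.append(rule)
        uncovered = [i for i in uncovered if not covers(rule, i)]
    return rules


def format_rule(head: str, rule: Rule) -> str:
    """Render a rule in Prolog-like Horn clause syntax."""
    body = ", ".join(f"{a}(X, {v})" for a, v in sorted(rule))
    return f"{head}(X) :- {body}."


# Hypothetical labeled instances for a "through hole" feature.
positives = [
    {"shape": "cylindrical", "concave": "yes", "through": "yes", "size": "small"},
    {"shape": "cylindrical", "concave": "yes", "through": "yes", "size": "large"},
]
negatives = [
    {"shape": "cylindrical", "concave": "no", "through": "no", "size": "small"},
    {"shape": "prismatic", "concave": "yes", "through": "yes", "size": "large"},
    {"shape": "cylindrical", "concave": "yes", "through": "no", "size": "small"},
]

for learned in learn_rules(positives, negatives):
    print(format_rule("through_hole", learned))
```

Because the learned clauses are plain text, an expert can inspect and refine them by hand, which mirrors the inspect-and-vet loop the abstract describes. A real ILP system would additionally handle first-order relations between parts and background knowledge, which this sketch omits.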
The capabilities for representing and reasoning about three-dimensional (3-D) objects are essential for knowledge-based 3-D photointerpretation systems that combine domain knowledge with image processing, as demonstrated by 3-D Mosaic and ACRONYM. A practical framework for geometric representation and reasoning must incorporate projections between a two-dimensional (2-D) image and a 3-D scene, shape and surface properties of objects, and geometric and topological relationships between objects. In addition, it should allow easy modification and extension of the system's domain knowledge, and be flexible enough to organize its reasoning efficiently to take advantage of the currently available knowledge. This system uses frames to represent objects such as buildings and walls, geometric features such as lines and planes, and geometric relationships such as parallel lines.
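As an illustration of frame-based geometric representation, here is a minimal sketch under stated assumptions. It is not the system's actual design: the `Frame` class, its `is_a` inheritance link, the slot names (`kind`, `start`, `end`, `edges`), and the `are_parallel` test are all hypothetical. Frames hold named slots, inherit missing slots from a parent frame, and a geometric relationship between two line frames is checked with a cross product.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Frame:
    """A named frame with slots and an optional parent to inherit from."""
    name: str
    is_a: Optional["Frame"] = None
    slots: Dict[str, Any] = field(default_factory=dict)

    def get(self, slot: str) -> Any:
        """Look up a slot locally, then up the is_a chain."""
        frame: Optional[Frame] = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.is_a
        raise KeyError(slot)


def direction(line: Frame) -> Vec3:
    """Direction vector of a line frame with 'start' and 'end' slots."""
    (x0, y0, z0), (x1, y1, z1) = line.get("start"), line.get("end")
    return (x1 - x0, y1 - y0, z1 - z0)


def are_parallel(a: Frame, b: Frame, tol: float = 1e-9) -> bool:
    """Two lines are parallel when the cross product of their directions
    vanishes. Zero-length lines are not handled specially in this sketch."""
    ax, ay, az = direction(a)
    bx, by, bz = direction(b)
    cross = (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)
    return all(abs(c) <= tol for c in cross)


# Generic frame for geometric features, plus three specific edges of a wall.
line = Frame("line", slots={"kind": "geometric-feature"})
top = Frame("top-edge", is_a=line,
            slots={"start": (0.0, 0.0, 3.0), "end": (5.0, 0.0, 3.0)})
bottom = Frame("bottom-edge", is_a=line,
               slots={"start": (0.0, 0.0, 0.0), "end": (5.0, 0.0, 0.0)})
side = Frame("side-edge", is_a=line,
             slots={"start": (0.0, 0.0, 0.0), "end": (0.0, 0.0, 3.0)})

# An object frame that refers to its geometric features by slot.
wall = Frame("wall", slots={"edges": [top, bottom, side]})

print(are_parallel(top, bottom))
print(are_parallel(top, side))
```

Keeping relationships such as parallelism as functions over frames, rather than hard-coding them into object definitions, is one way to get the easy modification and extension of domain knowledge that the abstract calls for.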