SS99-01-023.pdf

AAAI Conferences

Rule induction methods are classified into two categories: induction of deterministic rules and induction of probabilistic ones (Michalski 1986; Pawlak 1991; Tsumoto and Tanaka 1996). While deterministic rules are supported by positive examples, probabilistic ones are supported by many positive examples and a few negative examples. That is, both kinds of rules positively select one decision if a case satisfies their conditional parts. However, domain experts use not only positive reasoning but also negative reasoning, since a domain is not always deterministic. For example, when a patient does not have a headache, migraine should not be suspected: negative reasoning plays an important role in cutting the search space of a differential diagnosis (Tsumoto and Tanaka 1996).¹ Therefore, negative rules should be induced from databases in order to induce rules which will be easier for domain experts to

¹ The essential point is that if extracted patterns do not reflect experts' reasoning process, domain experts have difficulties in interpreting them. Without interpretation by domain experts, a discovery procedure would not proceed, which also means that the interaction between human experts and computers is indispensable to computer-assisted discovery.
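
The headache and migraine example can be made concrete with a small sketch. The abstract does not say how negative rules are induced, so the criterion below is an assumption chosen for illustration: a finding that appears in every recorded case of a diagnosis is treated as required, and its absence rules that diagnosis out. The records and the names `CASES`, `negative_rules`, and `ruled_out` are hypothetical.

```python
# Hypothetical toy records: (set of observed findings, diagnosis).
CASES = [
    ({"headache", "nausea"}, "migraine"),
    ({"headache"}, "migraine"),
    ({"headache", "fever"}, "flu"),
    ({"fever"}, "flu"),
]


def negative_rules(cases):
    """Map each diagnosis to the findings present in all of its cases.

    Assumed reading of a negative rule: if any of these findings is
    absent in a new case, the diagnosis should not be suspected.
    """
    shared = {}
    for findings, diagnosis in cases:
        if diagnosis in shared:
            # Keep only findings seen in every case of this diagnosis so far.
            shared[diagnosis] &= findings
        else:
            shared[diagnosis] = set(findings)
    return shared


def ruled_out(rules, findings):
    """Return the diagnoses excluded because a required finding is missing."""
    return sorted(
        diagnosis
        for diagnosis, required in rules.items()
        if not required <= findings
    )


RULES = negative_rules(CASES)
```

Under these assumptions, a patient with fever but no headache has migraine removed from the candidate list, which is the pruning of the differential diagnosis that the text describes.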


Learning and using relational theories

Neural Information Processing Systems

Much of human knowledge is organized into sophisticated systems that are often called intuitive theories. We propose that intuitive theories are mentally represented in a logical language, and that the subjective complexity of a theory is determined by the length of its representation in this language. This complexity measure helps to explain how theories are learned from relational data, and how they support inductive inferences about unobserved relations. We describe two experiments that test our approach, and show that it provides a better account of human learning and reasoning than an approach developed by Goodman [1]. What is a theory, and what makes one theory better than another?


Evidence and Belief

AAAI Conferences

We discuss the representation of knowledge and of belief from the viewpoint of decision theory. While the Bayesian approach enjoys general-purpose applicability and axiomatic foundations, it suffers from several drawbacks. In particular, it does not model the belief formation process, and does not relate beliefs to evidence. We survey alternative approaches, and focus on formal models of case-based prediction and case-based decisions. A formal model of belief and knowledge representation needs to address several questions. The most basic ones are: (i) how do we represent knowledge?


Making the positive case for artificial intelligence - CBR

#artificialintelligence

In part, the critics of AI are driven by the knowledge that 'white collar jobs' are the ones that are now under threat. Business leaders are frequently confronted by notions of job-killing automation and headlines that are variations on the theme that "Robots Will Steal Our Jobs." Elon Musk, CEO of Tesla, Silicon Valley figurehead, and champion of technology-driven innovation, even goes a step further by suggesting AI is a fundamental threat to human civilisation. The robot on the assembly line is now a familiar image. AI in middle management is new.


Flexible Models for Microclustering with Application to Entity Resolution

arXiv.org Machine Learning

Most generative models for clustering implicitly assume that the number of data points in each cluster grows linearly with the total number of data points. Finite mixture models, Dirichlet process mixture models, and Pitman-Yor process mixture models make this assumption, as do all other infinitely exchangeable clustering models. However, for some applications, this assumption is inappropriate. For example, when performing entity resolution, the size of each cluster should be unrelated to the size of the data set, and each cluster should contain a negligible fraction of the total number of data points. These applications require models that yield clusters whose sizes grow sublinearly with the size of the data set. We address this requirement by defining the microclustering property and introducing a new class of models that can exhibit this property. We compare models within this class to two commonly used clustering models using four entity-resolution data sets.
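
The linear-growth assumption can be seen in a small simulation. The sketch below samples cluster sizes from a Chinese restaurant process, the partition prior behind Dirichlet process mixture models. It is not one of the paper's proposed models, and the function names, the concentration parameter `alpha`, and the seed are assumptions made for this example.

```python
import random


def crp_cluster_sizes(n, alpha=1.0, seed=0):
    """Sample cluster sizes for n points from a Chinese restaurant process.

    Point i (0-based) opens a new cluster with probability
    alpha / (i + alpha); otherwise it joins an existing cluster with
    probability proportional to that cluster's current size.
    """
    rng = random.Random(seed)
    sizes = []
    for i in range(n):
        r = rng.random() * (i + alpha)
        if r < alpha:
            sizes.append(1)  # open a new cluster
            continue
        r -= alpha
        for k, size in enumerate(sizes):
            if r < size:
                sizes[k] += 1
                break
            r -= size
        else:
            # Guard against floating-point round-off at the upper edge.
            sizes[-1] += 1
    return sizes


def largest_fraction(n, alpha=1.0, seed=0):
    """Fraction of all points that fall in the largest sampled cluster."""
    return max(crp_cluster_sizes(n, alpha, seed)) / n


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, round(largest_fraction(n), 3))
```

Running the script for increasing `n` prints the share of points held by the largest cluster. Under this prior that share is expected to stay non-negligible rather than shrink toward zero, which is the behaviour a model with the microclustering property is meant to avoid.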