Learning From What You Don't Observe

arXiv.org Artificial Intelligence

The process of diagnosis involves learning about the state of a system from various observations of symptoms or findings about the system. Sophisticated Bayesian (and other) algorithms have been developed to revise and maintain beliefs about the system as observations are made. Nonetheless, diagnostic models have tended to ignore some common sense reasoning exploited by human diagnosticians; In particular, one can learn from which observations have not been made, in the spirit of conversational implicature. There are two concepts that we describe to extract information from the observations not made. First, some symptoms, if present, are more likely to be reported before others. Second, most human diagnosticians and expert systems are economical in their data-gathering, searching first where they are more likely to find symptoms present. Thus, there is a desirable bias toward reporting symptoms that are present. We develop a simple model for these concepts that can significantly improve diagnostic inference.

Evidence Optimization Techniques for Estimating Stimulus-Response Functions

Neural Information Processing Systems

An essential step in understanding the function of sensory nervous systems isto characterize as accurately as possible the stimulus-response function (SRF) of the neurons that relay and process sensory information. Oneincreasingly common experimental approach is to present a rapidly varying complex stimulus to the animal while recording the responses ofone or more neurons, and then to directly estimate a functional transformation of the input that accounts for the neuronal firing. The estimation techniques usually employed, such as Wiener filtering or other correlation-based estimation of the Wiener or Volterra kernels, are equivalent to maximum likelihood estimation in a Gaussian-output-noise regression model. We explore the use of Bayesian evidence-optimization techniques to condition these estimates. We show that by learning hyperparameters thatcontrol the smoothness and sparsity of the transfer function it is possible to improve dramatically the quality of SRF estimates, as measured by their success in predicting responses to novel input.

Constraining Influence Diagram Structure by Generative Planning: An Application to the Optimization of Oil Spill Response

arXiv.org Artificial Intelligence

This paper works through the optimization of a real world planning problem, with a combination of a generative planning tool and an influence diagram solver. The problem is taken from an existing application in the domain of oil spill emergency response. The planning agent manages constraints that order sets of feasible equipment employment actions. This is mapped at an intermediate level of abstraction onto an influence diagram. In addition, the planner can apply a surveillance operator that determines observability of the state---the unknown trajectory of the oil. The uncertain world state and the objective function properties are part of the influence diagram structure, but not represented in the planning agent domain. By exploiting this structure under the constraints generated by the planning agent, the influence diagram solution complexity simplifies considerably, and an optimum solution to the employment problem based on the objective function is found. Finding this optimum is equivalent to the simultaneous evaluation of a range of plans. This result is an example of bounded optimality, within the limitations of this hybrid generative planner and influence diagram architecture.

Lp : A Logic for Statistical Information

arXiv.org Artificial Intelligence

This extended abstract presents a logic, called Lp, that is capable of representing and reasoning with a wide variety of both qualitative and quantitative statistical information. The advantage of this logical formalism is that it offers a declarative representation of statistical knowledge; knowledge represented in this manner can be used for a variety of reasoning tasks. The logic differs from previous work in probability logics in that it uses a probability distribution over the domain of discourse, whereas most previous work (e.g., Nilsson [2], Scott et al. [3], Gaifinan [4], Fagin et al. [5]) has investigated the attachment of probabilities to the sentences of the logic (also, see Halpern [6] and Bacchus [7] for further discussion of the differences). The logic Lp possesses some further important features. First, Lp is a superset of first order logic, hence it can represent ordinary logical assertions. This means that Lp provides a mechanism for integrating statistical information and reasoning about uncertainty into systems based solely on logic. Second, Lp possesses transparent semantics, based on sets and probabilities of those sets. Hence, knowledge represented in Lp can be understood in terms of the simple primative concepts of sets and probabilities. And finally, the there is a sound proof theory that has wide coverage (the proof theory is complete for certain classes of models). The proof theory captures a sufficient range of valid inferences to subsume most previous probabilistic uncertainty reasoning systems. For example, the linear constraints like those generated by Nilsson's probabilistic entailment [2] can be generated by the proof theory, and the Bayesian inference underlying belief nets [8] can be performed. In addition, the proof theory integrates quantitative and qualitative reasoning as well as statistical and logical reasoning. In the next section we briefly examine previous work in probability logics, comparing it to Lp. Then we present some of the varieties of statistical information that Lp is capable of expressing. After this we present, briefly, the syntax, semantics, and proof theory of the logic. We conclude with a few examples of knowledge representation and reasoning in Lp, pointing out the advantages of the declarative representation offered by Lp. We close with a brief discussion of probabilities as degrees of belief, indicating how such probabilities can be generated from statistical knowledge encoded in Lp. The reader who is interested in a more complete treatment should consult Bacchus [7].

A Measure-Free Approach to Conditioning

arXiv.org Artificial Intelligence

In an earlier paper, a new theory of measurefree "conditional" objects was presented. In this paper, emphasis is placed upon the motivation of the theory. The central part of this motivation is established through an example involving a knowledge-based system. In order to evaluate combination of evidence for this system, using observed data, auxiliary at tribute and diagnosis variables, and inference rules connecting them, one must first choose an appropriate algebraic logic description pair (ALDP): a formal language or syntax followed by a compatible logic or semantic evaluation (or model). Three common choices- for this highly non-unique choice - are briefly discussed, the logics being Classical Logic, Fuzzy Logic, and Probability Logic. In all three,the key operator representing implication for the inference rules is interpreted as the often-used disjunction of a negation (b => a) = (b'v a), for any events a,b. However, another reasonable interpretation of the implication operator is through the familiar form of probabilistic conditioning. But, it can be shown - quite surprisingly - that the ALDP corresponding to Probability Logic cannot be used as a rigorous basis for this interpretation! To fill this gap, a new ALDP is constructed consisting of "conditional objects", extending ordinary Probability Logic, and compatible with the desired conditional probability interpretation of inference rules. It is shown also that this choice of ALDP leads to feasible computations for the combination of evidence evaluation in the example. In addition, a number of basic properties of conditional objects and the resulting Conditional Probability Logic are given, including a characterization property and a developed calculus of relations.