We present an implementation of a plan adaptation system, BioPlanner, built for biological pathway prediction across species. BioPlanner formulates a pathway discovery problem as a Hierarchical Task Network (HTN) planning problem and solves it by adapting the plan solution of another, well-studied pathway. BioPlanner provides the following functionality. It automatically builds HTN planning models for a biological pathway domain from Semantic Web biological knowledge bases (KBs). It retrieves plan cases from the biological KBs. It generates hypothetical pathways using plan adaptation strategies with the aid of biological domain knowledge. It evaluates the hypothetical plan candidates, ranks them, and recommends the most likely hypotheses to users. It employs an information-gathering multi-agent system to capture knowledge from heterogeneous sources in support of the hypothetical plan generation process. We use BioPlanner to predict signal transduction pathways for Mus musculus, Gallus gallus, and Drosophila melanogaster from those of Homo sapiens.
Some real-world domain knowledge discovery problems can be formulated as knowledge-based planning problems, for example the discovery of chemical reaction processes and biological pathways (Khan et al. 2003). Such a domain problem can be recast as a planning problem in which the initial and final states are known and processes are captured as abstract operators that modify the environment. For example, approaching biological pathway discovery with AI planning means that a valid plan transforming the initial state into the goal state is a hypothetical pathway: it prescribes the order of events that must occur to bring about the goal state. We believe that AI planning technology can provide a modeling formalism for this task, so that hypotheses can be generated, tested, queried, and qualitatively simulated to improve the domain knowledge and rules. Our current approach is to build toward a general multi-agent system for knowledge discovery (KDMAS) via planning.
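The planning view described above can be illustrated with a minimal sketch. It assumes a STRIPS-style encoding in which a state is a set of facts and each operator has preconditions, added facts, and deleted facts, and it uses plain breadth-first forward search rather than any planner named in the text. The toy domain (`bind_ligand`, `activate_kinase`, `phosphorylate_tf`) and all fact names are hypothetical and chosen only for illustration.

```python
from collections import deque
from typing import FrozenSet, Iterable, List, NamedTuple, Optional


class Operator(NamedTuple):
    """An abstract process: applicable when `pre` holds in the current state."""
    name: str
    pre: FrozenSet[str]      # facts that must be true before the event
    add: FrozenSet[str]      # facts made true by the event
    delete: FrozenSet[str]   # facts made false by the event


def make_op(name: str, pre: Iterable[str], add: Iterable[str],
            delete: Iterable[str] = ()) -> Operator:
    """Convenience constructor that freezes the fact sets."""
    return Operator(name, frozenset(pre), frozenset(add), frozenset(delete))


def plan(initial: FrozenSet[str], goal: FrozenSet[str],
         operators: List[Operator]) -> Optional[List[str]]:
    """Breadth-first forward search.

    Returns a shortest sequence of operator names that turns `initial`
    into a state containing every goal fact, or None if none exists.
    """
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:
            return steps
        for operator in operators:
            if operator.pre <= state:
                successor = (state - operator.delete) | operator.add
                if successor not in seen:
                    seen.add(successor)
                    frontier.append((successor, steps + [operator.name]))
    return None


# Hypothetical toy signaling domain (illustrative names only).
OPERATORS = [
    make_op("bind_ligand",
            pre={"ligand_free", "receptor_inactive"},
            add={"receptor_active"},
            delete={"ligand_free", "receptor_inactive"}),
    make_op("activate_kinase",
            pre={"receptor_active", "kinase_inactive"},
            add={"kinase_active"},
            delete={"kinase_inactive"}),
    make_op("phosphorylate_tf",
            pre={"kinase_active", "tf_unphosphorylated"},
            add={"tf_phosphorylated"},
            delete={"tf_unphosphorylated"}),
]

INITIAL = frozenset({"ligand_free", "receptor_inactive",
                     "kinase_inactive", "tf_unphosphorylated"})
GOAL = frozenset({"tf_phosphorylated"})

if __name__ == "__main__":
    # The returned plan is the hypothetical pathway: an ordered list of events.
    print(plan(INITIAL, GOAL, OPERATORS))
```

In this sketch the returned operator sequence plays the role of the hypothetical pathway. A realistic system would need a richer state representation and domain knowledge to prune the search, which exhaustive breadth-first search does not provide.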
The mathematical modeling of very large biological pathways presents serious computational challenges. Current techniques will no longer be adequate owing to their lack of scalability. Some of the resulting challenges are mathematical, but many arise from knowledge representation, ambiguity, the qualitative nature of the information, and the lack of sufficient information. This talk will describe the steps in the underlying modeling problem and the changes that occur as the pathways become large. There are computational challenges, but also opportunities to incorporate artificial intelligence techniques into what has traditionally been a purely mathematical problem.
University of Mississippi, 2500 North State Street, Jackson, MS 39216, and Thomas Jefferson University, 130 South 9th Street, Suite 400, Philadelphia, PA 19107
SENEX is a computer system under development to explore issues related to representation of molecular information, presentation of data, and reasoning with molecular information. It is written entirely in a portable programming environment supported by Common Lisp, the Common Lisp Object System (CLOS), and the Common Lisp Interface Manager (CLIM). SENEX contains information about molecules, molecular events and disease processes, and provides tools for reasoning with and displaying this information in useful ways. Molecular pathology is a discipline characterized by structures of variable complexity, events constrained by a variable number of factors, and incompletely understood phenomena. Representational issues inherent in the domain are complicated by the use of a language with a rigid/inflexible design. However, the CLOS metaobject protocol allows a programmer to adjust the design and implementation of the language to fit an application domain. Thus the first objective of the SENEX project is to exploit this feature of the CLOS metaobject protocol in designing a language tailored to the domain of molecular pathology.
External information propagates in the cell mainly through signaling cascades and transcriptional activation, allowing the cell to react to a wide spectrum of environmental changes. High-throughput experiments identify numerous molecular components of such cascades that may, however, interact through unknown partners. Some of these partners may be detected by integrating a protein-protein interaction network with mRNA expression profiles. This inference problem can be mapped onto the problem of finding appropriate optimal connected subgraphs of a network defined by these datasets. The optimization procedure turns out to be computationally intractable in general. Here we present a new distributed algorithm for this task, inspired by statistical physics, and apply it to alpha-factor and drug perturbation data in yeast. We identify the role of the COS8 protein, a member of a gene family of previously unknown function, and validate the results by genetic experiments. The algorithm we present is especially suited to very large datasets, can run in parallel, and can be adapted to other problems in systems biology. On established benchmarks it outperforms other algorithms in the field.