If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."
However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …
We have developed a set of methods and tools for the automatic discovery of putative regulatory signals in genome sequences. The analysis pipeline consists of gene expression data clustering, sequence pattern discovery from upstream sequences of genes, a control experiment for detecting the pattern significance threshold, selection of interesting patterns, grouping of these patterns, representing the pattern groups in a concise form, and evaluating the discovered putative signals against existing databases of regulatory signals. Pattern discovery is computationally the most expensive and most crucial step. Our tool performs a rapid exhaustive search for a priori unknown, statistically significant sequence patterns of unrestricted length. The statistical significance is determined for the set of sequences in each cluster with respect to a set of background sequences, allowing the detection of subtle regulatory signals specific to each cluster.
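As an illustration of scoring a pattern in a cluster against a background, here is a minimal sketch. The abstract does not name the statistic it uses, so the hypergeometric tail probability below is an assumption, as are the function names and the convention that the background set contains the cluster sequences.

```python
from math import comb

def count_hits(pattern, sequences):
    """Number of sequences that contain the pattern at least once."""
    return sum(1 for seq in sequences if pattern in seq)

def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n sequences are drawn without replacement
    from N sequences, K of which contain the pattern."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def pattern_pvalue(pattern, cluster, background):
    """Assumed scoring: how surprising is the number of cluster sequences
    containing the pattern, given its frequency in the background?
    The background is assumed to include the cluster sequences."""
    k = count_hits(pattern, cluster)
    K = count_hits(pattern, background)
    return hypergeom_tail(k, len(cluster), K, len(background))

# Toy upstream sequences (made up for illustration only).
cluster = ["ACGTGA", "TTACGT", "ACGTTT"]
background = cluster + ["GGGGGG", "TTTTTT", "CCCCCC", "GATTAC",
                        "AAAAAA", "TGCATG", "CCGGAA"]

p_specific = pattern_pvalue("ACGT", cluster, background)  # only in the cluster
p_common = pattern_pvalue("T", cluster, background)       # common everywhere
```

A pattern confined to the cluster receives a much smaller probability than one that is equally common in the background, which is the property a cluster-specific significance threshold relies on.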
Tel: 81-6-6850-6601 Fax: 81-6-6850-6602
Keywords: alignment, metabolic pathway, pathway analysis, enzyme, EC number
Abstract: In many of the chemical reactions in living cells, enzymes act as catalysts in the conversion of certain compounds (substrates) into other compounds (products). Comparative analyses of the metabolic pathways formed by such reactions give important information on their evolution and on pharmacological targets (Dandekar et al. 1999). Each of the enzymes that constitute a pathway is classified according to the EC (Enzyme Commission) numbering system, which consists of four sets of numbers that categorize the type of chemical reaction catalyzed. In this study, we consider that reaction similarities can be expressed by the similarities between the EC numbers of the respective enzymes. Therefore, to find common patterns among pathways, it is desirable to use the functional hierarchy of EC numbers to express reaction similarities.
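One simple way to turn the EC hierarchy into a similarity score is to count how many leading levels two EC numbers share. The sketch below is only an assumed scoring scheme for illustration; the abstract does not give the actual measure, and the function name `ec_similarity` is hypothetical.

```python
def ec_similarity(ec_a, ec_b):
    """Fraction of the four EC levels shared as a common prefix.

    Assumption: similarity is judged top-down, so a mismatch at a
    higher level makes agreement at lower levels irrelevant. A "-"
    (unspecified level) is treated as a mismatch.
    """
    shared = 0
    for level_a, level_b in zip(ec_a.split("."), ec_b.split(".")):
        if level_a != level_b or level_a == "-":
            break
        shared += 1
    return shared / 4

same_subsubclass = ec_similarity("2.7.1.1", "2.7.1.2")  # differ only at level 4
different_class = ec_similarity("1.1.1.1", "2.7.1.1")   # differ at level 1
identical = ec_similarity("1.1.1.1", "1.1.1.1")
```

Because the comparison stops at the first differing level, two enzymes from different top-level classes score zero even when their lower-level digits happen to coincide.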
Tel: 31-152786424 Fax: 31-152781843
Keywords: Genetic Networks, Quasi-Linear Model, Clustering
Abstract: In this paper, the regulatory interactions between genes are modeled by a linear genetic network that is estimated from gene expression data. The inference of such a genetic network is hampered by the dimensionality problem. This problem is inherent in all gene expression data, since the number of genes far exceeds the number of measured time points. Consequently, there are infinitely many solutions that fit the data set perfectly. In this paper, this problem is tackled by combining genes with similar expression profiles into a single prototypical 'gene'. Instead of modeling the genes individually, the relations between prototypical genes are modeled. In this way, genes that cannot be distinguished based on their expression profiles are grouped together, and their common control action is modeled instead. This process reduces the number of signals and imposes a structure on the model that is supported by the fact that biological genetic networks are thought to be redundant and sparsely connected. In essence, the ambiguity in model solutions is represented explicitly by providing a generalized model that expresses the basic regulatory interactions between groups of similarly expressed genes. The modeling approach is illustrated on artificial as well as real data.
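The grouping step can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not specify the clustering method, so the greedy grouping by Pearson correlation, the threshold of 0.95, and the names `pearson` and `build_prototypes` are all hypothetical choices.

```python
from statistics import mean

def pearson(u, v):
    """Pearson correlation of two equally long expression profiles."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den if den else 0.0

def build_prototypes(profiles, threshold=0.95):
    """Greedily group genes whose profiles correlate with a group's first
    member, then average each group into one prototypical 'gene'."""
    groups = []
    for gene, profile in profiles.items():
        for group in groups:
            if pearson(profile, profiles[group[0]]) >= threshold:
                group.append(gene)
                break
        else:
            groups.append([gene])
    n_points = len(next(iter(profiles.values())))
    prototypes = [
        [mean(profiles[g][t] for g in group) for t in range(n_points)]
        for group in groups
    ]
    return groups, prototypes

# Toy time series: four genes, four time points (made-up values).
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1.1, 2.0, 3.1, 3.9],
}
groups, prototypes = build_prototypes(profiles)
```

If the linear model assigns one weight to every ordered pair of signals, reducing four genes to two prototypes shrinks the number of unknown weights from 16 to 4, which is how grouping eases the dimensionality problem when time points are scarce.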
Both apply efficient structural pattern detection and graph theoretic techniques. The FlexProt algorithm simultaneously detects the hinge regions and aligns the rigid subparts of the molecules. It does so by efficiently detecting maximal congruent rigid fragments in both molecules and calculating their optimal arrangement that does not violate the protein sequence order. The FlexMol algorithm is sequence order independent, yet requires as input the hypothesized hinge positions. Due to its sequence order independence, it can also be applied to protein-protein interface matching and drug molecule alignment.
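To make the idea of congruent rigid fragments concrete, the sketch below compares intra-fragment distances, which are unchanged by rotation and translation. This is a swapped-in simplification, not the FlexProt procedure: the abstract does not describe how congruence is tested, the brute-force search is for illustration only, and distance comparison cannot tell a fragment from its mirror image. All names and the tolerance are assumptions.

```python
from math import dist

def fragments_congruent(frag_a, frag_b, tol=0.5):
    """True if all pairwise distances inside the two fragments agree
    within tol (assumed congruence test; ignores mirror images)."""
    if len(frag_a) != len(frag_b):
        return False
    n = len(frag_a)
    return all(
        abs(dist(frag_a[i], frag_a[j]) - dist(frag_b[i], frag_b[j])) <= tol
        for i in range(n) for j in range(i + 1, n)
    )

def longest_congruent_fragment(mol1, mol2, min_len=3, tol=0.5):
    """Brute-force search for the longest pair of consecutive, congruent
    fragments. Returns (length, start in mol1, start in mol2)."""
    best = (0, 0, 0)
    for i in range(len(mol1)):
        for j in range(len(mol2)):
            length = 0
            # Extend the fragment pair one atom at a time while it stays rigid.
            while (i + length < len(mol1) and j + length < len(mol2)
                   and fragments_congruent(mol1[i:i + length + 1],
                                           mol2[j:j + length + 1], tol)):
                length += 1
            if length >= min_len and length > best[0]:
                best = (length, i, j)
    return best

# Toy backbones: mol2 is mol1 rotated, with a 90-degree bend after atom 2.
mol1 = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
mol2 = [(0, 0, 0), (0, 1, 0), (0, 2, 0), (1, 2, 0), (2, 2, 0)]
best = longest_congruent_fragment(mol1, mol2)
```

In this toy case no fragment of four atoms survives the bend, so the rigid parts on either side of atom 2 are matched separately, and the position where extension fails marks a candidate hinge.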
The development of DNA microarrays during the last few years (Schena et al. 1995; DeRisi, Iyer, & Brown 1997) allows researchers to simultaneously measure the expression levels of thousands of different genes. Experiments involving such arrays produce overwhelming amounts of data. In response, much recent work has been concerned with automating the analysis of microarray data.
Copyright 2000, American Association for Artificial Intelligence
Novel DNA microarray technologies (Eisen & Brown 1999) enable the monitoring of expression levels of thousands of genes simultaneously. This allows, for the first time, a global view of the transcription levels of many (or all) genes when the cell undergoes specific conditions or processes. The potential of such technologies for functional genomics is tremendous: measuring gene expression levels in different developmental stages, different body tissues, different clinical conditions and different organisms is instrumental in understanding gene function, gene networks, biological processes and the effects of medical treatments. A key step in the analysis of gene expression data is the identification of groups of genes that manifest similar expression patterns over several conditions. The corresponding algorithmic problem is to cluster multicondition gene expression patterns. The grouping of genes with similar expression patterns into clusters helps in unraveling relations between genes, deducing the function of genes and revealing the underlying gene regulatory network. A clustering problem consists of n elements and a characteristic vector for each element. In gene expression data, elements are genes, and the vector of each gene contains its expression levels under some conditions. These levels are obtained by measuring the intensity of hybridization of gene-specific oligonucleotides (or cDNA molecules), which are immobilized to a surface, to a labeled target RNA mixture (cf.