Collaborating Authors

 University of Delaware


OCR-Based Image Features for Biomedical Image and Article Classification: Identifying Documents Relevant to Genomic Cis-Regulatory Elements

AAAI Conferences

Images form a significant, yet under-utilized, information source in published biomedical articles. Much current work on biomedical image retrieval and classification uses simple, standard image representations that employ features such as edge direction or gray-scale histograms. In our earlier work we also used such features to classify images, and then used the resulting image-class tags to represent and classify complete articles. Here we focus on a different literature classification task: identifying articles that discuss cis-regulatory elements and modules, motivated by the need to understand complex gene networks. A major cue used by curators attempting to identify such articles is a certain type of image in which the conserved cis-regulatory region on the DNA is shown. Our experiments show that automatically identifying such images using common image features (such as gray scale) is highly error-prone. However, using Optical Character Recognition (OCR) to extract alphabetic characters from images, calculating the character distribution, and using the distribution parameters as image features forms a novel image representation, which allows us to identify DNA content in images with high precision and recall (over 0.9). Utilizing the occurrence of DNA-rich images within articles, we train a classifier to identify articles pertaining to cis-regulatory elements with similarly high precision and recall. OCR-based image features have much potential beyond the current task, for identifying other types of biomedical sequence-based images showing DNA, RNA, and proteins. Moreover, automatically identifying such images is applicable beyond the current use case, in other important biomedical document classification tasks.
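The idea of turning OCR output into character-distribution features can be illustrated with a minimal sketch. It assumes the OCR step has already produced a text string for each image; the feature names, the function names, and the fixed thresholds are hypothetical choices made for illustration and are not taken from the paper, which trains a classifier on distribution parameters rather than applying a hand-set rule.

```python
from collections import Counter
import string

# Letters used to write DNA sequences in figures.
DNA_LETTERS = set("ACGT")


def char_distribution_features(ocr_text: str) -> dict:
    """Summarize the letter distribution of OCR-extracted text.

    The feature set is a hypothetical illustration, not the paper's exact one.
    """
    # Keep only ASCII letters, ignoring case, digits, and punctuation.
    letters = [c for c in ocr_text.upper() if c in string.ascii_uppercase]
    total = len(letters)
    if total == 0:
        return {"n_letters": 0, "dna_fraction": 0.0, "max_letter_fraction": 0.0}
    counts = Counter(letters)
    dna_count = sum(counts[c] for c in DNA_LETTERS)
    return {
        "n_letters": total,
        "dna_fraction": dna_count / total,
        "max_letter_fraction": max(counts.values()) / total,
    }


def looks_dna_rich(ocr_text: str, min_letters: int = 20,
                   min_dna_fraction: float = 0.9) -> bool:
    """Rule-of-thumb stand-in for a trained classifier (thresholds assumed)."""
    features = char_distribution_features(ocr_text)
    return (features["n_letters"] >= min_letters
            and features["dna_fraction"] >= min_dna_fraction)
```

In practice the feature dictionary would be passed to a trained classifier; the threshold rule above only shows why a text dominated by A, C, G, and T separates cleanly from ordinary figure labels.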


Visual and Spatial Factors in a Bayesian Reasoning Framework for the Recognition of Intended Messages in Grouped Bar Charts

AAAI Conferences

The overall goal of our research is the automatic recognition of the intended message of a grouped bar chart. This paper presents our preliminary work on a system that uses the communicative signals in a grouped bar chart as evidence in a Bayesian network that hypothesizes the primary message conveyed by the graphic. The paper discusses the kinds of communicative signals present in grouped bar charts and an ACT-R model for computing one important communicative signal: the relative effort involved in performing the perceptual tasks necessary for recognition. It also describes our Bayesian network and its implementation on a subset of the kinds of messages that can be conveyed by grouped bar charts.
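How communicative signals can act as evidence for a message hypothesis can be sketched with a much-simplified model. The code below assumes a single message node whose evidence nodes are conditionally independent given the message (a naive Bayes structure); the paper's actual network is not specified here and may differ. The message categories, signal names, and probabilities are all hypothetical.

```python
def message_posterior(prior: dict, likelihoods: dict, observed: dict) -> dict:
    """Posterior over candidate messages given binary evidence.

    prior:       P(message)
    likelihoods: P(signal present | message), per message and signal
    observed:    signal name -> True/False
    Assumes signals are conditionally independent given the message.
    """
    scores = {}
    for message, p_message in prior.items():
        score = p_message
        for signal, present in observed.items():
            p_signal = likelihoods[message][signal]
            score *= p_signal if present else (1.0 - p_signal)
        scores[message] = score
    total = sum(scores.values())
    if total == 0:
        raise ValueError("evidence has zero probability under every message")
    return {message: score / total for message, score in scores.items()}


# Hypothetical example: two candidate messages, two signals.
prior = {"trend": 0.5, "comparison": 0.5}
likelihoods = {
    "trend": {"salient_bar": 0.2, "low_effort_trend": 0.8},
    "comparison": {"salient_bar": 0.7, "low_effort_trend": 0.3},
}
observed = {"salient_bar": True, "low_effort_trend": False}
posterior = message_posterior(prior, likelihoods, observed)
best_message = max(posterior, key=posterior.get)
```

With these made-up numbers, a salient bar combined with a high-effort trend reading shifts the posterior toward the comparison message.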


BioPlanner: A Plan Adaptation Approach for the Discovery of Biological Pathways across Species

AAAI Conferences

We present an implementation of a plan adaptation system, BioPlanner, built for biological pathway prediction across species. BioPlanner formulates a pathway discovery problem as a Hierarchical Task Network (HTN) planning problem and solves it by adapting a plan solution of another well-studied pathway. BioPlanner provides the following functionalities. It automatically builds HTN planning models for a biological pathway domain from semantic web biological knowledge bases (KBs). It retrieves plan cases from the biological KBs. It generates hypothetical pathways using plan adaptation strategies with the aid of biological domain knowledge. It evaluates the hypothetical plan candidates, ranks them, and recommends the most likely hypotheses to users. It employs an information-gathering multi-agent system to capture knowledge from heterogeneous sources to help the hypothetical plan generation process. We use BioPlanner to predict signal transduction pathways for Mus musculus, Gallus gallus, and Drosophila melanogaster from Homo sapiens.