Microarray is one of the essential technologies used by the biologist to measure genome-wide expression levels of genes in a particular organism under some particular conditions or stimuli. As microarrays technologies have become more prevalent, the challenges of analyzing these data for getting better insight about biological processes have essentially increased. Due to availability of artificial intelligence based sophisticated computational techniques, such as artificial neural networks, fuzzy logic, genetic algorithms, and many other nature-inspired algorithms, it is possible to analyse microarray gene expression data in more better way. Here, we reviewed artificial intelligence based techniques for the analysis of microarray gene expression data. Further, challenges in the field and future work direction have also been suggested.
Gene-expression microarrays, commonly called gene chips, make it possible to simultaneously measure the rate at which a cell or tissue is expressing -- translating into a protein -- each of its thousands of genes. One can use these comprehensive snapshots of biological activity to infer regulatory pathways in cells; identify novel targets for drug design; and improve the diagnosis, prognosis, and treatment planning for those suffering from disease. However, the amount of data this new technology produces is more than one can manually analyze. Hence, the need for automated analysis of microarray data offers an opportunity for machine learning to have a significant impact on biology and medicine. This article describes microarray technology, the data it produces, and the types of machine learning tasks that naturally arise with these data. It also reviews some of the recent prominent applications of machine learning to gene-chip data, points to related tasks where machine learning might have a further impact on biology and medicine, and describes additional types of interesting data that recent advances in biotechnology allow biomedical researchers to collect.
New feature selection algorithms for linear threshold functions are described whichcombine backward elimination with an adaptive regularization method. This makes them particularly suitable to the classification of microarray expression data, where the goal is to obtain accurate rules depending on few genes only. Our algorithms are fast and easy to implement, since they center on an incremental (large margin) algorithm which allows us to avoid linear, quadratic or higher-order programming methods. We report on preliminary experiments with five known DNA microarray datasets. These experiments suggest that multiplicative large margin algorithms tend to outperform additive algorithms (such as SVM) on feature selection tasks.
Microarray experiments are capable of measuring the expression level of thousands of genes simultaneously. Dealing with this enormous amount of information requires complex computation. Support Vector Machines (SVM) have been widely used with great efficiency to solve classification problems that have high dimension. In this sense, it is plausible to develop new feature selection strategies for microarray data that are associated with this type of classifier. Therefore, we propose, in this paper, a new method for feature selection based on an ordered search process to explore the space of possible subsets.
We will demonstrate the GEMS system for automated development and evaluation of high-quality cancer diagnostic models and biomarker discovery from microarray gene expression data. The development of GEMS was informed by the results of an extensive algorithmic evaluation using 11 microarray datasets. The system was further evaluated in two cross-dataset applications and using 5 microarray datasets. The performance of models produced by GEMS is comparable or better than the results obtained by human analysts, and these models generalize well to independent samples in cross-dataset applications. The system is freely available for download from http://www.gems-system.org