 microarray


Researchers develop a pimple patch that actually seems to work

Popular Science

Most of us have heard of pimple patches, those circular bandage-like patches to clear up a zit. More importantly, many of us have heard of pimple patches--the ones that cost a fortune, stick to your face for maybe 30 minutes, and do absolutely nothing for that angry whitehead you woke up with this morning. However, the team behind a small study published in the journal, claim to have developed a two-step pimple patch that actually works. How can they claim this?


Automatically Score Tissue Images Like a Pathologist by Transfer Learning

arXiv.org Artificial Intelligence

Cancer is the second leading cause of death in the world. Diagnosing cancer early on can save many lives. Pathologists have to look at tissue microarray (TMA) images manually to identify tumors, which can be time-consuming, inconsistent and subjective. Existing automatic algorithms either have not achieved the accuracy level of a pathologist or require substantial human involvement. A major challenge is that TMA images with different shapes, sizes, and locations can have the same score. Learning staining patterns in TMA images requires a huge number of images, which are severely limited due to privacy and regulation concerns in medical organizations. TMA images from different cancer types may share certain common characteristics, but combining them directly harms the accuracy due to heterogeneity in their staining patterns. Transfer learning is an emerging learning paradigm that allows borrowing strength from similar problems. However, existing approaches typically require a large sample from similar learning problems, while TMA images of different cancer types are often available only in small samples; moreover, existing algorithms are limited to transfer learning from one similar problem. We propose a new transfer learning algorithm that could learn from multiple related problems, where each problem has a small sample and can have a substantially different distribution from the original one. The proposed algorithm has made it possible to break the critical accuracy barrier (the 75% accuracy level of pathologists), with a reported accuracy of 75.9% on breast cancer TMA images from the Stanford Tissue Microarray Database. It is supported by recent developments in transfer learning theory and empirical evidence in clustering technology. This will allow pathologists to confidently adopt automatic algorithms in recognizing tumors consistently with a higher accuracy in real time.


Orthogonal Non-negative Matrix Factorization: a Maximum-Entropy-Principle Approach

arXiv.org Artificial Intelligence

In this paper, we introduce a new methodology to solve the orthogonal nonnegative matrix factorization (ONMF) problem, where the objective is to approximate an input data matrix by a product of two nonnegative matrices, the features matrix and the mixing matrix, where one of them is orthogonal. We show how the ONMF can be interpreted as a specific facility-location problem (FLP), and adapt a maximum-entropy-principle based solution for FLP to the ONMF problem. The proposed approach guarantees orthogonality and sparsity of the features or the mixing matrix, while ensuring nonnegativity of both. Additionally, our methodology develops a quantitative characterization of the "true" number of underlying features, a hyperparameter required for the ONMF. An evaluation of the proposed method, conducted on synthetic datasets as well as a standard genetic microarray dataset, indicates significantly better sparsity, orthogonality, and performance speed compared to similar methods in the literature, with comparable or improved reconstruction errors.
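To make the ONMF constraints concrete, here is a minimal sketch, not the maximum-entropy method of the paper. It assumes the mixing matrix H is the orthogonal factor and builds it from hard cluster labels: a nonnegative matrix with orthonormal rows can have at most one nonzero entry per column, which is why the problem resembles assigning each data point to one facility. The function name `onmf_from_labels`, the random data, and the round-robin labels are all hypothetical.

```python
import numpy as np

def onmf_from_labels(X, labels, k):
    """Build a nonnegative H with orthonormal rows from hard labels,
    then the least-squares features matrix W for that fixed H.

    X: (d, n) nonnegative data, one column per data point.
    labels: (n,) integer cluster index in [0, k) for each column.
    Assumes every cluster has at least one member.
    """
    n = X.shape[1]
    H = np.zeros((k, n))
    H[labels, np.arange(n)] = 1.0               # one nonzero per column
    H /= np.sqrt(H.sum(axis=1, keepdims=True))  # unit-norm rows, so H @ H.T = I
    # With H @ H.T = I, the minimizer of ||X - W H||_F over W is X @ H.T,
    # which is nonnegative whenever X and H are.
    W = X @ H.T
    return W, H

rng = np.random.default_rng(0)
X = rng.random((5, 12))        # hypothetical nonnegative data
labels = np.arange(12) % 3     # hypothetical assignment to 3 clusters
W, H = onmf_from_labels(X, labels, 3)

orth_err = np.linalg.norm(H @ H.T - np.eye(3))   # orthogonality violation
recon_err = np.linalg.norm(X - W @ H)            # Frobenius reconstruction error
print(orth_err, recon_err)
```

In this construction each column of W is a scaled cluster mean, so choosing the labels well is the hard part; the paper's contribution concerns that step and the choice of the number of features, neither of which this sketch attempts.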


Learning Low-dimensional Manifolds for Scoring of Tissue Microarray Images

arXiv.org Artificial Intelligence

Tissue microarray (TMA) images have emerged as an important high-throughput tool for cancer study and the validation of biomarkers. Efforts have been dedicated to further improve the accuracy of TACOMA, a cutting-edge automatic scoring algorithm for TMA images. One major advance is due to deepTacoma, an algorithm that incorporates suitable deep representations of a group nature. Inspired by the recent advance in semi-supervised learning and deep learning, we propose mfTacoma to learn alternative deep representations in the context of TMA image scoring. In particular, mfTacoma learns the low-dimensional manifolds, a common latent structure in high dimensional data. Deep representation learning and manifold learning typically require large data. By encoding deep representation of the manifolds as regularizing features, mfTacoma effectively leverages the manifold information that is potentially crude due to small data. Our experiments show that deep features by manifolds outperform two alternatives -- deep features by linear manifolds with principal component analysis or by leveraging the group property.


Machine Learning in Science: Interpreting Gene Regulation

#artificialintelligence

Pretty much every cell in the body of an organism has the same DNA. Genes are bits of this DNA that code for proteins or (less commonly) other large biomolecules. A gene is expressed through a two-step process in which the gene's DNA is first transcribed into RNA, which is then translated into the corresponding protein. Gene-expression microarrays, whose development began in the second half of the 1990s and which are having a revolutionary impact on molecular biology, allow one to monitor the DNA-to-RNA part of this fundamental biological process. While the ability to measure the transcription of a single gene isn't new, the ability to measure at once the transcription of all the genes in an organism is new.


Using Machine Learning to Design and Interpret Gene-Expression Microarrays

AI Magazine

Gene-expression microarrays, commonly called gene chips, make it possible to simultaneously measure the rate at which a cell or tissue is expressing--translating into a protein--each of its thousands of genes. One can use these comprehensive snapshots of biological activity to infer regulatory pathways in cells; identify novel targets for drug design; and improve the diagnosis, prognosis, and treatment planning for those suffering from disease. However, the amount of data this new technology produces is more than one can manually analyze. Hence, the need for automated analysis of microarray data offers an opportunity for machine learning to have a significant impact on biology and medicine. This article describes microarray technology, the data it produces, and the types of machine learning tasks that naturally arise with these data.


The race to computerise biology

AITopics Original Links

FOR centuries, biology has been an empirical field that featured mostly specimens and Petri dishes. Over the past five years, however, computers have changed the discipline--as they have harnessed the data on genetics for the pursuit of cures for disease. Wet lab processes that took weeks to complete are giving way to digital research done in silico. Notebooks with jotted comments, measurements and drawings have yielded to terabyte storehouses of genetic and chemical data. And empirical estimates are being replaced by mathematical exactness.


Inference with Transposable Data: Modeling the Effects of Row and Column Correlations

arXiv.org Machine Learning

We consider the problem of large-scale inference on the row or column variables of data in the form of a matrix. Often this data is transposable, meaning that both the row variables and column variables are of potential interest. An example of this scenario is detecting significant genes in microarrays when the samples or arrays may be dependent due to underlying relationships. We study the effect of both row and column correlations on commonly used test-statistics, null distributions, and multiple testing procedures, by explicitly modeling the covariances with the matrix-variate normal distribution. Using this model, we give both theoretical and simulation results revealing the problems associated with using standard statistical methodology on transposable data. We solve these problems by estimating the row and column covariances simultaneously, with transposable regularized covariance models, and de-correlating or sphering the data as a pre-processing step. Under reasonable assumptions, our method gives test statistics that follow the scaled theoretical null distribution and are approximately independent. Simulations based on various models with structured and observed covariances from real microarray data reveal that our method offers substantial improvements in two areas: 1) increased statistical power and 2) correct estimation of false discovery rates.
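The de-correlating (sphering) step can be sketched as follows, under a strong simplifying assumption: the row and column covariances are taken as known, whereas the abstract describes estimating them with transposable regularized covariance models. For a matrix-variate normal with row covariance `row_cov` and column covariance `col_cov`, multiplying by the inverse square roots on each side removes both correlations. The helper names `inv_sqrt` and `sphere`, and the AR(1)-style covariances, are hypothetical.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs / np.sqrt(vals)) @ vecs.T   # V diag(1/sqrt(vals)) V^T

def sphere(X, row_cov, col_cov):
    """De-correlate rows and columns: row_cov^{-1/2} X col_cov^{-1/2}."""
    return inv_sqrt(row_cov) @ X @ inv_sqrt(col_cov)

rng = np.random.default_rng(0)
i = np.arange(4)
j = np.arange(6)
# Hypothetical covariances with geometrically decaying correlation.
row_cov = 0.5 ** np.abs(i[:, None] - i[None, :])
col_cov = 0.3 ** np.abs(j[:, None] - j[None, :])

# Simulate correlated data X = L_r Z L_c^T from independent noise Z.
Z = rng.standard_normal((4, 6))
X = np.linalg.cholesky(row_cov) @ Z @ np.linalg.cholesky(col_cov).T

Y = sphere(X, row_cov, col_cov)   # entries of Y are again uncorrelated
print(Y.shape)
```

Here Y equals Z up to orthogonal rotations of its rows and columns, so standard test statistics computed on Y see independent entries. With estimated rather than known covariances this holds only approximately, which is the setting the abstract addresses.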


Using Machine Learning to Design and Interpret Gene-Expression Microarrays

AI Magazine

Gene-expression microarrays, commonly called gene chips, make it possible to simultaneously measure the rate at which a cell or tissue is expressing -- translating into a protein -- each of its thousands of genes. One can use these comprehensive snapshots of biological activity to infer regulatory pathways in cells; identify novel targets for drug design; and improve the diagnosis, prognosis, and treatment planning for those suffering from disease. However, the amount of data this new technology produces is more than one can manually analyze. Hence, the need for automated analysis of microarray data offers an opportunity for machine learning to have a significant impact on biology and medicine. This article describes microarray technology, the data it produces, and the types of machine learning tasks that naturally arise with these data. It also reviews some of the recent prominent applications of machine learning to gene-chip data, points to related tasks where machine learning might have a further impact on biology and medicine, and describes additional types of interesting data that recent advances in biotechnology allow biomedical researchers to collect.