lncrna
LncRNA-disease association prediction method based on heterogeneous information completion and convolutional neural network
Xi, Wen-Yu, Wang, Juan, Zhang, Yu-Lin, Liu, Jin-Xing, Gao, Ying-Lian
Emerging research shows that lncRNAs have crucial research value in a series of complex human diseases. Therefore, the accurate identification of lncRNA-disease associations (LDAs) is very important for the warning and treatment of diseases. However, most existing methods have limitations in identifying nonlinear LDAs, and predicting new LDAs remains a major challenge. In this paper, a deep learning model based on a heterogeneous network and a convolutional neural network (CNN), named HCNNLDA, is proposed for lncRNA-disease association prediction. First, a heterogeneous network containing lncRNA, disease, and miRNA nodes is constructed. The embedding matrix of a lncRNA-disease node pair is then constructed according to various biological premises about lncRNAs, diseases, and miRNAs. Next, a low-dimensional feature representation is learned by the convolutional neural network. Finally, an XGBoost classifier is trained to predict potential LDAs. HCNNLDA obtains an AUC of 0.9752 and an AUPR of 0.9740 under 5-fold cross-validation. The experimental results show that the proposed model outperforms several recent prediction models. The effectiveness of HCNNLDA in identifying novel LDAs is further demonstrated by case studies of three diseases. In summary, HCNNLDA is a feasible computational model for predicting LDAs.
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.93)
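The HCNNLDA abstract above reports AUC and AUPR under 5-fold cross-validation. As a rough illustration of how such metrics can be computed from predicted association scores, here is a minimal NumPy sketch. The function names are hypothetical, and average precision is used as one common estimator of AUPR; the paper's own evaluation code and fold construction may differ.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive with every negative; ties count as half a win.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Average precision, a step-wise estimate of the area under the PR curve."""
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = labels[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    # Average the precision values at the ranks where a true positive occurs.
    return precision_at_k[hits].mean()

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and split them into disjoint test folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)
```

In a cross-validation loop, each fold from `five_fold_indices` would serve once as the test set, and the two metrics would be averaged over the folds.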
Heterogeneous network and graph attention auto-encoder for LncRNA-disease association prediction
Liu, Jin-Xing, Xi, Wen-Yu, Dai, Ling-Yun, Zheng, Chun-Hou, Gao, Ying-Lian
Emerging research shows that lncRNAs are associated with a series of complex human diseases. However, most existing methods have limitations in identifying nonlinear lncRNA-disease associations (LDAs), and predicting new LDAs remains a major challenge. Therefore, the accurate identification of LDAs is very important for the warning and treatment of diseases. In this work, multiple sources of biomedical data are used to construct characteristics of lncRNAs and diseases, and linear and nonlinear characteristics are integrated. Furthermore, a novel deep learning model based on a graph attention auto-encoder, called HGATELDA, is proposed. First, the linear characteristics of lncRNAs and diseases are created from the miRNA-lncRNA interaction matrix and the miRNA-disease interaction matrix. Next, the nonlinear features of diseases and lncRNAs are extracted using a graph attention auto-encoder, which largely retains the critical information and aggregates the neighborhood information of nodes. Finally, LDAs are predicted by fusing the linear and nonlinear characteristics of diseases and lncRNAs. The HGATELDA model achieves an AUC of 0.9692 under 5-fold cross-validation, outperforming several recent prediction models. The effectiveness of HGATELDA in identifying novel LDAs is further demonstrated by case studies. The HGATELDA model appears to be a viable computational model for predicting LDAs.
- Asia > China > Shandong Province > Qingdao (0.04)
- Europe > United Kingdom (0.04)
- Europe > Ireland (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
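The HGATELDA abstract above describes a graph attention auto-encoder that aggregates the neighborhood information of nodes. The sketch below shows only a single-head graph attention aggregation step in NumPy, following the standard graph attention formulation. The function names, shapes, and the choice to add self-loops are assumptions made for illustration; the decoder, the training loss, and the authors' actual architecture are not represented.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def graph_attention_layer(H, A, W, a):
    """
    Single-head graph attention aggregation (illustrative sketch).

    H: (n, f_in) node features      A: (n, n) adjacency, nonzero means edge
    W: (f_in, f_out) projection     a: (2 * f_out,) attention vector
    Returns the aggregated features (n, f_out) and the attention weights (n, n).
    """
    Z = H @ W                                   # project node features
    f_out = Z.shape[1]
    # a^T [z_i || z_j] separates into a term for node i and a term for node j.
    src = Z @ a[:f_out]
    dst = Z @ a[f_out:]
    e = leaky_relu(src[:, None] + dst[None, :])
    # Assumption: each node attends to its neighbours and to itself.
    mask = (np.asarray(A) + np.eye(len(A))) > 0
    e = np.where(mask, e, -np.inf)
    e = e - e.max(axis=1, keepdims=True)        # stabilise the softmax
    alpha = np.exp(e)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ Z, alpha
```

Each row of `alpha` is a softmax over one node's neighborhood, so non-neighbors receive exactly zero weight and an isolated node simply keeps its own projected features.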
Deep Belief Network based representation learning for lncRNA-disease association prediction
Background: Expanding research on long non-coding RNAs (lncRNAs) has shown abnormal expression of lncRNAs in many complex diseases. Accurately identifying lncRNA-disease associations is essential for understanding lncRNA functionality and disease mechanisms. Many machine learning techniques have been applied to lncRNA-disease association prediction, using different biological interaction networks and associated features. Feature learning from network-structured data is one of the limiting factors of machine learning-based methods. Graph neural network based techniques address this limitation through unsupervised feature learning. Deep belief networks (DBNs) have recently been used in biological network analysis to learn latent representations of network features. Method: In this paper, we propose a DBN-based lncRNA-disease association prediction model (DBNLDA) built from lncRNA, disease, and miRNA interactions. The architecture contains three major modules: network construction, DBN-based feature learning, and neural network-based prediction. First, we constructed three heterogeneous networks: the lncRNA-miRNA similarity (LMS) network, the disease-miRNA similarity (DMS) network, and the lncRNA-disease association (LDA) network. From the node embedding matrices of the similarity networks, lncRNA-disease representations were learned separately by two DBN-based subnetworks. The joint lncRNA-disease representation was learned by a third DBN from the outputs of these two subnetworks. This joint feature representation was used by an ANN classifier to predict the association score. Result: The proposed method obtained an AUC of 0.96 and an AUPR of 0.967 when tested against the standard dataset used by state-of-the-art methods. Analysis of breast, lung, and stomach cancer cases also affirmed the effectiveness of DBNLDA in predicting significant lncRNA-disease associations.
- Asia > India (0.14)
- North America > United States > New York > Albany County > Albany (0.04)
- Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Lung Cancer (0.30)
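The DBNLDA abstract above relies on deep belief networks, which are conventionally built by stacking restricted Boltzmann machines (RBMs) and feeding each layer's hidden activations to the next. The following is a minimal NumPy sketch of one contrastive-divergence (CD-1) update for a binary RBM. All names and hyperparameters are hypothetical, and this is not the DBNLDA implementation, whose layer sizes and training procedure are not given in the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_vis, b_hid, lr=0.1, rng=None):
    """One CD-1 update for a binary RBM on a batch v0 of shape (n, n_vis)."""
    rng = np.random.default_rng() if rng is None else rng
    # Positive phase: hidden probabilities given the data, then a binary sample.
    h0_prob = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: reconstruct the visibles, then recompute the hiddens.
    v1_prob = sigmoid(h0 @ W.T + b_vis)
    h1_prob = sigmoid(v1_prob @ W + b_hid)
    n = v0.shape[0]
    # Move parameters towards the data statistics and away from the model's.
    W_new = W + lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
    b_vis_new = b_vis + lr * (v0 - v1_prob).mean(axis=0)
    b_hid_new = b_hid + lr * (h0_prob - h1_prob).mean(axis=0)
    return W_new, b_vis_new, b_hid_new

def hidden_representation(v, W, b_hid):
    """Hidden-unit probabilities, used as the input to the next stacked RBM."""
    return sigmoid(v @ W + b_hid)
```

To stack layers, one would train the first RBM with repeated `cd1_step` calls, pass the data through `hidden_representation`, and train the next RBM on that output.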
PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets
Deshpande, S., Shuttleworth, J., Yang, J., Taramonli, S., England, M.
Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs that play a significant role in several biological processes. RNA-seq based transcriptome sequencing has been used extensively for the identification of lncRNAs. Accurate identification of lncRNAs in RNA-seq datasets is crucial for exploring their characteristic functions in the genome, yet most coding potential computation (CPC) tools fail to identify them accurately in transcriptomic data. Well-known CPC tools such as CPC2, lncScore, and CPAT are primarily designed for prediction of lncRNAs based on the GENCODE, NONCODE, and CANTATAdb databases. The prediction accuracy of these tools often drops when they are tested on transcriptomic datasets. This leads to more false positives and to inaccuracy in the function annotation process. In this study, we present a novel tool, PLIT, for the identification of lncRNAs in plant RNA-seq datasets. PLIT implements a feature selection method based on L1 regularization and iterative Random Forests (iRF) classification for the selection of optimal features. Based on sequence and codon-bias features, it classifies RNA-seq derived FASTA sequences into coding or long non-coding transcripts. Using L1 regularization, 31 optimal features were obtained based on lncRNA and protein-coding transcripts from 8 plant species. The performance of the tool was evaluated on 7 plant RNA-seq datasets using 10-fold cross-validation. The analysis showed superior accuracy compared with currently available state-of-the-art CPC tools.
- North America > United States (0.14)
- Europe > United Kingdom > England > Warwickshire (0.04)
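The PLIT abstract above classifies FASTA sequences using sequence and codon-bias features. As an illustration of what such features can look like, here is a minimal standard-library Python sketch that parses FASTA text and computes GC content and in-frame codon frequencies. These two features and all function names are assumptions chosen for the example; they are not claimed to be among PLIT's 31 selected features.

```python
from collections import Counter

def read_fasta(text):
    """Parse FASTA-formatted text into a {record_id: sequence} dictionary."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # record id is the first header token
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records

def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0

def codon_frequencies(seq, frame=0):
    """Relative frequency of each complete codon in the given reading frame."""
    seq = seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}
```

A feature vector per transcript could then be assembled from these values and passed to a feature selection step such as L1-regularized regression.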