Multidimensional Scaling for Gene Sequence Data with Autoencoders

Apr-18-2021–arXiv.org Artificial Intelligence

Multidimensional scaling (MDS) has long played a vital role in analysing gene sequence data to identify clusters and patterns. However, the computational complexity and memory requirements of state-of-the-art dimensional scaling algorithms make them infeasible to apply to large datasets. In this paper, we present an autoencoder-based dimension reduction model that scales easily to datasets containing millions of gene sequences while attaining results comparable to state-of-the-art MDS algorithms with minimal resource requirements. The model also supports out-of-sample data points, with an accuracy of 99.5% or higher in our experiments. The proposed model is evaluated against DAMDS on a real-world fungi gene sequence dataset. The results demonstrate the effectiveness of the autoencoder-based dimension reduction model and its advantages.
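
To make the memory concern concrete, here is a rough back-of-the-envelope sketch. It assumes the MDS algorithm holds a dense N x N pairwise distance matrix with 8-byte entries. Both the dense layout and the entry size are assumptions chosen for illustration, not figures from the paper, and the function name is hypothetical.

```python
def dense_distance_matrix_bytes(n_sequences: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed to hold a full n x n pairwise distance matrix.

    Assumption (not from the paper): the matrix is stored densely,
    with one fixed-size entry per ordered pair of sequences.
    """
    return n_sequences * n_sequences * bytes_per_entry


if __name__ == "__main__":
    # Memory grows quadratically with the number of sequences.
    for n in (10_000, 100_000, 1_000_000):
        gigabytes = dense_distance_matrix_bytes(n) / 1e9
        print(f"{n:>9,} sequences: {gigabytes:,.1f} GB")
```

Under these assumptions, one million sequences would need about 8 TB for the distance matrix alone, which illustrates why an approach that avoids storing the full matrix is attractive at that scale.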

autoencoder, dataset, sequence

Country:
- North America
  - United States > Indiana
    - Monroe County > Bloomington (0.04)
  - Canada > Ontario
    - Waterloo Region > Waterloo (0.04)
- Europe > Sweden
  - Uppsala County > Uppsala (0.04)

Genre:
- Research Report (0.82)

Industry:
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:
- Information Technology
  - Biomedical Informatics > Translational Bioinformatics (1.00)
  - Artificial Intelligence > Machine Learning
    - Neural Networks > Deep Learning (0.94)
