SBSM-Pro: Support Bio-sequence Machine for Proteins

Wang, Yizheng, Zhai, Yixiao, Ding, Yijie, Zou, Quan

arXiv.org Artificial Intelligence 

Bio-sequences, which include DNA, RNA, and proteins, are the molecular foundation of modern genetic research. Classifying bio-sequences from sequence information alone has long been a key focus of bioinformatics. As genome mapping has been completed for humans and then for many other species, a vast amount of sequence data has accumulated, creating an urgent need for computer-assisted annotation of sequence function. Although it is statistically evident that genetic sequences determine hereditary diseases, the mechanisms by which sequence variations contribute to disease are complex. These questions are difficult to address and interpret through a single biological experiment; hence, multiple computational predictions are needed to guide wet-lab exploration. In summary, applying information science and machine learning to bio-sequence classification helps researchers comprehend and analyse bio-sequences, and it is a key driving force for advancing bioinformatics research. Machine learning methods for bio-sequence classification broadly follow two strategies: feature extraction combined with traditional classification methods, and direct sequence classification via deep learning techniques. For bio-sequences, the relevant features fall mainly into four groups: frequency, physicochemical, structural, and evolutionary features.
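
To make the first strategy concrete, the following is a minimal sketch of a frequency feature for proteins: a normalized k-mer count vector that a traditional classifier could consume. It is an illustration only, not the method used by SBSM-Pro; the function name `kmer_frequencies`, the 20-letter alphabet constant, and the choice to ignore k-mers containing non-standard residues are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_frequencies(sequence, k=1):
    """Return the normalized k-mer frequency vector of a protein sequence.

    The vector has 20**k entries, ordered like itertools.product over
    AMINO_ACIDS. K-mers containing non-standard residues are ignored
    (a simplifying assumption of this sketch).
    """
    sequence = sequence.upper()
    # Enumerate every possible k-mer so the vector has a fixed layout.
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    # Count overlapping k-mers with a sliding window of width k.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        # Sequence shorter than k, or no valid k-mers at all.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    vector = kmer_frequencies("MKVLAAGIVGL", k=1)
    print(len(vector))
```

With `k=1` this yields the amino acid composition (20 values); with `k=2` it yields dipeptide frequencies (400 values). Physicochemical, structural, and evolutionary features would need additional inputs, such as property tables or alignments, and are not covered by this sketch.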
