Multi-modal Protein Knowledge Graph Construction and Applications
Cheng, Siyuan, Liang, Xiaozhuan, Bi, Zhen, Chen, Huajun, Zhang, Ningyu
arXiv.org Artificial Intelligence
Existing data-centric methods for protein science generally do not sufficiently capture and leverage biological knowledge, which may be crucial for many protein tasks. To facilitate research in this field, we create ProteinKG65, a knowledge graph for protein science. Using the Gene Ontology and the UniProt knowledge base as a basis, we transform and integrate various kinds of knowledge, aligning descriptions with GO terms and protein sequences with protein entities. ProteinKG65 is mainly dedicated to providing a specialized protein knowledge graph that brings Gene Ontology knowledge to protein function and structure prediction. We also illustrate the potential applications of ProteinKG65 with a prototype. Our dataset can be downloaded at https://w3id.org/proteinkg65.
Nov-14-2022
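The abstract describes a graph in which GO terms carry aligned descriptions and protein entities carry aligned sequences. As a rough illustration only, a minimal in-memory sketch of that idea might look like the following. The class names, relation labels, identifiers, sequence, and descriptions are all hypothetical placeholders, not the actual ProteinKG65 schema or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOTerm:
    term_id: str        # placeholder ID, not a real GO accession
    description: str    # text description aligned to the GO term


@dataclass(frozen=True)
class Protein:
    protein_id: str     # placeholder ID, not a real UniProt accession
    sequence: str       # amino-acid sequence aligned to the protein entity


@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str


class ProteinGraph:
    """Toy knowledge graph: entities keyed by ID plus a set of triples."""

    def __init__(self):
        self.proteins = {}
        self.go_terms = {}
        self.triples = set()

    def add_protein(self, protein):
        self.proteins[protein.protein_id] = protein

    def add_go_term(self, term):
        self.go_terms[term.term_id] = term

    def _known(self, entity_id):
        return entity_id in self.proteins or entity_id in self.go_terms

    def add_triple(self, head, relation, tail):
        # Reject triples that mention entities missing from the graph.
        if not (self._known(head) and self._known(tail)):
            raise KeyError(f"unknown entity in triple: {head!r}, {tail!r}")
        self.triples.add(Triple(head, relation, tail))

    def go_descriptions_for(self, protein_id):
        """Descriptions of GO terms directly linked from a protein."""
        return sorted(
            self.go_terms[t.tail].description
            for t in self.triples
            if t.head == protein_id and t.tail in self.go_terms
        )


# Hypothetical example data (assumptions for illustration only).
graph = ProteinGraph()
graph.add_protein(Protein("PROTEIN_A", "MKTAYIAK"))
graph.add_go_term(GOTerm("GO:EXAMPLE_1", "placeholder molecular function"))
graph.add_go_term(GOTerm("GO:EXAMPLE_2", "placeholder parent function"))
graph.add_triple("PROTEIN_A", "enables", "GO:EXAMPLE_1")   # protein-GO triple
graph.add_triple("GO:EXAMPLE_1", "is_a", "GO:EXAMPLE_2")   # GO-GO triple

print(graph.go_descriptions_for("PROTEIN_A"))
```

Keeping sequences and descriptions on the entities rather than on the triples means a downstream model could look up either modality from an entity ID alone. Whether the released dataset is organized this way is an assumption of the sketch.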