Transformer-based toxin-protein interaction analysis prioritizes airborne particulate matter components with potential adverse health effects

Zhu, Yan, Wang, Shihao, Han, Yong, Lu, Yao, Qiu, Shulan, Jin, Ling, Li, Xiangdong, Zhang, Weixiong

arXiv.org Artificial Intelligence 

Air pollution, particularly airborne particulate matter (PM), poses a significant threat to public health globally. Understanding the association between PM-associated toxic components and their cellular targets in humans is crucial for explaining the mechanisms by which air pollution impacts health and for establishing causal relationships between air pollution and public health consequences. Current methods for modeling and analyzing these interactions are rudimentary, and experimental approaches offer limited throughput and comprehensiveness. Leveraging deep learning, we developed tipFormer (toxin-protein interaction prediction based on transformer), a novel machine-learning approach for identifying toxic components capable of penetrating human cells and instigating pathogenic biological activities and signaling cascades. tipFormer incorporates dual pre-trained language models to derive encodings for protein sequences and chemicals, and employs a convolutional encoder to assimilate the sequential attributes of both. It then introduces a novel learning module with a cross-attention mechanism to decode and elucidate the multifaceted interactions at the hotspots where proteins and chemicals bind. Through thorough experimentation, tipFormer was shown to be proficient in capturing interactions between proteins and toxic components. This approach offers significant value to the air quality and toxicology research communities by enabling high-throughput, high-content identification and prioritization of hazards.

Keywords: Air pollution, toxin-protein interaction, computational modeling, attention mechanisms

1. Introduction

Air pollution has emerged as a critical global health concern, primarily driven by rapid economic, industrial and population growth and further exacerbated by climate change and other non-anthropogenic factors [1]. The World Health Organization estimates that approximately 7 million premature deaths occur every year due to air pollution exposure. The consequences of air pollution extend far beyond individual health implications and exacerbate the strain on societal and healthcare systems in numerous ways [2]. The health risks associated with airborne particulate matter (PM) are particularly concerning for public health [3].
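
The cross-attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not tipFormer's implementation: the abstract gives no layer details, so the example below shows only generic scaled dot-product cross-attention in plain Python, without learned projection matrices, multiple heads, or the convolutional encoder. The function names (`softmax`, `cross_attention`) and the toy vectors (`chem_tokens`, `protein_residues`) are hypothetical stand-ins for the language-model encodings.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (no learned weights).

    queries: one embedding per chemical token   (n_q x d)
    keys:    one embedding per protein residue  (n_k x d)
    values:  one embedding per protein residue  (n_k x d_v)
    Returns (context, weights): context is n_q x d_v, weights is n_q x n_k.
    """
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    context, weights = [], []
    for q in queries:
        # Similarity of this chemical token to every protein residue.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of the residue value vectors.
        context.append(
            [sum(wj * v[c] for wj, v in zip(w, values))
             for c in range(len(values[0]))]
        )
    return context, weights


# Toy, made-up embeddings (assumption: real encodings would come from
# pre-trained protein and chemical language models).
chem_tokens = [[1.0, 0.0], [0.0, 1.0]]
protein_residues = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

context, weights = cross_attention(chem_tokens, protein_residues, protein_residues)
```

Each row of `weights` is a probability distribution over protein residues for one chemical token. In an interaction model of this kind, large entries are the sort of signal one might inspect when looking for candidate binding hotspots.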