Zhao, Chenchao
AffinityFlow: Guided Flows for Antibody Affinity Maturation
Chen, Can, Herpoldt, Karla-Luise, Zhao, Chenchao, Wang, Zichen, Collins, Marcus, Shang, Shang, Benson, Ron
Antibodies are widely used as therapeutics, but their development requires costly affinity maturation, involving iterative mutations to enhance binding affinity.This paper explores a sequence-only scenario for affinity maturation, using solely antibody and antigen sequences. Recently AlphaFlow wraps AlphaFold within flow matching to generate diverse protein structures, enabling a sequence-conditioned generative model of structure. Building on this, we propose an alternating optimization framework that (1) fixes the sequence to guide structure generation toward high binding affinity using a structure-based affinity predictor, then (2) applies inverse folding to create sequence mutations, refined by a sequence-based affinity predictor for post selection. A key challenge is the lack of labeled data for training both predictors. To address this, we develop a co-teaching module that incorporates valuable information from noisy biophysical energies into predictor refinement. The sequence-based predictor selects consensus samples to teach the structure-based predictor, and vice versa. Our method, AffinityFlow, achieves state-of-the-art performance in affinity maturation experiments. We plan to open-source our code after acceptance.
Quantum transport senses community structure in networks
Zhao, Chenchao, Song, Jun S.
Quantum time evolution exhibits rich physics, attributable to the interplay between the density and phase of a wave function. However, unlike classical heat diffusion, the wave nature of quantum mechanics has not yet been extensively explored in modern data analysis. We propose that the Laplace transform of quantum transport (QT) can be used to construct an ensemble of maps from a given complex network to a circle $S^1$, such that closely-related nodes on the network are grouped into sharply concentrated clusters on $S^1$. The resulting QT clustering (QTC) algorithm is as powerful as the state-of-the-art spectral clustering in discerning complex geometric patterns and more robust when clusters show strong density variations or heterogeneity in size. The observed phenomenon of QTC can be interpreted as a collective behavior of the microscopic nodes that evolve as macroscopic cluster orbitals in an effective tight-binding model recapitulating the network. Python source code implementing the algorithm and examples are available at https://github.com/jssong-lab/QTC.
Exact heat kernel on a hypersphere and its applications in kernel SVM
Zhao, Chenchao, Song, Jun S.
Many contemporary statistical learning methods assume a Euclidean feature space. This paper presents a method for defining similarity based on hyperspherical geometry and shows that it often improves the performance of support vector machine compared to other competing similarity measures. Specifically, the idea of using heat diffusion on a hypersphere to measure similarity has been previously proposed, demonstrating promising results based on a heuristic heat kernel obtained from the zeroth order parametrix expansion; however, how well this heuristic kernel agrees with the exact hyperspherical heat kernel remains unknown. This paper presents a higher order parametrix expansion of the heat kernel on a unit hypersphere and discusses several problems associated with this expansion method. We then compare the heuristic kernel with an exact form of the heat kernel expressed in terms of a uniformly and absolutely convergent series in high-dimensional angular momentum eigenmodes. Being a natural measure of similarity between sample points dwelling on a hypersphere, the exact kernel often shows superior performance in kernel SVM classifications applied to text mining, tumor somatic mutation imputation, and stock market analysis.