Not enough data to create a plot.
Try a different view from the menu above.
Chen, Weijie
FFF: Fragments-Guided Flexible Fitting for Building Complete Protein Structures
Chen, Weijie, Wang, Xinyan, Wang, Yuhang
Cryo-electron microscopy (cryo-EM) is a technique for reconstructing the 3-dimensional (3D) structure of biomolecules (especially large protein complexes and molecular assemblies). As the resolution increases to the near-atomic scale, building protein structures de novo from cryo-EM maps becomes possible. Recently, recognition-based de novo building methods have shown the potential to streamline this process. However, it cannot build a complete structure due to the low signal-to-noise ratio (SNR) problem. At the same time, AlphaFold has led to a great breakthrough in predicting protein structures. This has inspired us to combine fragment recognition and structure prediction methods to build a complete structure. In this paper, we propose a new method named FFF that bridges protein structure prediction and protein structure recognition with flexible fitting. First, a multi-level recognition network is used to capture various structural features from the input 3D cryo-EM map. Next, protein structural fragments are generated using pseudo peptide vectors and a protein sequence alignment method based on these extracted features. Finally, a complete structural model is constructed using the predicted protein fragments via flexible fitting. Based on our benchmark tests, FFF outperforms the baseline methods for building complete protein structures.
Twin support vector quantile regression
Ye, Yafen, Xu, Zhihu, Zhang, Jinhua, Chen, Weijie, Shao, Yuanhai
We propose a twin support vector quantile regression (TSVQR) to capture the heterogeneous and asymmetric information in modern data. Using a quantile parameter, TSVQR effectively depicts the heterogeneous distribution information with respect to all portions of data points. Correspondingly, TSVQR constructs two smaller sized quadratic programming problems (QPPs) to generate two nonparallel planes to measure the distributional asymmetry between the lower and upper bounds at each quantile level. The QPPs in TSVQR are smaller and easier to solve than those in previous quantile regression methods. Moreover, the dual coordinate descent algorithm for TSVQR also accelerates the training speed. Experimental results on six artiffcial data sets, ffve benchmark data sets, two large scale data sets, two time-series data sets, and two imbalanced data sets indicate that the TSVQR outperforms previous quantile regression methods in terms of the effectiveness of completely capturing the heterogeneous and asymmetric information and the efffciency of the learning process.
Fast and Robust Rank Aggregation against Model Misspecification
Pan, Yuangang, Chen, Weijie, Niu, Gang, Tsang, Ivor W., Sugiyama, Masashi
In rank aggregation, preferences from different users are summarized into a total order under the homogeneous data assumption. Thus, model misspecification arises and rank aggregation methods take some noise models into account. However, they all rely on certain noise model assumptions and cannot handle agnostic noises in the real world. In this paper, we propose CoarsenRank, which rectifies the underlying data distribution directly and aligns it to the homogeneous data assumption without involving any noise model. To this end, we define a neighborhood of the data distribution over which Bayesian inference of CoarsenRank is performed, and therefore the resultant posterior enjoys robustness against model misspecification. Further, we derive a tractable closed-form solution for CoarsenRank making it computationally efficient. Experiments on real-world datasets show that CoarsenRank is fast and robust, achieving consistent improvement over baseline methods.