Xu, Sheng
AF2-Mutation: Adversarial Sequence Mutations against AlphaFold2 on Protein Tertiary Structure Prediction
Yuan, Zhongju, Shen, Tao, Xu, Sheng, Yu, Leiye, Ren, Ruobing, Sun, Siqi
Deep learning-based approaches, such as AlphaFold2 (AF2), have significantly advanced protein tertiary structure prediction, achieving accuracy comparable to experimental methods. Although AF2 is known to have limitations in predicting the effects of mutations, its robustness against sequence mutations remains to be determined. Starting from the wild-type (WT) sequence, we investigate adversarial sequences generated via an evolutionary approach, for which AF2 predicts structures substantially different from that of the WT. Our experiments on CASP14 reveal that by modifying merely three residues in the protein sequence, using a combination of replacement, deletion, and insertion strategies, the alteration in AF2's predictions, as measured by the Local Distance Difference Test (lDDT), reaches 46.61. Moreover, when applied to a specific protein, SPNS2, our proposed algorithm successfully identifies biologically meaningful residues critical to protein structure determination and potentially indicates alternative conformations, thus significantly expediting the experimental process.
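For illustration, a minimal sketch of the kind of evolutionary mutation search described above. The helpers predict_structure (an AF2 wrapper) and lddt (structure comparison) are hypothetical placeholders assumed here, not part of the paper's method or code.

```python
# Hypothetical sketch of an evolutionary search for adversarial mutations.
# `predict_structure` and `lddt` are assumed placeholders, not the paper's code.
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq: str) -> str:
    """Apply one random replacement, deletion, or insertion."""
    pos = random.randrange(len(seq))
    op = random.choice(["replace", "delete", "insert"])
    if op == "replace":
        return seq[:pos] + random.choice(AMINO_ACIDS) + seq[pos + 1:]
    if op == "delete" and len(seq) > 1:
        return seq[:pos] + seq[pos + 1:]
    return seq[:pos] + random.choice(AMINO_ACIDS) + seq[pos:]

def evolve(wt_seq, predict_structure, lddt, budget=3, pop_size=20):
    """Greedy evolutionary loop: keep mutants whose predicted structure
    deviates most (lowest lDDT) from the wild-type prediction."""
    wt_structure = predict_structure(wt_seq)
    population = [wt_seq]
    for _ in range(budget):                      # at most `budget` edits from WT
        candidates = [mutate(seq) for seq in population for _ in range(pop_size)]
        scored = [(lddt(wt_structure, predict_structure(s)), s) for s in candidates]
        scored.sort(key=lambda t: t[0])          # lower lDDT = larger deviation
        population = [s for _, s in scored[:pop_size]]
    return population[0]
```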
Resilient Binary Neural Network
Xu, Sheng, Li, Yanjing, Ma, Teli, Lin, Mingbao, Dong, Hao, Zhang, Baochang, Gao, Peng, Lv, Jinhu
Binary neural networks (BNNs) have attracted increasing attention for their ability to reduce storage requirements and speed up inference. However, they suffer a severe performance drop compared with real-valued networks, largely due to frequent weight oscillation during training. In this paper, we introduce a Resilient Binary Neural Network (ReBNN) to mitigate this oscillation and improve BNN training. We identify that the weight oscillation mainly stems from the non-parametric scaling factor. To address this issue, we propose to parameterize the scaling factor and introduce a weighted reconstruction loss to build an adaptive training objective. For the first time, we show that the weight oscillation is controlled by the balanced parameter attached to the reconstruction loss, which provides a theoretical foundation for parameterizing it in back-propagation. Based on this, we learn ReBNN by calculating the balanced parameter based on its maximum magnitude, which effectively mitigates the weight oscillation and yields a resilient training process. Extensive experiments are conducted on various network models, such as ResNet and Faster R-CNN for computer vision, as well as BERT for natural language processing. The results demonstrate that ReBNN consistently outperforms prior art. For example, ReBNN achieves 66.9% Top-1 accuracy with a ResNet-18 backbone on the ImageNet dataset, surpassing existing state-of-the-art methods by a significant margin. Our code is open-sourced at https://github.com/SteveTsui/ReBNN.
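As an illustration of a parameterized scaling factor and a weighted reconstruction loss, a minimal PyTorch-style sketch follows. It is not the authors' released implementation (see the repository above for that), and the fixed gamma below stands in for the paper's adaptively computed balanced parameter.

```python
# Illustrative sketch: binarization with a learnable (parameterized) scaling
# factor and a weighted reconstruction loss. Not the official ReBNN code.
import torch
import torch.nn as nn

class BinaryConv2d(nn.Conv2d):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # One learnable scaling factor per output channel, replacing the usual
        # non-parametric mean-of-absolute-values scaling.
        self.alpha = nn.Parameter(self.weight.abs().mean(dim=(1, 2, 3), keepdim=True))

    def forward(self, x):
        w = self.weight
        # Straight-through estimator: binarize in the forward pass, pass the
        # gradient through unchanged in the backward pass.
        w_bin = (torch.sign(w) - w).detach() + w
        return self._conv_forward(x, self.alpha * w_bin, self.bias)

def weighted_reconstruction_loss(layer: BinaryConv2d, gamma: float = 1e-4):
    """Penalize the mismatch between real-valued and binarized weights.
    Here gamma is a fixed hyperparameter; in the paper, the balanced
    parameter is computed adaptively from a maximum-magnitude statistic."""
    w = layer.weight
    w_bin = layer.alpha * torch.sign(w)
    return gamma * (w - w_bin).pow(2).sum()
```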
Collage: Automated Integration of Deep Learning Backends
Jeon, Byungsoo, Park, Sunghyun, Liao, Peiyuan, Xu, Sheng, Chen, Tianqi, Jia, Zhihao
The strong demand for efficient deployment of Deep Learning (DL) applications has prompted the rapid development of a rich DL ecosystem. To keep up with its fast advancement, it is crucial for DL frameworks to efficiently integrate a variety of optimized libraries and runtimes as their backends and to generate the fastest possible executable by using them properly. However, current DL frameworks require significant manual effort to integrate diverse backends and often fail to deliver high performance. In this paper, we propose Collage, an automatic framework for integrating DL backends. Collage provides a backend registration interface that allows users to precisely specify the capability of various backends. By leveraging the specifications of available backends, Collage searches for an optimized backend placement for a given workload and execution environment. Our evaluation shows that Collage automatically integrates multiple backends together without manual intervention, and outperforms existing frameworks by 1.21x, 1.39x, and 1.40x on two different NVIDIA GPUs and an Intel CPU, respectively.
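A hypothetical sketch of what a backend registration interface and placement search could look like. The names and structure below are illustrative only and do not reflect Collage's actual API; the greedy per-operator assignment is a simplification of its placement search.

```python
# Hypothetical backend registration and greedy placement; not Collage's API.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Backend:
    name: str
    supports: Callable[[str], bool]    # predicate over operator names
    measure: Callable[[str], float]    # measured latency of an op on this backend

REGISTRY: List[Backend] = []

def register_backend(backend: Backend) -> None:
    """Users declare what each backend can run and how to benchmark it."""
    REGISTRY.append(backend)

def place(ops: List[str]) -> Dict[str, str]:
    """Assign each operator to the registered backend with the lowest measured
    latency among those that support it (a greedy simplification)."""
    placement = {}
    for op in ops:
        candidates = [b for b in REGISTRY if b.supports(op)]
        if not candidates:
            raise ValueError(f"no registered backend supports {op}")
        placement[op] = min(candidates, key=lambda b: b.measure(op)).name
    return placement
```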
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Chen, Lin, Xu, Sheng
We prove that the reproducing kernel Hilbert spaces (RKHS) of a deep neural tangent kernel and the Laplace kernel include the same set of functions, when both kernels are restricted to the sphere $\mathbb{S}^{d-1}$. Additionally, we prove that the exponential power kernel with a smaller power (making the kernel more non-smooth) leads to a larger RKHS, when it is restricted to the sphere $\mathbb{S}^{d-1}$ and when it is defined on the entire $\mathbb{R}^d$.
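For concreteness, a short LaTeX restatement of the kernels and inclusions referenced above. The constants $c > 0$ and exponent $\gamma > 0$ are generic, and the notation is chosen here rather than taken from the paper.

```latex
% Illustrative definitions; notation chosen here, not from the paper.
\[
  k_{\mathrm{Lap}}(x,y) = \exp\bigl(-c\,\lVert x-y\rVert_2\bigr),
  \qquad
  k_{\gamma}(x,y) = \exp\bigl(-c\,\lVert x-y\rVert_2^{\gamma}\bigr).
\]
% Restricted to the sphere, the stated results read:
\[
  \text{on } \mathbb{S}^{d-1}:\quad
  \mathcal{H}\bigl(k_{\mathrm{NTK}}\bigr) = \mathcal{H}\bigl(k_{\mathrm{Lap}}\bigr)
  \ \text{as sets of functions},
  \qquad
  \gamma_1 < \gamma_2 \;\Longrightarrow\;
  \mathcal{H}\bigl(k_{\gamma_2}\bigr) \subseteq \mathcal{H}\bigl(k_{\gamma_1}\bigr).
\]
```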
Tree-Projected Gradient Descent for Estimating Gradient-Sparse Parameters on Graphs
Xu, Sheng, Fan, Zhou, Negahban, Sahand
We study estimation of a gradient-sparse parameter vector $\boldsymbol{\theta}^* \in \mathbb{R}^p$, having strong gradient-sparsity $s^*:=\|\nabla_G \boldsymbol{\theta}^*\|_0$ on an underlying graph $G$. Given observations $Z_1,\ldots,Z_n$ and a smooth, convex loss function $\mathcal{L}$ for which $\boldsymbol{\theta}^*$ minimizes the population risk $\mathbb{E}[\mathcal{L}(\boldsymbol{\theta};Z_1,\ldots,Z_n)]$, we propose to estimate $\boldsymbol{\theta}^*$ by a projected gradient descent algorithm that iteratively and approximately projects gradient steps onto spaces of vectors having small gradient-sparsity over low-degree spanning trees of $G$. We show that, under suitable restricted strong convexity and smoothness assumptions for the loss, the resulting estimator achieves the squared-error risk $\frac{s^*}{n} \log (1+\frac{p}{s^*})$ up to a multiplicative constant that is independent of $G$. In contrast, previous polynomial-time algorithms have only been shown to achieve this guarantee in more specialized settings, or under additional assumptions for $G$ and/or the sparsity pattern of $\nabla_G \boldsymbol{\theta}^*$. As applications of our general framework, we apply our results to the examples of linear models and generalized linear models with random design.
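A schematic Python sketch of the projected gradient descent loop described above. Here approx_project stands in for the paper's approximate projection onto gradient-sparse vectors over low-degree spanning trees of $G$; it is only a placeholder, not an implementation of that subroutine.

```python
# Schematic tree-projected gradient descent; `approx_project` is a placeholder.
import numpy as np

def tree_projected_gd(grad_loss, theta0, approx_project, s_target,
                      step_size=0.1, n_iters=200):
    """Iterate: gradient step on the smooth convex loss, then approximate
    projection onto { theta : ||grad_G theta||_0 <= s_target } computed
    over a low-degree spanning tree of G (delegated to approx_project)."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(n_iters):
        theta = theta - step_size * grad_loss(theta)   # gradient step
        theta = approx_project(theta, s_target)        # tree-based projection
    return theta

# Example loss gradient for a linear model: L(theta) = ||y - X theta||^2 / (2n)
def make_linear_grad(X, y):
    n = X.shape[0]
    return lambda theta: X.T @ (X @ theta - y) / n
```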
Iterative Alpha Expansion for estimating gradient-sparse signals from linear measurements
Xu, Sheng, Fan, Zhou
We consider estimating a piecewise-constant image, or a gradient-sparse signal on a general graph, from noisy linear measurements. We propose and study an iterative algorithm to minimize a penalized least-squares objective, with a penalty given by the "$\ell_0$-norm" of the signal's discrete graph gradient. The method proceeds by approximate proximal descent, applying the alpha-expansion procedure to minimize a proximal objective in each iteration, and using a geometric decay of the penalty parameter across iterations. Under a cut-restricted isometry property for the measurement design, we prove global recovery guarantees for the estimated signal. For standard Gaussian designs, the required number of measurements is independent of the graph structure, and improves upon worst-case guarantees for total-variation (TV) compressed sensing on the 1-D and 2-D lattice graphs by polynomial and logarithmic factors, respectively. The method empirically yields lower mean-squared recovery error than TV regularization in regimes of moderate undersampling and moderate to high signal-to-noise ratios, for several examples of changepoint signals and gradient-sparse phantom images.
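A schematic Python sketch of the proximal-descent iteration with a geometrically decaying penalty. The function alpha_expansion_prox is a placeholder for the alpha-expansion subroutine and is not implemented here; all names are assumptions made for illustration.

```python
# Schematic proximal descent with geometric penalty decay; `alpha_expansion_prox`
# is a placeholder for the alpha-expansion graph-cut subroutine.
import numpy as np

def iterative_alpha_expansion(X, y, alpha_expansion_prox, lam0,
                              decay=0.8, n_iters=50, step_size=None):
    """Approximate proximal descent on
        0.5 * ||y - X theta||^2 + lam * ||grad_G theta||_0,
    with the penalty parameter lam decayed geometrically across iterations."""
    n, p = X.shape
    if step_size is None:
        step_size = 1.0 / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant
    theta = np.zeros(p)
    lam = lam0
    for _ in range(n_iters):
        grad = X.T @ (X @ theta - y)                  # gradient of the LS loss
        z = theta - step_size * grad                  # plain gradient step
        # alpha_expansion_prox(z, pen) is assumed to approximately solve
        #   min_theta  0.5 * ||theta - z||^2 + pen * ||grad_G theta||_0
        theta = alpha_expansion_prox(z, lam * step_size)
        lam *= decay                                  # geometric decay of penalty
    return theta
```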