AITopics | Wang, Yadong

Collaborating Authors

Wang, Yadong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MSNGO: multi-species protein function annotation based on 3D protein structure and network propagation

Wang, Beibei, Cui, Boyue, Chen, Shiqu, Wang, Xuan, Wang, Yadong, Li, Junyi

arXiv.org Artificial IntelligenceMar-29-2025

Motivation: In recent years, protein function prediction has broken through the bottleneck of sequence features, significantly improving prediction accuracy using high-precision protein structures predicted by AlphaFold2. While single-species protein function prediction methods have achieved remarkable success, multi-species protein function prediction methods are still in the stage of using PPI networks and sequence features. Providing effective cross-species label propagation for species with sparse protein annotations remains a challenging issue. To address this problem, we propose the MSNGO model, which integrates structural features and network propagation methods. Our validation shows that using structural features can significantly improve the accuracy of multi-species protein function prediction. Results: We employ graph representation learning techniques to extract amino acid representations from protein structure contact maps and train a structural model using a graph convolution pooling module to derive protein-level structural features. After incorporating the sequence features from ESM-2, we apply a network propagation algorithm to aggregate information and update node representations within a heterogeneous network. The results demonstrate that MSNGO outperforms previous multi-species protein function prediction methods that rely on sequence features and PPI networks. Availability: https://github.com/blingbell/MSNGO.

artificial intelligence, bioinformatics, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2503.23014

Country:

Asia > China (0.30)
North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Add feedback

Combinatorial Testing for Deep Learning Systems

Ma, Lei, Zhang, Fuyuan, Xue, Minhui, Li, Bo, Liu, Yang, Zhao, Jianjun, Wang, Yadong

arXiv.org Artificial IntelligenceJun-20-2018

Deep learning (DL) has achieved remarkable progress over the past decade and been widely applied to many safety-critical applications. However, the robustness of DL systems recently receives great concerns, such as adversarial examples against computer vision systems, which could potentially result in severe consequences. Adopting testing techniques could help to evaluate the robustness of a DL system and therefore detect vulnerabilities at an early stage. The main challenge of testing such systems is that its runtime state space is too large: if we view each neuron as a runtime state for DL, then a DL system often contains massive states, rendering testing each state almost impossible. For traditional software, combinatorial testing (CT) is an effective testing technique to reduce the testing space while obtaining relatively high defect detection abilities. In this paper, we perform an exploratory study of CT on DL systems. We adapt the concept in CT and propose a set of coverage criteria for DL systems, as well as a CT coverage guided test generation technique. Our evaluation demonstrates that CT provides a promising avenue for testing DL systems. We further pose several open questions and interesting directions for combinatorial testing of DL systems.

combinatorial testing, deep learning, neural network, (22 more...)

arXiv.org Artificial Intelligence

1806.07723

Country:

North America > United States (0.46)
Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.94)
Information Technology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

DeepGauge: Comprehensive and Multi-Granularity Testing Criteria for Gauging the Robustness of Deep Learning Systems

Ma, Lei, Juefei-Xu, Felix, Sun, Jiyuan, Chen, Chunyang, Su, Ting, Zhang, Fuyuan, Xue, Minhui, Li, Bo, Li, Li, Liu, Yang, Zhao, Jianjun, Wang, Yadong

arXiv.org Machine LearningMar-20-2018

Deep learning defines a new data-driven programming paradigm that constructs the internal system logic of a crafted neuron network through a set of training data. Deep learning (DL) has been widely adopted in many safety-critical scenarios. However, a plethora of studies have shown that the state-of-the-art DL systems suffer from various vulnerabilities which can lead to severe consequences when applied to real-world applications. Currently, the robustness of a DL system against adversarial attacks is usually measured by the accuracy of test data. Considering the limitation of accessible test data, good performance on test data can hardly guarantee the robustness and generality of DL systems. Different from traditional software systems which have clear and controllable logic and functionality, a DL system is trained with data and lacks thorough understanding. This makes it difficult for system analysis and defect detection, which could potentially hinder its real-world deployment without safety guarantees. In this paper, we propose DeepGauge, a comprehensive and multi-granularity testing criteria for DL systems, which renders a complete and multi-faceted portrayal of the testbed. The in-depth evaluation of our proposed testing criteria is demonstrated on two well-known datasets, five DL systems, with four state-of-the-art adversarial data generation techniques. The effectiveness of DeepGauge sheds light on the construction of robust DL systems.

deep learning, neural network, test orig, (21 more...)

arXiv.org Machine Learning

1803.07519

Country:

Asia (0.92)
North America > United States > New York (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.81)

Industry:

Information Technology > Security & Privacy (0.49)
Government > Military (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Comparison among dimensionality reduction techniques based on Random Projection for cancer classification

Xie, Haozhe, Li, Jie, Zhang, Qiaosheng, Wang, Yadong

arXiv.org Machine LearningJun-17-2017

Random Projection (RP) technique has been widely applied in many scenarios because it can reduce high-dimensional features into low-dimensional space within short time and meet the need of real-time analysis of massive data. There is an urgent need of dimensionality reduction with fast increase of big genomics data. However, the performance of RP is usually lower. We attempt to improve classification accuracy of RP through combining other reduction dimension methods such as Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Feature Selection (FS). We compared classification accuracy and running time of different combination methods on three microarray datasets and a simulation dataset. Experimental results show a remarkable improvement of 14.77% in classification accuracy of FS followed by RP compared to RP on BC-TCGA dataset. LDA followed by RP also helps RP to yield a more discriminative subspace with an increase of 13.65% on classification accuracy on the same dataset. FS followed by RP outperforms other combination methods in classification accuracy on most of the datasets.

classification accuracy, health & medicine, oncology, (16 more...)

arXiv.org Machine Learning

doi: 10.1016/j.compbiolchem.2016.09.010

1608.07019

Country:

North America > United States > New Jersey (0.14)
Asia > China > Heilongjiang Province (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.62)

Add feedback