AITopics | Bao, Chenglong

Collaborating Authors

Bao, Chenglong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Regularized Newton Method for Nonconvex Optimization with Global and Local Complexity Guarantees

Zhou, Yuhao, Xu, Jintao, Bao, Chenglong, Ding, Chao, Zhu, Jun

arXiv.org Artificial IntelligenceFeb-14-2025

We consider the problem of finding an $\epsilon$-stationary point of a nonconvex function with a Lipschitz continuous Hessian and propose a quadratic regularized Newton method incorporating a new class of regularizers constructed from the current and previous gradients. The method leverages a recently developed linear conjugate gradient approach with a negative curvature monitor to solve the regularized Newton equation. Notably, our algorithm is adaptive, requiring no prior knowledge of the Lipschitz constant of the Hessian, and achieves a global complexity of $O(\epsilon^{-\frac{3}{2}}) + \tilde O(1)$ in terms of the second-order oracle calls, and $\tilde O(\epsilon^{-\frac{7}{4}})$ for Hessian-vector products, respectively. Moreover, when the iterates converge to a point where the Hessian is positive definite, the method exhibits quadratic local convergence. Preliminary numerical results illustrate the competitiveness of our algorithm.

artificial intelligence, lemma 3, optimization problem, (13 more...)

arXiv.org Artificial Intelligence

2502.04799

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.61)

Add feedback

Convection-Diffusion Equation: A Theoretically Certified Framework for Neural Networks

Wang, Tangjun, Bao, Chenglong, Shi, Zuoqiang

arXiv.org Artificial IntelligenceMar-23-2024

In this paper, we study the partial differential equation models of neural networks. Neural network can be viewed as a map from a simple base model to a complicate function. Based on solid analysis, we show that this map can be formulated by a convection-diffusion equation. This theoretically certified framework gives mathematical foundation and more understanding of neural networks. Moreover, based on the convection-diffusion equation model, we design a novel network structure, which incorporates diffusion mechanism into network architecture. Extensive experiments on both benchmark datasets and real-world applications validate the performance of the proposed model.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2403.15726

Country:

Europe > United Kingdom (0.14)
North America > United States (0.14)
Asia > Cambodia (0.14)

Genre: Research Report > New Finding (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Energy > Oil & Gas > Upstream (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Add feedback

Reconstruction of dynamical systems from data without time labels

Zeng, Zhijun, Hu, Pipi, Bao, Chenglong, Zhu, Yi, Shi, Zuoqiang

arXiv.org Artificial IntelligenceDec-6-2023

In this paper, we study the method to reconstruct dynamical systems from data without time labels. Data without time labels appear in many applications, such as molecular dynamics, single-cell RNA sequencing etc. Reconstruction of dynamical system from time sequence data has been studied extensively. However, these methods do not apply if time labels are unknown. Without time labels, sequence data becomes distribution data. Based on this observation, we propose to treat the data as samples from a probability distribution and try to reconstruct the underlying dynamical system by minimizing the distribution loss, sliced Wasserstein distance more specifically. Extensive experiment results demonstrate the effectiveness of the proposed method.

data mining, machine learning, trajectory, (21 more...)

arXiv.org Artificial Intelligence

2312.04038

Country: Asia > China (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Scientific Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)

Add feedback

Addressing preferred orientation in single-particle cryo-EM through AI-generated auxiliary particles

Zhang, Hui, Zheng, Dihan, Wu, Qiurong, Yan, Nieng, Shi, Zuoqiang, Hu, Mingxu, Bao, Chenglong

arXiv.org Artificial IntelligenceSep-26-2023

The single-particle cryo-EM field faces the persistent challenge of preferred orientation, lacking general computational solutions. We introduce cryoPROS, an AI-based approach designed to address the above issue. By generating the auxiliary particles with a conditional deep generative model, cryoPROS addresses the intrinsic bias in orientation estimation for the observed particles. We effectively employed cryoPROS in the cryo-EM single particle analysis of the hemagglutinin trimer, showing the ability to restore the near-atomic resolution structure on non-tilt data. Moreover, the enhanced version named cryoPROS-MP significantly improves the resolution of the membrane protein NaX using the no-tilted data that contains the effects of micelles. Compared to the classical approaches, cryoPROS does not need special experimental or image acquisition techniques, providing a purely computational yet effective solution for the preferred orientation problem. Finally, we conduct extensive experiments that establish the low risk of model bias and the high robustness of cryoPROS.

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2309.14954

Country:

Asia > China (0.30)
North America > United States (0.28)
Africa > South Africa > Western Cape > Indian Ocean (0.24)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

An axiomatized PDE model of deep neural networks

Wang, Tangjun, Tao, Wenqi, Bao, Chenglong, Shi, Zuoqiang

arXiv.org Artificial IntelligenceJul-23-2023

Inspired by the relation between deep neural network (DNN) and partial differential equations (PDEs), we study the general form of the PDE models of deep neural networks. To achieve this goal, we formulate DNN as an evolution operator from a simple base model. Based on several reasonable assumptions, we prove that the evolution operator is actually determined by convection-diffusion equation. This convection-diffusion equation model gives mathematical explanation for several effective networks. Moreover, we show that the convection-diffusion model improves the robustness and reduces the Rademacher complexity. Based on the convection-diffusion equation, we design a new training method for ResNets. Experiments validate the performance of the proposed method.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2307.12333

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas > Upstream (0.75)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Diffusion Mechanism in Residual Neural Network: Theory and Applications

Wang, Tangjun, Dou, Zehao, Bao, Chenglong, Shi, Zuoqiang

arXiv.org Artificial IntelligenceApr-28-2023

Diffusion, a fundamental internal mechanism emerging in many physical processes, describes the interaction among different objects. In many learning tasks with limited training samples, the diffusion connects the labeled and unlabeled data points and is a critical component for achieving high classification accuracy. Many existing deep learning approaches directly impose the fusion loss when training neural networks. In this work, inspired by the convection-diffusion ordinary differential equations (ODEs), we propose a novel diffusion residual network (Diff-ResNet), internally introduces diffusion into the architectures of neural networks. Under the structured data assumption, it is proved that the proposed diffusion block can increase the distance-diameter ratio that improves the separability of inter-class points and reduces the distance among local intra-class points. Moreover, this property can be easily adopted by the residual networks for constructing the separable hyperplanes. Extensive experiments of synthetic binary classification, semi-supervised graph node classification and few-shot image classification in various datasets validate the effectiveness of the proposed method.

artificial intelligence, diffusion, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2105.03155

Country:

Asia (0.28)
North America > United States (0.28)

Genre:

Research Report (0.64)
Workflow (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Convergence Rates of Training Deep Neural Networks via Alternating Minimization Methods

Xu, Jintao, Bao, Chenglong, Xing, Wenxun

arXiv.org Artificial IntelligenceApr-4-2023

Training deep neural networks (DNNs) is an important and challenging optimization problem in machine learning due to its non-convexity and non-separable structure. The alternating minimization (AM) approaches split the composition structure of DNNs and have drawn great interest in the deep learning and optimization communities. In this paper, we propose a unified framework for analyzing the convergence rate of AM-type network training methods. Our analysis is based on the non-monotone $j$-step sufficient decrease conditions and the Kurdyka-Lojasiewicz (KL) property, which relaxes the requirement of designing descent algorithms. We show the detailed local convergence rate if the KL exponent $\theta$ varies in $[0,1)$. Moreover, the local R-linear convergence is discussed under a stronger $j$-step sufficient decrease condition.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2208.14318

Country: Asia > China (0.15)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Interpolation between Residual and Non-Residual Networks

Yang, Zonghan, Liu, Yang, Bao, Chenglong, Shi, Zuoqiang

arXiv.org Machine LearningAug-16-2020

Although ordinary differential equations (ODEs) provide insights for designing network architectures, its relationship with the non-residual convolutional neural networks (CNNs) is still unclear. In this paper, we present a novel ODE model by adding a damping term. It can be shown that the proposed model can recover both a ResNet and a CNN by adjusting an interpolation coefficient. Therefore, the damped ODE model provides a unified framework for the interpretation of residual and non-residual networks. The Lyapunov analysis reveals better stability of the proposed model, and thus yields robustness improvement of the learned networks. Experiments on a number of image classification benchmarks show that the proposed model substantially improves the accuracy of ResNet and ResNeXt over the perturbed inputs from both stochastic noise and adversarial attack methods. Moreover, the loss landscape analysis demonstrates the improved robustness of our method along the attack direction.

deep learning, neural network, robustness, (16 more...)

arXiv.org Machine Learning

2006.05749

Country: Asia > China (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Brain-inspired reverse adversarial examples

Ye, Shaokai, Tan, Sia Huat, Xu, Kaidi, Wang, Yanzhi, Bao, Chenglong, Ma, Kaisheng

arXiv.org Artificial IntelligenceMay-27-2019

A human does not have to see all elephants to recognize an animal as an elephant. On contrast, current state-of-the-art deep learning approaches heavily depend on the variety of training samples and the capacity of the network. In practice, the size of network is always limited and it is impossible to access all the data samples. Under this circumstance, deep learning models are extremely fragile to human-imperceivable adversarial examples, which impose threats to all safety critical systems. Inspired by the association and attention mechanisms of the human brain, we propose reverse adversarial examples method that can greatly improve models' robustness on unseen data. Experiments show that our reverse adversarial method can improve accuracy on average 19.02% on ResNet18, MobileNet, and VGG16 on unseen data transformation. Besides, the proposed method is also applicable to compressed models and shows potential to compensate the robustness drop brought by model quantization - an absolute 30.78% accuracy improvement.

calibrater, deep learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

1905.12171

Country: Asia (0.14)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.97)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models

Zhang, Linfeng, Tan, Zhanhong, Song, Jiebo, Chen, Jingwei, Bao, Chenglong, Ma, Kaisheng

arXiv.org Machine LearningMay-27-2019

Remarkable achievements have been attained by deep neural networks in various applications. However, the increasing depth and width of such models also lead to explosive growth in both storage and computation, which has restricted the deployment of deep neural networks on resource-limited edge devices. To address this problem, we propose the so-called SCAN framework for networks training and inference, which is orthogonal and complementary to existing acceleration and compression methods. The proposed SCAN firstly divides neural networks into multiple sections according to their depth and constructs shallow classifiers upon the intermediate features of different sections. Moreover, attention modules and knowledge distillation are utilized to enhance the accuracy of shallow classifiers. Based on this architecture, we further propose a threshold controlled scalable inference mechanism to approach human-like sample-specific inference. Experimental results show that SCAN can be easily equipped on various neural networks without any adjustment on hyper-parameters or neural networks architectures, yielding significant performance gain on CIFAR100 and ImageNet. Codes will be released on github soon.

classifier, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

1906.03951

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback