AITopics | Guo, Zheng-Chu

Guo, Zheng-Chu

Spectral Algorithms under Covariate Shift

Fan, Jun, Guo, Zheng-Chu, Shi, Lei

arXiv.org Machine Learning, Apr-17-2025

Spectral algorithms leverage spectral regularization techniques to analyze and process data, providing a flexible framework for addressing supervised learning problems. To deepen our understanding of their performance in real-world scenarios where the distributions of training and test data may differ, we conduct a rigorous investigation into the convergence behavior of spectral algorithms under distribution shifts, specifically within the framework of reproducing kernel Hilbert spaces. Our study focuses on the case of covariate shift. In this scenario, the marginal distributions of the input data differ between the training and test datasets, while the conditional distribution of the output given the input remains unchanged. Under this setting, we analyze the generalization error of spectral algorithms and show that they achieve minimax optimality when the density ratios between the training and test distributions are uniformly bounded. However, we also identify a critical limitation: when the density ratios are unbounded, the spectral algorithms may become suboptimal. To address this limitation, we propose a weighted spectral algorithm that incorporates density ratio information into the learning process. Our theoretical analysis shows that this weighted approach achieves optimal capacity-independent convergence rates. Furthermore, by introducing a weight clipping technique, we demonstrate that the convergence rates of the weighted spectral algorithm can approach the optimal capacity-dependent convergence rates arbitrarily closely. This improvement resolves the suboptimality issue in unbounded density ratio scenarios and advances the state-of-the-art by refining existing theoretical results.
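The weighting-and-clipping idea in this abstract can be illustrated with a minimal sketch. It assumes kernel ridge regression (Tikhonov regularization) as one representative spectral algorithm, a Gaussian kernel, and density ratios that are already known; the names clipped_weighted_krr, solve, ratios, and clip are hypothetical and are not taken from the paper, whose estimator may differ in its details. Each sample's squared loss is weighted by min(ratio, clip), which leads to the linear system (W K + n*lam*I) alpha = W y.

```python
import math


def gaussian_kernel(x, z, width=0.2):
    """Gaussian kernel on the real line (the kernel choice is an assumption)."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def clipped_weighted_krr(xs, ys, ratios, lam, clip, kernel=gaussian_kernel):
    """Weighted kernel ridge regression with clipped importance weights.

    Minimizes (1/n) * sum_i w_i * (f(x_i) - y_i)^2 + lam * ||f||_K^2
    with w_i = min(ratios[i], clip), where ratios[i] stands for the
    test/training density ratio at x_i (assumed known here).
    """
    n = len(xs)
    w = [min(r, clip) for r in ratios]
    # First-order condition: (W K + n * lam * I) alpha = W y
    a = [[w[i] * kernel(xs[i], xs[j]) + (n * lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [w[i] * ys[i] for i in range(n)]
    alpha = solve(a, b)

    def f(x):
        return sum(al * kernel(z, x) for al, z in zip(alpha, xs))

    return f


# Usage: three training points, one of which has a very large density ratio.
f_hat = clipped_weighted_krr(
    xs=[0.1, 0.5, 0.9], ys=[0.2, 1.0, 0.1],
    ratios=[0.5, 1.0, 40.0], lam=0.01, clip=5.0,
)
prediction = f_hat(0.7)
```

In this sketch, clipping caps the influence of any single sample at the level clip. Setting clip to infinity recovers the plain importance-weighted estimator, and setting all ratios to 1 recovers unweighted kernel ridge regression.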

artificial intelligence, inductive learning, machine learning, (18 more...)

2504.12625

Country: Asia > China (0.46)

Genre: Research Report (0.63)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.34)

Stochastic Gradient Descent for Two-layer Neural Networks

Cao, Dinghao, Guo, Zheng-Chu, Shi, Lei

arXiv.org Machine Learning, Jul-10-2024

This paper presents a comprehensive study on the convergence rates of the stochastic gradient descent (SGD) algorithm when applied to overparameterized two-layer neural networks. Our approach combines the Neural Tangent Kernel (NTK) approximation with convergence analysis in the Reproducing Kernel Hilbert Space (RKHS) generated by NTK, aiming to provide a deep understanding of the convergence behavior of SGD in overparameterized two-layer neural networks. Our research framework enables us to explore the intricate interplay between kernel methods and optimization processes, shedding light on the optimization dynamics and convergence properties of neural networks. In this study, we establish sharp convergence rates for the last iterate of the SGD algorithm in overparameterized two-layer neural networks. Additionally, we have made significant advancements in relaxing the constraints on the number of neurons, which have been reduced from exponential dependence to polynomial dependence on the sample size or number of iterations. This improvement allows for more flexibility in the design and scaling of neural networks, and will deepen our theoretical understanding of neural network models trained with SGD.

artificial intelligence, machine learning, neural network, (14 more...)

2407.0767

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Optimality of Robust Online Learning

Guo, Zheng-Chu, Christmann, Andreas, Shi, Lei

arXiv.org Artificial Intelligence, Apr-19-2023

In this paper, we study an online learning algorithm with a robust loss function $\mathcal{L}_{\sigma}$ for regression over a reproducing kernel Hilbert space (RKHS). The loss function $\mathcal{L}_{\sigma}$, which involves a scaling parameter $\sigma>0$, covers a wide range of commonly used robust losses. The proposed algorithm is thus a robust alternative to online least squares regression for estimating the conditional mean function. For properly chosen $\sigma$ and step size, we show that the last iterate of this online algorithm can achieve optimal capacity-independent convergence in the mean square distance. Moreover, if additional information on the underlying function space is known, we also establish optimal capacity-dependent rates for strong convergence in the RKHS. To the best of our knowledge, both results are new to the existing literature on online learning.

artificial intelligence, machine learning, (15 more...)

2304.1006

Country: Asia > China (0.28)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.92)

Online Regularized Learning Algorithm for Functional Data

Mao, Yuan, Guo, Zheng-Chu

arXiv.org Artificial Intelligence, Nov-24-2022

In recent years, functional linear models have attracted growing attention in statistics and machine learning, with the aim of recovering the slope function or its functional predictor. This paper considers an online regularized learning algorithm for functional linear models in reproducing kernel Hilbert spaces. Convergence analyses of the excess prediction error and the estimation error are provided with polynomially decaying step-size and constant step-size, respectively. Fast convergence rates can be derived via a capacity-dependent analysis. By introducing an explicit regularization term, we lift the saturation boundary of unregularized online learning algorithms when the step-size decays polynomially, and establish fast convergence rates for the estimation error without a capacity assumption. However, obtaining capacity-independent convergence rates for the estimation error of the unregularized online learning algorithm with decaying step-size remains an open problem. We also show that the convergence rates of both the prediction error and the estimation error with constant step-size are competitive with those in the literature.

algorithm, artificial intelligence, machine learning, (16 more...)

2211.13549

Genre:

Research Report (0.82)
Instructional Material > Online (0.60)

Industry: Education > Educational Setting > Online (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Realizing data features by deep nets

Guo, Zheng-Chu, Shi, Lei, Lin, Shao-Bo

arXiv.org Machine Learning, Jan-1-2019

This paper considers the power of deep neural networks (deep nets for short) in realizing data features. Based on refined covering number estimates, we find that, to realize some complex data features, deep nets can improve on the performance of shallow neural networks (shallow nets for short) without requiring additional capacity costs. This verifies the advantage of deep nets in realizing complex features. On the other hand, to realize some simple data features such as smoothness, we prove that, up to a logarithmic factor, the approximation rate of deep nets is asymptotically identical to that of shallow nets, provided that the depth is fixed. This exhibits a limitation of deep nets in realizing simple features.

deep learning, deep net, neural network, (18 more...)

1901.0013

Country: Asia > China (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Fast and Strong Convergence of Online Learning Algorithms

Guo, Zheng-Chu, Shi, Lei

arXiv.org Machine Learning, Oct-10-2017

In this paper, we study an online learning algorithm without explicit regularization terms. This algorithm is essentially a stochastic gradient descent scheme in a reproducing kernel Hilbert space (RKHS). The polynomially decaying step size in each iteration can play the role of regularization to ensure the generalization ability of the online learning algorithm. We develop a novel capacity-dependent analysis of the performance of the last iterate of the online learning algorithm. The contribution of this paper is two-fold. First, our analysis leads to a convergence rate in the standard mean square distance that is the best so far. Second, we establish, for the first time, the strong convergence of the last iterate with polynomially decaying step sizes in the RKHS norm. We demonstrate that the theoretical analysis established in this paper fully exploits the fine structure of the underlying RKHS, and thus can lead to sharp error estimates for the online learning algorithm.
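The scheme described in this abstract, stochastic gradient descent in an RKHS with a polynomially decaying step size and no explicit regularization term, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the Gaussian kernel, the least squares loss, the constants eta1 and theta, and the synthetic sine target are all choices made here. The update is f_{t+1} = f_t - eta_t * (f_t(x_t) - y_t) * K(x_t, .) with eta_t = eta1 * t^(-theta), and the returned function is the last iterate.

```python
import math
import random


def gaussian_kernel(x, z, width=0.2):
    """Gaussian kernel on the real line (the kernel choice is an assumption)."""
    return math.exp(-((x - z) ** 2) / (2.0 * width ** 2))


def kernel_sgd(samples, eta1=0.5, theta=0.5, kernel=gaussian_kernel):
    """Unregularized online least squares in an RKHS; returns the last iterate.

    The iterate is stored as a kernel expansion
    f_t = sum_j coeffs[j] * K(centers[j], .), starting from f_1 = 0.
    """
    centers, coeffs = [], []
    for t, (x, y) in enumerate(samples, start=1):
        # Prediction of the current iterate at the new input x_t.
        pred = sum(c * kernel(z, x) for z, c in zip(centers, coeffs))
        eta = eta1 * t ** (-theta)  # polynomially decaying step size
        # Gradient step adds one kernel function centered at x_t.
        centers.append(x)
        coeffs.append(-eta * (pred - y))

    def f(x):
        return sum(c * kernel(z, x) for z, c in zip(centers, coeffs))

    return f


# Usage on synthetic data: noisy observations of a sine curve on [0, 1].
random.seed(0)


def target(x):
    return math.sin(2.0 * math.pi * x)


train = []
for _ in range(500):
    x = random.random()
    train.append((x, target(x) + random.gauss(0.0, 0.1)))

f_last = kernel_sgd(train)
grid = [i / 50 for i in range(51)]
mse = sum((f_last(x) - target(x)) ** 2 for x in grid) / len(grid)
```

No penalty term appears in the update; in this sketch the decay exponent theta is the only quantity that controls how strongly later samples can change the estimate, which is the sense in which the step size acts as regularization.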

algorithm, computer based training, educational technology, (19 more...)

1710.036

Country:

Asia > China (0.46)
North America > United States > New York (0.14)

Genre: Research Report (0.64)

Industry: Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Learning from networked examples

Wang, Yuyi, Ramon, Jan, Guo, Zheng-Chu

arXiv.org Artificial Intelligence, Jun-3-2017

Many machine learning algorithms are based on the assumption that training examples are drawn independently. However, this assumption no longer holds when learning from a networked sample, because two or more training examples may share some common objects and hence share the features of these objects. We show that the classic approach of ignoring this problem can harm the accuracy of statistics, and then consider alternatives. One of these is to use only independent examples, discarding other information. However, this is clearly suboptimal. We analyze sample error bounds in this networked setting, providing significantly improved results. An important component of our approach is formed by efficient sample weighting schemes, which lead to novel concentration inequalities.

artificial intelligence, inductive learning, inequality, (17 more...)

1405.26

Country:

Europe > United Kingdom > England (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.69)

Learning from networked examples in a k-partite graph

Wang, Yuyi, Ramon, Jan, Guo, Zheng-Chu

arXiv.org Machine Learning, Feb-17-2017

Many machine learning algorithms are based on the assumption that training examples are drawn independently. However, this assumption no longer holds when learning from a networked sample where two or more training examples may share common features. We propose an efficient weighting method for learning from networked examples and derive a sample error bound that improves on previous work.

artificial intelligence, inductive learning, sample error, (20 more...)

1306.0393

Country: Europe (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Generalization Bounds for Metric and Similarity Learning

Cao, Qiong, Guo, Zheng-Chu, Ying, Yiming

arXiv.org Machine Learning, Mar-17-2013

Recently, metric learning and similarity learning have attracted a large amount of interest. Many models and optimisation algorithms have been proposed. However, there is relatively little work on the generalization analysis of such methods. In this paper, we derive novel generalization bounds of metric and similarity learning. In particular, we first show that the generalization analysis reduces to the estimation of the Rademacher average over "sums-of-i.i.d." sample-blocks related to the specific matrix norm. Then, we derive generalization bounds for metric/similarity learning with different matrix-norm regularisers by estimating their specific Rademacher complexities. Our analysis indicates that sparse metric/similarity learning with $L^1$-norm regularisation could lead to significantly better bounds than those with Frobenius-norm regularisation. Our novel generalization analysis develops and refines the techniques of U-statistics and Rademacher complexity analysis.

artificial intelligence, evolutionary algorithm, similarity learning, (17 more...)

1207.5437

Country:

North America > United States (0.14)
Europe > United Kingdom (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
