AITopics | Yang, Yunfei

Plotting

Yang, Yunfei

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Model-Guardian: Protecting against Data-Free Model Stealing Using Gradient Representations and Deceptive Predictions

Yang, Yunfei, Chen, Xiaojun, Xuan, Yuexin, Zhao, Zhendong

arXiv.org Artificial IntelligenceMar-23-2025

Model stealing attack is increasingly threatening the confidentiality of machine learning models deployed in the cloud. Recent studies reveal that adversaries can exploit data synthesis techniques to steal machine learning models even in scenarios devoid of real data, leading to data-free model stealing attacks. Existing defenses against such attacks suffer from limitations, including poor effectiveness, insufficient generalization ability, and low comprehensiveness. In response, this paper introduces a novel defense framework named Model-Guardian. Comprising two components, Data-Free Model Stealing Detector (DFMS-Detector) and Deceptive Predictions (DPreds), Model-Guardian is designed to address the shortcomings of current defenses with the help of the artifact properties of synthetic samples and gradient representations of samples. Extensive experiments on seven prevalent data-free model stealing attacks showcase the effectiveness and superior generalization ability of Model-Guardian, outperforming eleven defense methods and establishing a new state-of-the-art performance. Notably, this work pioneers the utilization of various GANs and diffusion models for generating highly realistic query samples in attacks, with Model-Guardian demonstrating accurate detection capabilities.

artificial intelligence, gradient representation and deceptive prediction, machine learning, (3 more...)

arXiv.org Artificial Intelligence

2503.18081

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

On the rates of convergence for learning with convolutional neural networks

Yang, Yunfei, Feng, Han, Zhou, Ding-Xuan

arXiv.org Machine LearningApr-8-2024

We study approximation and learning capacities of convolutional neural networks (CNNs) with one-side zero-padding and multiple channels. Our first result proves a new approximation bound for CNNs with certain constraint on the weights. Our second result gives new analysis on the covering number of feed-forward neural networks with CNNs as special cases. The analysis carefully takes into account the size of the weights and hence gives better bounds than the existing literature in some situations. Using these two results, we are able to derive rates of convergence for estimators based on CNNs in many learning problems. In particular, we establish minimax optimal convergence rates of the least squares based on CNNs for learning smooth functions in the nonparametric regression setting. For binary classification, we derive convergence rates for CNN classifiers with hinge loss and logistic loss. It is also shown that the obtained rates for classification are minimax optimal in some common settings.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Machine Learning

2403.16459

Country: Asia > China > Hong Kong (0.14)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

Add feedback

Optimal rates of approximation by shallow ReLU$^k$ neural networks and applications to nonparametric regression

Yang, Yunfei, Zhou, Ding-Xuan

arXiv.org Machine LearningJan-8-2024

We study the approximation capacity of some variation spaces corresponding to shallow ReLU$^k$ neural networks. It is shown that sufficiently smooth functions are contained in these spaces with finite variation norms. For functions with less smoothness, the approximation rates in terms of the variation norm are established. Using these results, we are able to prove the optimal approximation rates in terms of the number of neurons for shallow ReLU$^k$ neural networks. It is also shown how these results can be used to derive approximation bounds for deep neural networks and convolutional neural networks (CNNs). As applications, we study convergence rates for nonparametric regression using three ReLU neural network models: shallow neural network, over-parameterized neural network, and CNN. In particular, we show that shallow neural networks can achieve the minimax optimal rates for learning H\"older functions, which complements recent results for deep neural networks. It is also proven that over-parameterized (deep or shallow) neural networks can achieve nearly optimal rates for nonparametric regression.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Machine Learning

2304.01561

Country: Asia > China > Hong Kong (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Universality and approximation bounds for echo state networks with random weights

Li, Zhen, Yang, Yunfei

arXiv.org Machine LearningDec-3-2023

We study the uniform approximation of echo state networks with randomly generated internal weights. These models, in which only the readout weights are optimized during training, have made empirical success in learning dynamical systems. Recent results showed that echo state networks with ReLU activation are universal. In this paper, we give an alternative construction and prove that the universality holds for general activation functions. Specifically, our main result shows that, under certain condition on the activation function, there exists a sampling procedure for the internal weights so that the echo state network can approximate any continuous casual time-invariant operators with high probability. In particular, for ReLU activation, we give explicit construction for these sampling procedures. We also quantify the approximation error of the constructed ReLU echo state networks for sufficiently regular operators.

artificial intelligence, machine learning, state network, (17 more...)

arXiv.org Machine Learning

2206.05669

Country:

Asia > China (0.28)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Nonparametric regression using over-parameterized shallow ReLU neural networks

Yang, Yunfei, Zhou, Ding-Xuan

arXiv.org Artificial IntelligenceJun-14-2023

It is shown that over-parameterized neural networks can achieve minimax optimal rates of convergence (up to logarithmic factors) for learning functions from certain smooth function classes, if the weights are suitably constrained or regularized. Specifically, we consider the nonparametric regression of estimating an unknown $d$-variate function by using shallow ReLU neural networks. It is assumed that the regression function is from the H\"older space with smoothness $\alpha<(d+3)/2$ or a variation space corresponding to shallow neural networks, which can be viewed as an infinitely wide neural network. In this setting, we prove that least squares estimators based on shallow neural networks with certain norm constraints on the weights are minimax optimal, if the network width is sufficiently large. As a byproduct, we derive a new size-independent bound for the local Rademacher complexity of shallow ReLU neural networks, which may be of independent interest.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2306.08321

Country:

Asia > China > Hong Kong (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Add feedback

Convergence Analysis of the Deep Galerkin Method for Weak Solutions

Jiao, Yuling, Lai, Yanming, Wang, Yang, Yang, Haizhao, Yang, Yunfei

arXiv.org Artificial IntelligenceFeb-5-2023

This paper analyzes the convergence rate of a deep Galerkin method for the weak solution (DGMW) of second-order elliptic partial differential equations on $\mathbb{R}^d$ with Dirichlet, Neumann, and Robin boundary conditions, respectively. In DGMW, a deep neural network is applied to parametrize the PDE solution, and a second neural network is adopted to parametrize the test function in the traditional Galerkin formulation. By properly choosing the depth and width of these two networks in terms of the number of training samples $n$, it is shown that the convergence rate of DGMW is $\mathcal{O}(n^{-1/d})$, which is the first convergence result for weak solutions. The main idea of the proof is to divide the error of the DGMW into an approximation error and a statistical error. We derive an upper bound on the approximation error in the $H^{1}$ norm and bound the statistical error via Rademacher complexity.

artificial intelligence, machine learning, neural network, (15 more...)

arXiv.org Artificial Intelligence

2302.02405

Country:

Asia (0.68)
North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Approximation bounds for norm constrained neural networks with applications to regression and GANs

Jiao, Yuling, Wang, Yang, Yang, Yunfei

arXiv.org Machine LearningJan-23-2022

This paper studies the approximation capacity of ReLU neural networks with norm constraint on the weights. We prove upper and lower bounds on the approximation error of these networks for smooth function classes. The lower bound is derived through the Rademacher complexity of neural networks, which may be of independent interest. We apply these approximation bounds to analyze the convergence of regression using norm constrained neural networks and distribution estimation by GANs. In particular, we obtain convergence rates for over-parameterized neural networks. It is also shown that GANs can achieve optimal rate of learning probability distributions, when the discriminator is a properly chosen norm constrained neural network.

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Machine Learning

2201.09418

Country:

Asia > China > Hong Kong (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Non-Asymptotic Error Bounds for Bidirectional GANs

Liu, Shiao, Yang, Yunfei, Huang, Jian, Jiao, Yuling, Wang, Yang

arXiv.org Machine LearningOct-23-2021

We derive nearly sharp bounds for the bidirectional GAN (BiGAN) estimation error under the Dudley distance between the latent joint distribution and the data joint distribution with appropriately specified architecture of the neural networks used in the model. To the best of our knowledge, this is the first theoretical guarantee for the bidirectional GAN learning approach. An appealing feature of our results is that they do not assume the reference and the data distributions to have the same dimensions or these distributions to have bounded support. These assumptions are commonly assumed in the existing convergence analysis of the unidirectional GANs but may not be satisfied in practice. Our results are also applicable to the Wasserstein bidirectional GAN if the target distribution is assumed to have a bounded support. To prove these results, we construct neural network functions that push forward an empirical distribution to another arbitrary empirical distribution on a possibly different-dimensional space. We also develop a novel decomposition of the integral probability metric for the error analysis of bidirectional GANs. These basic theoretical results are of independent interest and can be applied to other related learning problems.

artificial intelligence, machine learning, neural network, (17 more...)

arXiv.org Machine Learning

2110.12319

Country:

Asia > China (0.46)
North America > United States > Iowa > Johnson County > Iowa City (0.14)

Genre: Research Report > New Finding (0.54)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

An error analysis of generative adversarial networks for learning distributions

Huang, Jian, Jiao, Yuling, Li, Zhen, Liu, Shiao, Wang, Yang, Yang, Yunfei

arXiv.org Machine LearningJun-11-2021

This paper studies how well generative adversarial networks (GANs) learn probability distributions from finite samples. Our main results establish the convergence rates of GANs under a collection of integral probability metrics defined through H\"older classes, including the Wasserstein distance as a special case. We also show that GANs are able to adaptively learn data distributions with low-dimensional structures or have H\"older densities, when the network architectures are chosen properly. In particular, for distributions concentrated around a low-dimensional set, we show that the learning rates of GANs do not depend on the high ambient dimension, but on the lower intrinsic dimension. Our analysis is based on a new oracle inequality decomposing the estimation error into the generator and discriminator approximation error and the statistical error, which may be of independent interest.

approximation error, artificial intelligence, neural network, (14 more...)

arXiv.org Machine Learning

2105.1301

Country:

Asia > China (0.92)
North America > United States > Iowa > Johnson County > Iowa City (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

On the capacity of deep generative networks for approximating distributions

Yang, Yunfei, Li, Zhen, Wang, Yang

arXiv.org Machine LearningJan-28-2021

We study the efficacy and efficiency of deep generative networks for approximating probability distributions. We prove that neural networks can transform a one-dimensional source distribution to a distribution that is arbitrarily close to a high-dimensional target distribution in Wasserstein distances. Upper bounds of the approximation error are obtained in terms of neural networks' width and depth. It is shown that the approximation error grows at most linearly on the ambient dimension and that the approximation order only depends on the intrinsic dimension of the target distribution. On the contrary, when $f$-divergences are used as metrics of distributions, the approximation property is different. We prove that in order to approximate the target distribution in $f$-divergences, the dimension of the source distribution cannot be smaller than the intrinsic dimension of the target distribution. Therefore, $f$-divergences are less adequate than Waserstein distances as metrics of distributions for generating samples.

artificial intelligence, dimension, neural network, (16 more...)

arXiv.org Machine Learning

2101.12353

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback