He, Juncai
Linear Regression on Manifold Structured Data: the Impact of Extrinsic Geometry on Solutions
Liu, Liangchen, He, Juncai, Tsai, Richard
In this paper, we study linear regression applied to data structured on a manifold. We assume that the data manifold is smooth and is embedded in a Euclidean space, and our objective is to reveal the impact of the data manifold's extrinsic geometry on the regression. Specifically, we analyze the impact of the manifold's curvatures (or higher order nonlinearity in the parameterization when the curvatures are locally zero) on the uniqueness of the regression solution. Our findings suggest that the corresponding linear regression does not have a unique solution when the embedded submanifold is flat in some dimensions. Otherwise, the manifold's curvature (or higher order nonlinearity in the embedding) may contribute significantly, particularly in the solution associated with the normal directions of the manifold. Our findings thus reveal the role of data manifold geometry in ensuring the stability of regression models for out-of-distribution inferences.
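As a rough illustration of the uniqueness claim above (not the paper's actual setting), the following numpy sketch regresses a scalar target on the ambient coordinates of points sampled from a flat versus a curved one-dimensional manifold embedded in $\mathbb R^2$; the rank of the design matrix distinguishes the non-unique case from the unique one. The target function and the parameterizations are invented for illustration.

```python
# Minimal sketch (not the paper's setup): regress a scalar target on the ambient
# coordinates of points lying on a 1-D manifold embedded in R^2, and check the
# uniqueness of the least-squares solution via the rank of the design matrix.
import numpy as np

t = np.linspace(-1.0, 1.0, 200)

def design_matrix(x1, x2):
    return np.column_stack([np.ones_like(x1), x1, x2])

# Flat embedding: the data lie on a straight line in R^2.
X_flat = design_matrix(t, 2.0 * t)     # rank 2 -> least-squares solution not unique
# Curved embedding: the data lie on a parabola (nonzero extrinsic curvature).
X_curv = design_matrix(t, t ** 2)      # rank 3 -> unique solution

y = np.sin(t)                           # some target defined on the manifold
for name, X in [("flat", X_flat), ("curved", X_curv)]:
    rank = np.linalg.matrix_rank(X)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)   # returns the minimum-norm solution if rank-deficient
    print(name, "design-matrix rank:", rank, "coefficients:", np.round(w, 3))
```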
Side Effects of Learning from Low-dimensional Data Embedded in a Euclidean Space
He, Juncai, Tsai, Richard, Ward, Rachel
The low-dimensional manifold hypothesis posits that the data found in many applications, such as those involving natural images, lie (approximately) on low-dimensional manifolds embedded in a high-dimensional Euclidean space. In this setting, a typical neural network defines a function that takes a finite number of vectors in the embedding space as input. However, one often needs to consider evaluating the optimized network at points outside the training distribution. This paper considers the case in which the training data is distributed in a linear subspace of $\mathbb R^d$. We derive estimates on the variation of the learning function, defined by a neural network, in the direction transversal to the subspace. We study the potential regularization effects associated with the network's depth and noise in the codimension of the data manifold. We also present additional side effects in training due to the presence of noise.
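The following hedged PyTorch sketch (not the paper's experiments; the network size and target are arbitrary) trains a small ReLU network on data confined to a two-dimensional coordinate subspace of $\mathbb R^3$ and then evaluates it along the transversal direction, where the variation studied in the paper can be observed numerically.

```python
# Hedged sketch: train a small ReLU MLP on data confined to the plane x3 = 0 in R^3,
# then probe how the learned function varies in the transversal direction x3,
# where no training data exist.
import torch

torch.manual_seed(0)
n, d = 512, 3
X = torch.zeros(n, d)
X[:, :2] = torch.rand(n, 2) * 2 - 1          # training inputs live in a 2-D subspace
y = torch.sin(X[:, 0]) + X[:, 1] ** 2        # target depends only on in-plane coordinates

model = torch.nn.Sequential(
    torch.nn.Linear(d, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(X).squeeze(-1), y)
    loss.backward()
    opt.step()

# Evaluate along the normal direction at a fixed in-plane point.
probe = torch.tensor([[0.3, -0.2, 0.0]]).repeat(5, 1)
probe[:, 2] = torch.linspace(0.0, 1.0, 5)    # move off the training subspace
with torch.no_grad():
    print(model(probe).squeeze(-1))          # off-subspace variation of the learned function
```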
FV-MgNet: Fully Connected V-cycle MgNet for Interpretable Time Series Forecasting
Zhu, Jianqing, He, Juncai, Zhang, Lian, Xu, Jinchao
By investigating iterative methods for a constrained linear model, we propose a new class of fully connected V-cycle MgNet models for long-term time series forecasting, one of the most difficult forecasting tasks. MgNet is a CNN model that was proposed for image classification, based on the multigrid (MG) methods for solving discretized partial differential equations (PDEs). We replace the convolutional operations in the existing MgNet with fully connected operations and then apply the resulting model to forecasting problems. Motivated by the V-cycle structure in MG, we further propose FV-MgNet, a V-cycle version of the fully connected MgNet, to extract features hierarchically. By evaluating the performance of FV-MgNet on popular data sets and comparing it with state-of-the-art models, we show that FV-MgNet achieves better results with less memory usage and faster inference speed. In addition, we conduct ablation experiments demonstrating that the structure of FV-MgNet is the best choice among the many variants.
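For orientation only, here is a minimal sketch of a fully connected V-cycle-style module; the widths, number of levels, and smoothing operators are invented for illustration and do not reproduce the FV-MgNet architecture described in the paper.

```python
# Illustrative sketch only: a fully connected "V-cycle"-style module that extracts
# features at progressively coarser widths and then returns up the hierarchy.
import torch
import torch.nn as nn

class FCVCycle(nn.Module):
    def __init__(self, widths=(96, 48, 24)):
        super().__init__()
        self.down = nn.ModuleList(
            nn.Linear(widths[i], widths[i + 1]) for i in range(len(widths) - 1)
        )
        self.up = nn.ModuleList(
            nn.Linear(widths[i + 1], widths[i]) for i in range(len(widths) - 1)
        )
        self.smooth = nn.ModuleList(nn.Linear(w, w) for w in widths)
        self.act = nn.ReLU()

    def forward(self, x):
        skips = []
        for i, restrict in enumerate(self.down):       # descend: smooth, then restrict
            x = self.act(self.smooth[i](x))
            skips.append(x)
            x = restrict(x)
        x = self.act(self.smooth[-1](x))                # coarsest-level smoothing
        for i in reversed(range(len(self.up))):         # ascend: prolongate and correct
            x = skips[i] + self.up[i](x)
        return x

x = torch.randn(8, 96)                                  # batch of flattened series features
print(FCVCycle()(x).shape)                              # torch.Size([8, 96])
```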
An Enhanced V-cycle MgNet Model for Operator Learning in Numerical Partial Differential Equations
Zhu, Jianqing, He, Juncai, Huang, Qiumei
This study applies a multigrid-based convolutional neural network architecture known as MgNet to operator learning for solving numerical partial differential equations (PDEs). Because low-frequency errors decay slowly under the smoothing iterations of multigrid methods, we introduce a low-frequency correction structure for residuals to enhance the standard V-cycle MgNet. The enhanced MgNet model captures the low-frequency features of solutions considerably better than the standard V-cycle MgNet. The numerical results obtained on several standard operator learning tasks are better than those of many state-of-the-art methods, demonstrating the efficiency of our model. Moreover, the new model is numerically more robust when trained on low-resolution data and tested on high-resolution data.
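As a generic illustration of the idea of correcting the low-frequency content of a residual (not the specific correction structure proposed in the paper), one can restrict a residual to a coarse grid and interpolate it back before adding it to the current feature:

```python
# Generic illustration, not the paper's correction structure: apply a crude
# "low-frequency correction" by restricting a residual to a coarse grid,
# interpolating it back, and adding it to the current feature.
import torch
import torch.nn.functional as F

def low_frequency_correction(u, r, coarse_size=8):
    """u, r: tensors of shape (batch, channels, H, W)."""
    r_coarse = F.adaptive_avg_pool2d(r, coarse_size)           # keep low-frequency content
    r_smooth = F.interpolate(r_coarse, size=r.shape[-2:],
                             mode="bilinear", align_corners=False)   # back to fine resolution
    return u + r_smooth                                        # corrected feature

u = torch.randn(2, 4, 64, 64)   # current features
r = torch.randn(2, 4, 64, 64)   # residual
print(low_frequency_correction(u, r).shape)   # torch.Size([2, 4, 64, 64])
```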
An Interpretive Constrained Linear Model for ResNet and MgNet
He, Juncai, Xu, Jinchao, Zhang, Lian, Zhu, Jianqing
We propose a constrained linear data-feature-mapping model as an interpretable mathematical model for image classification using a convolutional neural network (CNN). From this viewpoint, we establish detailed connections between the traditional iterative schemes for linear systems and the architectures of the basic blocks of ResNet- and MgNet-type models. Using these connections, we present modified ResNet models that, compared with the original models, have fewer parameters yet produce more accurate results, thereby demonstrating the validity of the constrained data-feature-mapping assumption. Based on this assumption, we further propose a general data-feature iterative scheme to show the rationality of MgNet. We also provide a systematic numerical study of MgNet to show its success in image classification problems and demonstrate its advantages over established networks.
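In simplified notation (our paraphrase, not the paper's exact formulation), the connection can be summarized by the classical residual-correction iteration $u^{i} = u^{i-1} + B^{i}(f - A u^{i-1})$ for a linear system $A u = f$: inserting convolutions and activations yields an MgNet-style block $u^{i} = u^{i-1} + \sigma\left(B^{i} \ast \sigma(f - A \ast u^{i-1})\right)$, whose identity-plus-correction form mirrors a ResNet block.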
A Weight Initialization Based on the Linear Product Structure for Neural Networks
Chen, Qipin, Hao, Wenrui, He, Juncai
Weight initialization plays an important role in training neural networks and affects a broad range of deep learning applications. Various weight initialization strategies have been developed for different activation functions and network architectures. These initialization algorithms are based on minimizing the variance of the parameters between layers and may still fail when networks are deep, e.g., due to dying ReLU. To address this challenge, we study neural networks from a nonlinear computation point of view and propose a novel weight initialization strategy based on the linear product structure (LPS) of neural networks. The proposed strategy is derived from the polynomial approximation of activation functions, using theories of numerical algebraic geometry to guarantee finding all local minima. We also provide a theoretical analysis showing that LPS initialization has a lower probability of dying ReLU compared to other existing initialization strategies. Finally, we test the LPS initialization algorithm on both fully connected and convolutional neural networks to show its feasibility, efficiency, and robustness on public datasets.
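The dying-ReLU failure mode mentioned above can be probed empirically; the sketch below (which does not implement the LPS scheme itself, and whose depths and widths are arbitrary) estimates the fraction of ReLU units that are inactive on every sampled input at initialization under a standard variance-based scheme.

```python
# Hedged sketch (does not implement LPS itself): estimate how many ReLU units are
# inactive on every input at initialization, for a standard variance-based scheme.
import torch
import torch.nn as nn

def dead_unit_fraction(depth, width=32, n_samples=1024):
    layers = []
    for _ in range(depth):
        lin = nn.Linear(width, width)
        nn.init.kaiming_normal_(lin.weight)   # standard He initialization
        nn.init.zeros_(lin.bias)
        layers += [lin, nn.ReLU()]
    net = nn.Sequential(*layers)
    with torch.no_grad():
        out = net(torch.randn(n_samples, width))
    # A unit counts as dead if it outputs zero for every sampled input.
    return (out <= 0).all(dim=0).float().mean().item()

torch.manual_seed(0)
for depth in (5, 20, 50):
    print(f"depth {depth:2d}: dead-unit fraction = {dead_unit_fraction(depth):.3f}")
```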
Approximation Properties of Deep ReLU CNNs
He, Juncai, Li, Lin, Xu, Jinchao
This paper is devoted to establishing $L^2$ approximation properties for deep ReLU convolutional neural networks (CNNs) in two-dimensional space. The analysis is based on a decomposition theorem for convolutional kernels with large spatial size and multiple channels. Given that decomposition and the properties of the ReLU activation function, a universal approximation theorem for deep ReLU CNNs with the classic structure is obtained by showing their connection with ReLU deep neural networks (DNNs) with one hidden layer. Furthermore, approximation properties are also obtained for networks with ResNet, pre-act ResNet, and MgNet architectures, based on connections between these networks.
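A much weaker, single-channel and purely linear version of the underlying idea can be checked numerically: composing two $3\times 3$ convolutions acts, away from the boundary, like a single $5\times 5$ convolution whose kernel is the convolution of the two small kernels. The decomposition theorem goes in the opposite direction and requires multiple channels; the scipy-based sketch below only verifies this elementary composition identity.

```python
# Single-channel, purely linear illustration (much weaker than the paper's theorem):
# two stacked 3x3 convolutions equal one 5x5 convolution in the interior of the image.
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
k1 = rng.standard_normal((3, 3))
k2 = rng.standard_normal((3, 3))
x = rng.standard_normal((16, 16))

two_small = convolve2d(convolve2d(x, k1, mode="same"), k2, mode="same")
one_large = convolve2d(x, convolve2d(k1, k2, mode="full"), mode="same")   # 5x5 kernel

interior = (slice(2, -2), slice(2, -2))   # boundary rows/columns differ due to zero padding
print(np.max(np.abs(two_small[interior] - one_large[interior])))   # ~ machine precision
```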
MgNet: A Unified Framework of Multigrid and Convolutional Neural Network
He, Juncai, Xu, Jinchao
We develop a unified model, known as MgNet, that simultaneously recovers some convolutional neural networks (CNNs) for image classification and multigrid (MG) methods for solving discretized partial differential equations (PDEs). This model is based on close connections that we have observed and uncovered between the CNN and MG methodologies. For example, the pooling operation and feature extraction in CNNs correspond directly to the restriction operation and iterative smoothers in MG, respectively. As the solution space is often the dual of the data space in PDEs, the analogous concepts of feature space and data space (which are dual to each other) are introduced for CNNs. With these connections and new concepts in the unified model, the function of the various convolution and pooling operations used in CNNs can be better understood. As a result, modified CNN models (with fewer weights and hyperparameters) are developed that exhibit competitive and sometimes better performance than existing CNN models on both the CIFAR-10 and CIFAR-100 data sets.
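The correspondence described above can be sketched in a few lines (simplified, not the released MgNet code; channel counts and the restriction operator are placeholders): feature extraction is a residual-correction smoothing step $u \leftarrow u + \sigma(B \ast \sigma(f - A \ast u))$, and pooling plays the role of restriction.

```python
# Minimal sketch of the CNN/MG correspondence (simplified placeholder, not MgNet itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MgNetBlock(nn.Module):
    def __init__(self, channels=8, smoothings=2):
        super().__init__()
        self.A = nn.Conv2d(channels, channels, 3, padding=1)   # data-feature mapping
        self.B = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(smoothings)
        )

    def forward(self, u, f):
        for B in self.B:                                      # "smoothing" iterations
            u = u + F.relu(B(F.relu(f - self.A(u))))
        # Pooling plays the role of restriction to the next (coarser) grid.
        return F.avg_pool2d(u, 2), F.avg_pool2d(f, 2)

f = torch.randn(1, 8, 32, 32)          # "data" on the fine grid
u = torch.zeros_like(f)                # initial features
u_coarse, f_coarse = MgNetBlock()(u, f)
print(u_coarse.shape, f_coarse.shape)  # both torch.Size([1, 8, 16, 16])
```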
Modified Regularized Dual Averaging Method for Training Sparse Convolutional Neural Networks
Jia, Xiaodong, Zhao, Liang, Zhang, Lian, He, Juncai, Xu, Jinchao
We propose a modified regularized dual averaging method for training sparse deep convolutional neural networks. The regularized dual averaging method has been proven effective for obtaining sparse solutions in convex optimization problems, but it had not previously been applied to deep learning. We analyze the new method in the convex setting and prove its convergence. The modified method obtains sparser solutions than traditional sparse optimization methods such as proximal-SGD, while keeping almost the same accuracy as the stochastic gradient method with momentum on certain datasets.
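For reference, the classical $\ell_1$-regularized dual averaging update that the paper modifies has a closed form (the modification itself is not reproduced here); the toy sketch below applies it to a quadratic problem with made-up data to illustrate how dual averaging produces exactly sparse iterates.

```python
# Sketch of the classical l1-regularized dual averaging (RDA) update, not the paper's
# modified method. Closed-form step with strongly convex auxiliary term (1/2)||w||^2:
#   w_{t+1,i} = -(sqrt(t)/gamma) * sign(gbar_i) * max(|gbar_i| - lam, 0),
# where gbar is the running average of (sub)gradients of the smooth part.
import numpy as np

def rda_step(gbar, t, lam=0.05, gamma=5.0):
    """One RDA update from the averaged gradient gbar at iteration t >= 1."""
    return -(np.sqrt(t) / gamma) * np.sign(gbar) * np.maximum(np.abs(gbar) - lam, 0.0)

# Toy usage: minimize 0.5 * ||w - w_star||^2 + lam * ||w||_1 with noisy gradients.
rng = np.random.default_rng(0)
w_star = np.array([1.0, 0.0, -0.5, 0.0])
w = np.zeros_like(w_star)
gbar = np.zeros_like(w_star)
for t in range(1, 501):
    g = (w - w_star) + 0.1 * rng.standard_normal(w.shape)   # stochastic gradient of the smooth part
    gbar += (g - gbar) / t                                   # running average of gradients
    w = rda_step(gbar, t)
print(np.round(w, 3))   # coordinates whose optimum is zero are typically exactly zero
```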