AITopics | Lan, Yuan

Lan, Yuan

Characteristic Guidance: Non-linear Correction for Diffusion Model at Large Guidance Scale

Zheng, Candi, Lan, Yuan

arXiv.org Artificial Intelligence, Feb-1-2024

Popular guidance for denoising diffusion probabilistic models (DDPMs) linearly combines distinct conditional models to provide enhanced control over samples. However, this approach overlooks nonlinear effects that become significant when the guidance scale is large. To address this issue, we propose characteristic guidance, a guidance method that provides a first-principles non-linear correction for classifier-free guidance. This correction forces the guided DDPMs to respect the Fokker-Planck (FP) equation of the diffusion process, in a way that is training-free and compatible with existing sampling methods. Experiments show that characteristic guidance enhances the semantic characteristics of prompts and mitigates irregularities in image generation, proving effective in diverse applications ranging from simulating magnet phase transitions to latent space sampling.
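As a rough illustration of the linear combination that the abstract refers to, the sketch below blends a conditional and an unconditional noise prediction using a guidance scale. The function name `linear_guidance`, the list-of-floats representation, and the convention `uncond + scale * (cond - uncond)` are assumptions made for this example (one commonly used form of classifier-free guidance). It does not implement the paper's characteristic guidance correction.

```python
def linear_guidance(eps_cond, eps_uncond, scale):
    """Linearly combine two noise predictions (hypothetical helper).

    eps_cond   -- noise predicted with the condition (e.g. a prompt)
    eps_uncond -- noise predicted without the condition
    scale      -- guidance scale; 0 gives eps_uncond, 1 gives eps_cond
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    # Assumed convention: move from the unconditional prediction
    # towards (and, for scale > 1, beyond) the conditional one.
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    cond = [1.0, 2.0]
    uncond = [0.0, 1.0]
    for w in (0.0, 1.0, 3.0):
        print(w, linear_guidance(cond, uncond, w))
```

With `scale = 1.0` the result equals the conditional prediction; larger scales extrapolate beyond it, which is the regime where the abstract says non-linear effects become significant.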

artificial intelligence, guidance, machine learning, (14 more...)

arXiv: 2312.07586

Country: Asia > China (0.28)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Energy stable neural network for gradient flow equations

Fan, Ganghua, Jin, Tianyu, Lan, Yuan, Xiang, Yang, Zhang, Luchan

arXiv.org Artificial Intelligence, Sep-17-2023

Partial differential equations are important tools for solving a wide range of problems in science and engineering. Over the past twenty years, deep neural networks (DNNs) [12, 19] have demonstrated their power in science and engineering applications, and efforts have been made to employ DNNs to solve complex partial differential equations as an alternative to traditional numerical schemes, especially for problems in high dimensions. Early works [5, 17] use feedforward neural networks to learn the initial/boundary value problem by constraining the networks with the differential equation. Methods using continuous dynamical systems to model high-dimensional nonlinear functions used in machine learning were proposed in [6]. A deep learning-based approach to solving high-dimensional parabolic partial differential equations (PDEs), based on the formulation of stochastic differential equations, was developed in [14].

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv: 2309.10002

Country:

Asia > China (0.29)
North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas > Upstream (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Large Transformers are Better EEG Learners

Wang, Bingxin, Fu, Xiaowen, Lan, Yuan, Zhang, Luchan, Xiang, Yang

arXiv.org Artificial Intelligence, Aug-20-2023

Pre-trained large transformer models have achieved remarkable performance in the fields of natural language processing and computer vision. Since the amount of available labeled electroencephalogram (EEG) data is much lower than that of text and image data, it is difficult for transformer models pre-trained from EEG to be developed as large as GPT-4 100T to fully unleash the potential of this architecture. In this paper, we show that transformers pre-trained from images as well as text can be directly fine-tuned for EEG-based prediction tasks. We design AdaCE, plug-and-play Adapters for Converting EEG data into image as well as text forms, to fine-tune pre-trained vision and language transformers. The proposed AdaCE module is highly effective for fine-tuning pre-trained transformers while achieving state-of-the-art performance on diverse EEG-based prediction tasks. For example, AdaCE on the pre-trained Swin-Transformer achieves 99.6%, an absolute improvement of 9.2%, on the EEG-decoding task of human activity recognition (UCI HAR). Furthermore, we empirically show that applying the proposed AdaCE to fine-tune larger pre-trained models can achieve better performance on EEG-based prediction tasks, indicating the potential of our adapters for even larger transformers. The plug-and-play AdaCE module can be applied to fine-tuning most of the popular pre-trained transformers on many other multi-channel time-series data, not limited to EEG data and the models we use. Our code will be available at https://github.com/wangbxj1234/AdaCE.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2308.11654

Country: Asia > China (0.29)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DOSnet as a Non-Black-Box PDE Solver: When Deep Learning Meets Operator Splitting

Lan, Yuan, Li, Zhen, Sun, Jie, Xiang, Yang

arXiv.org Artificial Intelligence, Dec-11-2022

Deep neural networks (DNNs) recently emerged as a promising tool for analyzing and solving complex differential equations arising in science and engineering applications. As an alternative to traditional numerical schemes, learning-based solvers utilize the representation power of DNNs to approximate the input-output relations in an automated manner. However, the lack of physics-in-the-loop often makes it difficult to construct a neural network solver that simultaneously achieves high accuracy, low computational burden, and interpretability. In this work, focusing on a class of evolutionary PDEs characterized by having decomposable operators, we show that the classical "operator splitting" numerical scheme for solving these equations can be exploited to design neural network architectures. This gives rise to a learning-based PDE solver, which we name Deep Operator-Splitting Network (DOSnet). Such a non-black-box network design is constructed from the physical rules and operators governing the underlying dynamics, contains learnable parameters, and is thus more flexible than the standard operator splitting scheme. Once trained, it enables the fast solution of the same type of PDEs. To validate the special structure inside DOSnet, we take linear PDEs as the benchmark and give a mathematical explanation for the weight behavior. Furthermore, to demonstrate the advantages of our new AI-enhanced PDE solver, we train and validate it on several types of operator-decomposable differential equations. We also apply DOSnet to nonlinear Schrödinger equations (NLSE), which have important applications in signal processing for modern optical fiber transmission systems, and experimental results show that our model has better accuracy and lower computational complexity than numerical schemes and the baseline DNNs.
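To make the classical operator splitting idea mentioned in the abstract concrete, here is a minimal sketch on a toy problem, not on the PDEs or the network from the paper. It assumes the scalar logistic equation u' = u - u^2, splits it into a linear part (u' = u) and a nonlinear part (u' = -u^2) that each have an exact solution, and alternates them in a Strang (half, full, half) pattern. All function names are hypothetical, and nothing here is learned.

```python
import math


def strang_step(u, dt):
    """One Strang splitting step for u' = u - u**2 (toy example)."""
    u = u * math.exp(0.5 * dt)   # half step of the linear part u' = u
    u = u / (1.0 + u * dt)       # full step of the nonlinear part u' = -u**2
    u = u * math.exp(0.5 * dt)   # second half step of the linear part
    return u


def solve_split(u0, t_end, n_steps):
    """Advance u0 to time t_end using n_steps splitting steps."""
    dt = t_end / n_steps
    u = u0
    for _ in range(n_steps):
        u = strang_step(u, dt)
    return u


def exact_logistic(u0, t):
    """Closed-form solution of u' = u - u**2 with u(0) = u0."""
    e = math.exp(t)
    return u0 * e / (1.0 + u0 * (e - 1.0))


if __name__ == "__main__":
    approx = solve_split(0.1, 1.0, 200)
    exact = exact_logistic(0.1, 1.0)
    print("splitting:", approx, "exact:", exact, "error:", abs(approx - exact))
```

Each sub-step is solved exactly, so the only error comes from alternating the two parts. A learned solver in the spirit of the abstract would replace fixed sub-steps like these with layers that carry trainable parameters, but how that is done is specific to the paper and is not reproduced here.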

artificial intelligence, dosnet, machine learning, (18 more...)

doi: 10.1016/j.jcp.2023.112343

arXiv: 2212.05571

Country:

Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.88)

Industry:

Energy (0.67)
Transportation > Air (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Feature Flow Regularization: Improving Structured Sparsity in Deep Neural Networks

Wu, Yue, Lan, Yuan, Zhang, Luchan, Xiang, Yang

arXiv.org Artificial Intelligence, Oct-7-2021

Pruning is a model compression method that removes redundant parameters in deep neural networks (DNNs) while maintaining accuracy. Most available filter pruning methods require complex treatments such as iterative pruning, feature statistics/ranking, or additional optimization designs in the training process. In this paper, we propose a simple and effective regularization strategy from a new perspective, the evolution of features, which we call feature flow regularization (FFR), for improving structured sparsity and filter pruning in DNNs. Specifically, FFR imposes controls on the gradient and curvature of the feature flow along the neural network, which implicitly increases the sparsity of the parameters. The principle behind FFR is that coherent and smooth evolution of features will lead to an efficient network that avoids redundant parameters. The high structured sparsity obtained from FFR enables us to prune filters effectively. Experiments with VGGNets and ResNets on the CIFAR-10/100 and Tiny ImageNet datasets demonstrate that FFR can significantly improve both unstructured and structured sparsity. Our pruning results in terms of reduction of parameters and FLOPs are comparable to or even better than those of state-of-the-art pruning methods.
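The abstract describes controlling the gradient and curvature of the feature flow along the network but does not give the formula. The sketch below shows one way such a penalty could look, assuming the features of successive layers are vectors of equal length and approximating gradient and curvature by first and second finite differences across layers. The function names, the squared-norm choice, and the weights `k_grad` and `k_curv` are assumptions for illustration and may differ from the paper's definition.

```python
def squared_norm(vec):
    """Sum of squared entries of a vector given as a list of floats."""
    return sum(x * x for x in vec)


def feature_flow_penalty(features, k_grad=1.0, k_curv=1.0):
    """Hypothetical feature-flow penalty over a list of layer features.

    features -- list of equal-length vectors, one per layer, in order
    k_grad   -- weight of the first-difference ("gradient") term
    k_curv   -- weight of the second-difference ("curvature") term
    """
    # First differences between consecutive layers.
    grad = 0.0
    for prev, cur in zip(features, features[1:]):
        grad += squared_norm([c - p for p, c in zip(prev, cur)])

    # Second differences over each triple of consecutive layers.
    curv = 0.0
    for prev, cur, nxt in zip(features, features[1:], features[2:]):
        curv += squared_norm(
            [n - 2.0 * c + p for p, c, n in zip(prev, cur, nxt)]
        )

    return k_grad * grad + k_curv * curv


if __name__ == "__main__":
    smooth = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # straight-line flow
    kinked = [[0.0], [1.0], [0.0]]                 # flow that reverses
    print(feature_flow_penalty(smooth), feature_flow_penalty(kinked))
```

In training, a term like this would be added to the task loss so that features evolve smoothly from layer to layer; the straight-line example has zero curvature penalty, while the reversing one is penalized by both terms.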

artificial intelligence, ffr, machine learning, (18 more...)

doi: 10.1016/j.neunet.2023.02.013

arXiv: 2106.02914

Country: North America > United States (0.46)

Genre: Research Report (0.83)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
