Collaborating Authors: Chen, Chun


OpenGSL: A Comprehensive Benchmark for Graph Structure Learning

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have emerged as the de facto standard for representation learning on graphs, owing to their ability to effectively integrate graph topology and node attributes. However, the inherent suboptimal nature of node connections, resulting from the complex and contingent formation process of graphs, presents significant challenges in modeling them effectively. To tackle this issue, Graph Structure Learning (GSL), a family of data-centric learning approaches, has garnered substantial attention in recent years. The core concept behind GSL is to jointly optimize the graph structure and the corresponding GNN models. Despite the proposal of numerous GSL methods, the progress in this field remains unclear due to inconsistent experimental protocols, including variations in datasets, data processing techniques, and splitting strategies. In this paper, we introduce OpenGSL, the first comprehensive benchmark for GSL, aimed at addressing this gap. OpenGSL enables a fair comparison among state-of-the-art GSL methods by evaluating them across various popular datasets using uniform data processing and splitting strategies. Through extensive experiments, we observe that existing GSL methods do not consistently outperform vanilla GNN counterparts. We also find no significant correlation between the homophily of the learned structure and task performance, challenging the common belief. Moreover, we observe that the learned graph structure generalizes well across different GNN models, despite the high computational and memory cost of learning it. We hope that our open-sourced library will facilitate rapid and equitable evaluation and inspire further innovative research in this field. The code of the benchmark is available at https://github.com/OpenGSL/OpenGSL.
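
To make the core GSL recipe concrete, the sketch below jointly optimizes a learnable adjacency matrix and GCN weights under a single task loss. It is a minimal illustration of the idea, not code from the OpenGSL library; the two-layer GCN, the sigmoid-parameterized dense adjacency, and all names are our own assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointGSL(nn.Module):
    """Minimal sketch: learn a dense adjacency jointly with GCN weights."""
    def __init__(self, n_nodes, in_dim, hid_dim, n_classes):
        super().__init__()
        # Free parameters for the graph structure (symmetrized below).
        self.adj_logits = nn.Parameter(torch.zeros(n_nodes, n_nodes))
        self.w1 = nn.Linear(in_dim, hid_dim)
        self.w2 = nn.Linear(hid_dim, n_classes)

    def learned_adj(self):
        a = torch.sigmoid(self.adj_logits)
        a = (a + a.t()) / 2                    # keep the graph undirected
        deg = a.sum(1).clamp(min=1e-6)
        # Symmetric normalization D^{-1/2} A D^{-1/2}.
        return a / deg.sqrt().unsqueeze(1) / deg.sqrt().unsqueeze(0)

    def forward(self, x):
        a = self.learned_adj()
        h = F.relu(a @ self.w1(x))             # GCN layer 1 on the learned graph
        return a @ self.w2(h)                  # GCN layer 2

# Usage: one task loss updates structure and model together.
model = JointGSL(n_nodes=100, in_dim=16, hid_dim=32, n_classes=4)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x, y = torch.randn(100, 16), torch.randint(0, 4, (100,))
F.cross_entropy(model(x), y).backward()
opt.step()
```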


Fast ODE-based Sampling for Diffusion Models in Around 5 Steps

arXiv.org Artificial Intelligence

Sampling from diffusion models can be treated as solving the corresponding ordinary differential equations (ODEs), with the aim of obtaining an accurate solution with as few function evaluations (NFE) as possible. Recently, various fast samplers utilizing higher-order ODE solvers have emerged and achieved better performance than the initial first-order one. However, these numerical methods inherently incur approximation errors, which significantly degrade sample quality at extremely small NFE (e.g., around 5). In contrast, based on the geometric observation that each sampling trajectory almost lies in a two-dimensional subspace embedded in the ambient space, we propose the Approximate MEan-Direction Solver (AMED-Solver), which eliminates truncation errors by directly learning the mean direction for fast diffusion sampling. Moreover, our method can easily be used as a plugin to further improve existing ODE-based samplers. Extensive experiments on image synthesis at resolutions ranging from 32 to 256 demonstrate the effectiveness of our method. With only 5 NFE, we achieve 7.14 FID on CIFAR-10, 13.75 FID on ImageNet 64$\times$64, and 12.79 FID on LSUN Bedroom. Our code is available at https://github.com/zhyzhouu/amed-solver.
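
For context, the sketch below shows the first-order (Euler) ODE sampler that higher-order and learned solvers improve upon, with a comment marking where AMED's learned mean direction would plug in. This is a generic variance-exploding ODE sketch, not the authors' implementation; the stand-in denoiser is exact only for unit-Gaussian data.

```python
import torch

def euler_sampler(denoise, x, sigmas):
    """First-order ODE sampler for a VE diffusion model (minimal sketch).

    `denoise(x, sigma)` is a placeholder for a trained denoiser; `sigmas`
    is a decreasing noise schedule ending near 0.
    """
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        d = (x - denoise(x, sigma)) / sigma      # ODE direction dx/dsigma
        # AMED's idea (sketched): instead of this endpoint direction, a small
        # learned network picks where inside [sigma_next, sigma] to evaluate
        # so one evaluation approximates the *mean* direction of the segment,
        # shrinking truncation error at tiny NFE.
        x = x + (sigma_next - sigma) * d         # Euler update
    return x

# Usage with a stand-in denoiser (optimal only for unit-Gaussian data).
denoise = lambda x, s: x / (1 + s**2)
samples = euler_sampler(denoise, torch.randn(4, 3, 32, 32) * 80.0,
                        torch.tensor([80.0, 20.0, 5.0, 1.0, 0.2, 0.002]))
```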


A Geometric Perspective on Diffusion Models

arXiv.org Machine Learning

Recent years have witnessed significant progress in developing effective training and fast sampling techniques for diffusion models. A remarkable advancement is the use of stochastic differential equations (SDEs) and their marginal-preserving ordinary differential equations (ODEs) to describe data perturbation and generative modeling in a unified framework. In this paper, we carefully inspect the ODE-based sampling of a popular variance-exploding SDE and reveal several intriguing structures of its sampling dynamics. We discover that the data distribution and the noise distribution are smoothly connected with a quasi-linear sampling trajectory and another implicit denoising trajectory that even converges faster. Meanwhile, the denoising trajectory governs the curvature of the corresponding sampling trajectory and its various finite differences yield all second-order samplers used in practice. Furthermore, we establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm, with which we can characterize the asymptotic behavior of diffusion models and identify the empirical score deviation.
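
The claimed connection to mean-shift can be made concrete: for an empirical data set smoothed by Gaussian noise of level $\sigma$, the optimal score at a point $x$ is exactly the mean-shift vector divided by $\sigma^2$. The snippet below computes this quantity; the variable names and toy data are our own.

```python
import numpy as np

def empirical_score(x, data, sigma):
    """Score of the data set smoothed by N(0, sigma^2 I) at point x.

    For the Gaussian-smoothed empirical distribution, the optimal score is
    (m(x) - x) / sigma^2, where m(x) is the kernel-weighted mean of the data
    points: exactly the classic mean-shift (mode-seeking) vector, which is
    the relationship the paper formalizes.
    """
    sq = ((data - x) ** 2).sum(axis=1)
    w = np.exp(-(sq - sq.min()) / (2 * sigma**2))  # stabilized Gaussian weights
    m = (w[:, None] * data).sum(0) / w.sum()       # mean-shift target m(x)
    return (m - x) / sigma**2

# Toy check: the score points from x toward the nearby data mode.
data = np.vstack([np.random.randn(100, 2) - 4, np.random.randn(100, 2) + 4])
print(empirical_score(np.array([3.0, 3.0]), data, sigma=1.0))
```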


CDR: Conservative Doubly Robust Learning for Debiased Recommendation

arXiv.org Artificial Intelligence

In recommendation systems (RS), user behavior data is observational rather than experimental, resulting in widespread bias in the data. Consequently, tackling bias has emerged as a major challenge in the field of recommendation systems. Recently, Doubly Robust Learning (DR) has gained significant attention due to its remarkable performance and robustness properties. However, our experimental findings indicate that existing DR methods are severely impacted by the presence of so-called Poisonous Imputation, where the imputation significantly deviates from the truth and becomes counterproductive. To address this issue, this work proposes the Conservative Doubly Robust strategy (CDR), which filters imputations by scrutinizing their mean and variance. Theoretical analyses show that CDR offers reduced variance and improved tail bounds. In addition, our experimental investigations illustrate that CDR significantly enhances performance and can indeed reduce the frequency of poisonous imputations.
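
As a rough illustration of the conservative filtering idea, the toy function below keeps an imputed error only when it passes a simple mean-and-spread check; the specific threshold and the fallback value are our own assumptions, not the paper's exact rule.

```python
import numpy as np

def conservative_filter(imputed_errors, observed_errors):
    """Toy sketch of CDR-style filtering (our assumptions, not the paper's
    exact rule): keep an imputed error only if it stays within two standard
    deviations of the observed-error mean, and fall back to that mean for
    "poisonous" imputations that deviate far from the truth.
    """
    mu, sd = observed_errors.mean(), observed_errors.std()
    keep = np.abs(imputed_errors - mu) <= 2 * sd      # mean/spread check
    return np.where(keep, imputed_errors, mu)         # conservative fallback

observed = np.array([0.20, 0.30, 0.25, 0.28])
imputed = np.array([0.26, 0.90, 0.24, -0.50])         # two poisonous values
print(conservative_filter(imputed, observed))         # poisonous ones replaced
```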


Graph Neural Networks-based Hybrid Framework For Predicting Particle Crushing Strength

arXiv.org Artificial Intelligence

Graph Neural Networks have emerged as an effective machine learning tool for multi-disciplinary tasks such as pharmaceutical molecule classification and chemical reaction prediction, because they can model non-Euclidean relationships between different entities. Particle crushing, a significant topic in civil engineering, describes the breakage of granular materials through the failure of bonds between particle fragments, as modeled by numerical simulations; this motivates us to characterize the mechanical behavior of particle crushing through the connectivity of particle fragments with Graph Neural Networks (GNNs). However, no open-source large-scale particle crushing dataset is available for research, owing to the expensive costs of laboratory tests and numerical simulations. Therefore, we first generate a dataset with 45,000 numerical simulations and 900 particle types to facilitate the research progress of machine learning for particle crushing. Second, we devise a hybrid framework based on GNNs to predict particle crushing strength from a particle-fragment view, building on state-of-the-art GNNs. Finally, we compare our hybrid framework against traditional machine learning methods and a plain MLP to verify its effectiveness. The usefulness of different features is further discussed through gradient attribution explanations w.r.t. the predictions. Our data and code are released at https://github.com/doujiang-zheng/GNN-For-Particle-Crushing.
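
The gradient attribution analysis mentioned above can be sketched with the common gradient-times-input heuristic; we assume this form purely for illustration, and the stand-in regressor below is not the paper's hybrid GNN framework.

```python
import torch

def gradient_attribution(model, x):
    """Gradient-x-input attribution (a common choice, assumed here as an
    illustration of 'gradient attribution', not the paper's exact method):
    how much each input feature moves the predicted crushing strength.
    """
    x = x.clone().requires_grad_(True)
    y = model(x).sum()            # scalar prediction (summed over the batch)
    y.backward()
    return (x.grad * x).detach()  # per-feature contribution scores

# Usage with a stand-in regressor over particle-fragment features.
model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 1))
print(gradient_attribution(model, torch.randn(4, 8)))
```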


Temporal Aggregation and Propagation Graph Neural Networks for Dynamic Representation

arXiv.org Artificial Intelligence

Temporal graphs exhibit dynamic interactions between nodes over continuous time, and their topologies evolve as time elapses. The whole temporal neighborhood of a node reveals its varying preferences. However, previous works usually generate dynamic representations from a limited number of neighbors for simplicity, which results in both inferior performance and high latency for online inference. Therefore, in this paper, we propose a novel method for temporal graph convolution over the whole neighborhood, namely Temporal Aggregation and Propagation Graph Neural Networks (TAP-GNN). Specifically, we first analyze the computational complexity of the dynamic representation problem by unfolding the temporal graph in a message-passing paradigm. This expensive complexity motivates us to design the AP (aggregation and propagation) block, which significantly reduces the repeated computation over historical neighbors. The final TAP-GNN supports online inference in the graph-stream scenario, incorporating temporal information into node embeddings with a temporal activation function and a projection layer in addition to several AP blocks. Experimental results on various real-life temporal networks show that our proposed TAP-GNN outperforms existing temporal graph methods by a large margin in terms of both predictive performance and online inference latency. Our code is available at \url{https://github.com/doujiang-zheng/TAP-GNN}.
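
A minimal sketch of the incremental idea behind the AP block follows: by maintaining a running aggregate of each node's historical neighbor messages, a new interaction updates the embedding in O(1) rather than re-aggregating the whole temporal neighborhood. This is our own simplification (mean aggregation, plain Python), not the TAP-GNN implementation.

```python
from collections import defaultdict

class RunningNeighborhood:
    """Keep a running sum/count of each node's historical neighbor messages
    so that one new interaction is folded in without revisiting history.
    """
    def __init__(self, dim):
        self.sum = defaultdict(lambda: [0.0] * dim)
        self.count = defaultdict(int)

    def add_event(self, node, neighbor_msg):
        s = self.sum[node]
        for i, v in enumerate(neighbor_msg):    # aggregate: fold in one message
            s[i] += v
        self.count[node] += 1

    def embedding(self, node):
        c = max(self.count[node], 1)
        return [v / c for v in self.sum[node]]  # propagate: mean over history

agg = RunningNeighborhood(dim=3)
agg.add_event("u", [1.0, 0.0, 2.0])
agg.add_event("u", [3.0, 2.0, 0.0])
print(agg.embedding("u"))   # [2.0, 1.0, 1.0]
```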


Robust Sequence Networked Submodular Maximization

arXiv.org Artificial Intelligence

In this paper, we study the Robust optimization for sequence Networked submodular maximization (RoseNets) problem. We interweave robust optimization with sequence networked submodular maximization. The elements are connected by a directed acyclic graph, and the objective function is submodular not on the elements but on the edges of the graph. In such a networked submodular scenario, the impact of removing an element from a sequence depends on both its position in the sequence and its position in the network, which makes existing robust algorithms inapplicable. We take the first step toward studying the RoseNets problem and design a robust greedy algorithm that is robust against the removal of an arbitrary subset of the selected elements. The approximation ratio of the algorithm depends on both the number of removed elements and the network topology. We further conduct experiments on real applications of recommendation and link prediction. The experimental results demonstrate the effectiveness of the proposed algorithm.
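
To illustrate the setting, the toy greedy below scores a candidate element by the edges from already selected elements to it, so an element's worth depends on its position in the sequence and in the network. This is our own illustration of the objective, not the paper's robust RoseNets algorithm, which additionally hedges against adversarial removals.

```python
def greedy_sequence(elements, edge_score, k):
    """Toy greedy for a sequence objective defined on DAG edges: appending
    `e` is worth the total score of edges from already selected elements
    to `e` (our illustration, not the RoseNets algorithm).
    """
    seq = []
    for _ in range(k):
        candidates = [e for e in elements if e not in seq]
        # Marginal gain of e given the current sequence prefix.
        gain = lambda e: sum(edge_score.get((s, e), 0.0) for s in seq)
        seq.append(max(candidates, key=gain))
    return seq

edges = {("a", "b"): 2.0, ("a", "c"): 1.0, ("b", "c"): 3.0}
print(greedy_sequence(["a", "b", "c"], edges, k=3))   # ['a', 'b', 'c']
```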


Online Adversarial Distillation for Graph Neural Networks

arXiv.org Artificial Intelligence

Knowledge distillation has recently become a popular technique for improving the generalization ability of convolutional neural networks. However, its effect on graph neural networks is less than satisfactory, since graph topology and node attributes are likely to change dynamically, in which case a static teacher model is insufficient to guide student training. In this paper, we tackle this challenge by simultaneously training a group of graph neural networks in an online distillation fashion, where the group knowledge serves as a dynamic virtual teacher and structural changes in the graph neural networks are effectively captured. To improve distillation performance, two types of knowledge are transferred among the students so that they enhance each other: local knowledge, reflecting information in the graph topology and node attributes, and global knowledge, reflecting the predictions over classes. We transfer the global knowledge with KL-divergence, as vanilla knowledge distillation does, while exploiting the complicated structure of the local knowledge with an efficient adversarial cyclic learning framework. Extensive experiments verify the effectiveness of our proposed online adversarial distillation approach.
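
A minimal sketch of the global-knowledge transfer follows: each student matches the group-averaged soft predictions with KL-divergence. The group-average form of the virtual teacher and the temperature value are our assumptions, and the adversarial cyclic transfer of local knowledge is omitted.

```python
import torch
import torch.nn.functional as F

def group_kl_loss(student_logits, tau=2.0):
    """Online-distillation signal for the *global* knowledge: every student
    is pulled toward the group's averaged soft predictions, which act as a
    dynamic virtual teacher (a sketch under our assumptions).
    """
    probs = [F.softmax(z / tau, dim=-1) for z in student_logits]
    teacher = torch.stack(probs).mean(0).detach()    # group knowledge as teacher
    return sum(F.kl_div(F.log_softmax(z / tau, dim=-1), teacher,
                        reduction="batchmean") for z in student_logits)

# Usage: three student GNNs' class logits on the same nodes.
logits = [torch.randn(10, 4, requires_grad=True) for _ in range(3)]
group_kl_loss(logits).backward()
```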


Exploring the Connection between Knowledge Distillation and Logits Matching

arXiv.org Artificial Intelligence

Knowledge distillation is a generalized logits matching technique for model compression. Their equivalence was previously established under the conditions of $\textit{infinity temperature}$ and $\textit{zero-mean normalization}$. In this paper, we prove that with only $\textit{infinity temperature}$, the effect of knowledge distillation equals that of logits matching with an extra regularization term. Furthermore, we reveal that a weaker condition, $\textit{equal-mean initialization}$ rather than the original $\textit{zero-mean normalization}$, already suffices to establish the equivalence. The key to our proof is the observation that in modern neural networks with the cross-entropy loss and softmax activation, the mean of the back-propagated gradient on the logits always remains zero.
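
The key observation is easy to verify numerically: with softmax activation and cross-entropy loss, the gradient on the logits is softmax(z) minus the one-hot label, and since both terms sum to one, the gradient entries of each example sum to zero.

```python
import torch
import torch.nn.functional as F

# Numeric check of the key observation: the logit gradient of cross-entropy
# with softmax is (softmax(z) - onehot(y)), so its mean over classes is zero.
z = torch.randn(5, 10, requires_grad=True)
y = torch.randint(0, 10, (5,))
F.cross_entropy(z, y).backward()
print(z.grad.sum(dim=1))   # ~0 for every example (up to float error)
```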


Cross-Layer Distillation with Semantic Calibration

arXiv.org Artificial Intelligence

Recently proposed knowledge distillation approaches based on feature-map transfer validate that intermediate layers of a teacher model can serve as effective targets for training a student model to obtain better generalization ability. Existing studies mainly focus on particular representation forms for knowledge transfer between manually specified pairs of teacher-student intermediate layers. However, the semantics of intermediate layers may vary across networks, and manual association of layers can lead to negative regularization caused by semantic mismatch between certain teacher-student layer pairs. To address this problem, we propose Semantic Calibration for Cross-layer Knowledge Distillation (SemCKD), which automatically assigns proper target layers of the teacher model to each student layer with an attention mechanism. With the learned attention distribution, each student layer distills knowledge contained in multiple teacher layers, rather than a single fixed intermediate layer, for appropriate cross-layer supervision during training. Consistent improvements over state-of-the-art approaches are observed in extensive experiments with various network architectures for teacher and student models, demonstrating the effectiveness and flexibility of the proposed attention-based soft layer association mechanism for cross-layer distillation.
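
A minimal sketch of attention-based soft layer association follows: a student layer attends over all teacher layers and distills from their weighted mixture instead of one manually paired layer. The dot-product attention form, the matching feature dimensions, and the MSE loss are our simplifications, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def cross_layer_attention(student_feat, teacher_feats):
    """One student layer attends over all teacher layers, then distills from
    their attention-weighted mixture (a sketch of the soft association idea).
    """
    s = student_feat.flatten(1)                                  # (B, D)
    ts = [t.flatten(1) for t in teacher_feats]                   # L x (B, D)
    scores = torch.stack([(s * t).sum(-1) for t in ts], dim=-1)  # (B, L)
    attn = F.softmax(scores / s.shape[1] ** 0.5, dim=-1)         # per-sample weights
    # Weighted mixture of teacher layers as the distillation target.
    target = sum(attn[:, i:i + 1] * ts[i] for i in range(len(ts)))
    return F.mse_loss(s, target.detach())    # teacher side is a fixed target

# Usage: one student feature map vs. four teacher layers (flattened).
student = torch.randn(8, 64, requires_grad=True)
teachers = [torch.randn(8, 64) for _ in range(4)]
cross_layer_attention(student, teachers).backward()
```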