AITopics | Zhou, Kuangqi

Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning

Shi, Yujun, Zhou, Kuangqi, Liang, Jian, Jiang, Zihang, Feng, Jiashi, Torr, Philip, Bai, Song, Tan, Vincent Y. F.

arXiv.org Artificial Intelligence, Apr-7-2024

Class Incremental Learning (CIL) aims at learning a multi-class classifier in a phase-by-phase manner, in which only data from a subset of the classes are provided at each phase. Previous works mainly focus on mitigating forgetting in phases after the initial one. However, we find that improving CIL at its initial phase is also a promising direction. Specifically, we experimentally show that directly encouraging the CIL learner at the initial phase to output representations similar to those of the model jointly trained on all classes can greatly boost CIL performance. Motivated by this, we study the difference between a naïvely trained initial-phase model and the oracle model. Since one major difference between these two models is the number of training classes, we investigate how this difference affects the model representations. We find that, with fewer training classes, the data representations of each class lie in a long and narrow region; with more training classes, the representations of each class scatter more uniformly. Inspired by this observation, we propose Class-wise Decorrelation (CwD), which effectively regularizes the representations of each class to scatter more uniformly, thus mimicking the model jointly trained with all classes (i.e., the oracle model). Our CwD is simple to implement and easy to plug into existing methods. Extensive experiments on various benchmark datasets show that CwD consistently and significantly improves the performance of existing state-of-the-art methods by around 1% to 3%. Code will be released.

artificial intelligence, machine learning, representation, (14 more...)

arXiv:2112.04731

Country:

Asia > Singapore (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
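
The abstract above says that CwD regularizes the representations of each class to scatter more uniformly, but it does not give the formula. The following is a minimal sketch under an assumption: the penalty is taken to be the squared Frobenius norm of each class's feature correlation matrix, averaged over classes. The function name, the eps parameter, and the 1/d^2 scaling are illustrative choices, not details confirmed by this listing.

```python
import numpy as np


def classwise_decorrelation_penalty(features, labels, eps=1e-8):
    """Average squared Frobenius norm of per-class correlation matrices.

    features: (n_samples, d) array of representations.
    labels:   (n_samples,) array of integer class labels.
    This form is an assumption made for illustration; the abstract does
    not specify the exact regularizer.
    """
    d = features.shape[1]
    penalties = []
    for c in np.unique(labels):
        z = features[labels == c]
        if z.shape[0] < 2:
            continue  # a correlation matrix needs at least two samples
        # Standardize each feature dimension within the class.
        z = (z - z.mean(axis=0)) / (z.std(axis=0) + eps)
        corr = z.T @ z / z.shape[0]  # (d, d) correlation matrix
        penalties.append(np.sum(corr ** 2) / d ** 2)
    return float(np.mean(penalties)) if penalties else 0.0


# Usage: a "long and narrow" class versus a uniformly scattered one.
rng = np.random.default_rng(0)
t = rng.normal(size=(500, 1))
narrow = t * np.array([[1.0, 2.0, 3.0, 4.0]]) + 0.01 * rng.normal(size=(500, 4))
uniform = rng.normal(size=(500, 4))
same_class = np.zeros(500, dtype=int)
print(classwise_decorrelation_penalty(narrow, same_class))
print(classwise_decorrelation_penalty(uniform, same_class))
```

Under this assumed form the penalty lies roughly between 1/d and 1: it approaches 1 when a class's features are almost perfectly correlated (the narrow case) and approaches 1/d when they are uncorrelated, so minimizing it pushes each class toward a more uniform scatter.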

Towards Better Laplacian Representation in Reinforcement Learning with Generalized Graph Drawing

Wang, Kaixin, Zhou, Kuangqi, Zhang, Qixin, Shao, Jie, Hooi, Bryan, Feng, Jiashi

arXiv.org Artificial Intelligence, Jul-12-2021

The Laplacian representation has recently gained increasing attention in reinforcement learning because it provides a succinct and informative representation of states, obtained by taking the eigenvectors of the Laplacian matrix of the state-transition graph as state embeddings. Such a representation captures the geometry of the underlying state space and is beneficial to RL tasks such as option discovery and reward shaping. To approximate the Laplacian representation in large (or even continuous) state spaces, recent works propose to minimize a spectral graph drawing objective, which, however, has infinitely many global minimizers other than the eigenvectors. As a result, their learned Laplacian representation may differ from the ground truth. To solve this problem, we reformulate the graph drawing objective into a generalized form and derive a new learning objective, which is proved to have the eigenvectors as its unique global minimizer. It enables learning high-quality Laplacian representations that faithfully approximate the ground truth. We validate this via comprehensive experiments on a set of gridworld and continuous control environments. Moreover, we show that our learned Laplacian representations lead to more exploratory options and better reward shaping.

artificial intelligence, reinforcement learning, representation, (14 more...)

arXiv:2107.05545

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
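
To make the claim about non-unique minimizers in the abstract above concrete, here is a small sketch that computes the exact Laplacian representation for a hypothetical 6-state chain and checks that a rotated copy of the embedding attains the same value of the standard graph drawing objective, trace(U^T L U) with orthonormal columns. The chain environment and all function names are assumptions made for illustration; this is not the paper's generalized objective or its learning procedure.

```python
import numpy as np


def chain_laplacian(n_states):
    """Graph Laplacian L = D - A of an undirected chain of states."""
    adj = np.zeros((n_states, n_states))
    idx = np.arange(n_states - 1)
    adj[idx, idx + 1] = 1.0
    adj[idx + 1, idx] = 1.0
    return np.diag(adj.sum(axis=1)) - adj


def laplacian_representation(lap, dim):
    """Eigenvectors for the `dim` smallest eigenvalues, as state embeddings."""
    eigvals, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return eigvals[:dim], eigvecs[:, :dim]


def graph_drawing_objective(lap, emb):
    """Standard spectral graph drawing objective trace(U^T L U)."""
    return float(np.trace(emb.T @ lap @ emb))


lap = chain_laplacian(6)
eigvals, emb = laplacian_representation(lap, dim=2)

# Rotate the 2-D embedding: still orthonormal, same objective value,
# but the columns are no longer the eigenvectors themselves.
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
rotated = emb @ rot

print(graph_drawing_objective(lap, emb), graph_drawing_objective(lap, rotated))
```

Any rotation angle gives another embedding with the same objective value, which is why minimizing this objective alone may not recover the eigenvectors.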

Effective Training Strategies for Deep Graph Neural Networks

Zhou, Kuangqi, Dong, Yanfei, Lee, Wee Sun, Hooi, Bryan, Xu, Huan, Feng, Jiashi

arXiv.org Machine Learning, Jun-12-2020

Graph Neural Networks (GNNs) tend to suffer performance degradation as model depth increases, which previous works usually attribute to the oversmoothing problem. However, we find that although oversmoothing is a contributing factor, the main reasons for this phenomenon are training difficulty and overfitting, which we study by experimentally investigating Graph Convolutional Networks (GCNs), a representative GNN architecture. We find that training difficulty is caused by vanishing gradients and can be solved by adding residual connections. More importantly, overfitting is the major obstacle for deep GCNs and cannot be effectively solved by existing regularization techniques. Deep GCNs also suffer from training instability, which slows down the training process. To address overfitting and training instability, we propose Node Normalization (NodeNorm), which normalizes each node using its own statistics during model training. The proposed NodeNorm regularizes deep GCNs by discouraging feature-wise correlation of hidden embeddings and increasing model smoothness with respect to input node features, and thus effectively reduces overfitting. Additionally, it stabilizes the training process and hence speeds up training. Extensive experiments demonstrate that our NodeNorm method generalizes well to other GNN architectures, enabling deep GNNs to compete with and even outperform shallow ones.

artificial intelligence, neural network, nodenorm, (15 more...)

arXiv:2006.07107

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
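
The abstract above describes two ingredients: residual connections, and NodeNorm, which normalizes each node using its own statistics. The sketch below shows one way these could fit together in a single GCN layer. Several details are assumptions not stated in this listing: that NodeNorm divides each node's hidden vector by the standard deviation of its own features (without subtracting the mean), the optional power exponent, and the placement of the normalization before the graph convolution. All function names are illustrative.

```python
import numpy as np


def node_norm(hidden, power=1.0, eps=1e-6):
    """Scale each node's feature vector by its own standard deviation.

    hidden: (n_nodes, d) array. The scale-only form and the `power`
    exponent are assumptions made for illustration.
    """
    std = hidden.std(axis=1, keepdims=True)  # one statistic per node
    return hidden / (std + eps) ** (1.0 / power)


def normalized_adjacency(adj):
    """Symmetrically normalized adjacency with self-loops, as used in GCNs."""
    a_hat = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]


def residual_gcn_layer(hidden, norm_adj, weight):
    """One GCN layer with NodeNorm on the input and a residual connection."""
    update = np.maximum(norm_adj @ node_norm(hidden) @ weight, 0.0)  # ReLU
    return hidden + update  # the skip path keeps gradients flowing in deep stacks


# Usage on a hypothetical 4-node path graph with 8 hidden features.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
hidden = rng.normal(size=(4, 8))
weight = 0.1 * rng.normal(size=(8, 8))
out = residual_gcn_layer(hidden, normalized_adjacency(adj), weight)
print(out.shape)
```

Because the residual connection adds the layer's input to its output, the weight matrix here is square so that the input and output dimensions match.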
