Xiao, Xi
Neural Architecture Searching for Facial Attributes-based Depression Recognition
Chen, Mingzhe, Xiao, Xi, Zhang, Bin, Liu, Xinyu, Lu, Runiu
Recent studies show that depression can be partially reflected from human facial attributes. Since facial attributes have various data structure and carry different information, existing approaches fail to specifically consider the optimal way to extract depression-related features from each of them, as well as investigates the best fusion strategy. In this paper, we propose to extend Neural Architecture Search (NAS) technique for designing an optimal model for multiple facial attributes-based depression recognition, which can be efficiently and robustly implemented in a small dataset. Our approach first conducts a warmer up step to the feature extractor of each facial attribute, aiming to largely reduce the search space and providing customized architecture, where each feature extractor can be either a Convolution Neural Networks (CNN) or Graph Neural Networks (GNN). Then, we conduct an end-to-end architecture search for all feature extractors and the fusion network, allowing the complementary depression cues to be optimally combined with less redundancy. The experimental results on AVEC 2016 dataset show that the model explored by our approach achieves breakthrough performance with 27\% and 30\% RMSE and MAE improvements over the existing state-of-the-art. In light of these findings, this paper provides solid evidences and a strong baseline for applying NAS to time-series data-based mental health analysis.
SimpleX: A Simple and Strong Baseline for Collaborative Filtering
Mao, Kelong, Zhu, Jieming, Wang, Jinpeng, Dai, Quanyu, Dong, Zhenhua, Xiao, Xi, He, Xiuqiang
Collaborative filtering (CF) is a widely studied research topic in recommender systems. The learning of a CF model generally depends on three major components, namely interaction encoder, loss function, and negative sampling. While many existing studies focus on the design of more powerful interaction encoders, the impacts of loss functions and negative sampling ratios have not yet been well explored. In this work, we show that the choice of loss function as well as negative sampling ratio is equivalently important. More specifically, we propose the cosine contrastive loss (CCL) and further incorporate it to a simple unified CF model, dubbed SimpleX. Extensive experiments have been conducted on 11 benchmark datasets and compared with 29 existing CF models in total. Surprisingly, the results show that, under our CCL loss and a large negative sampling ratio, SimpleX can surpass most sophisticated state-of-the-art models by a large margin (e.g., max 48.5% improvement in NDCG@20 over LightGCN). We believe that SimpleX could not only serve as a simple strong baseline to foster future research on CF, but also shed light on the potential research direction towards improving loss function and negative sampling.
Robust Model-based Reinforcement Learning for Autonomous Greenhouse Control
Zhang, Wanpeng, Cao, Xiaoyan, Yao, Yao, An, Zhicheng, Luo, Dijun, Xiao, Xi
Due to the high efficiency and less weather dependency, autonomous greenhouses provide an ideal solution to meet the increasing demand for fresh food. However, managers are faced with some challenges in finding appropriate control strategies for crop growth, since the decision space of the greenhouse control problem is an astronomical number. Therefore, an intelligent closed-loop control framework is highly desired to generate an automatic control policy. As a powerful tool for optimal control, reinforcement learning (RL) algorithms can surpass human beings' decision-making and can also be seamlessly integrated into the closed-loop control framework. However, in complex real-world scenarios such as agricultural automation control, where the interaction with the environment is time-consuming and expensive, the application of RL algorithms encounters two main challenges, i.e., sample efficiency and safety. Although model-based RL methods can greatly mitigate the efficiency problem of greenhouse control, the safety problem has not got too much attention. In this paper, we present a model-based robust RL framework for autonomous greenhouse control to meet the sample efficiency and safety challenges. Specifically, our framework introduces an ensemble of environment models to work as a simulator and assist in policy optimization, thereby addressing the low sample efficiency problem. As for the safety concern, we propose a sample dropout module to focus more on worst-case samples, which can help improve the adaptability of the greenhouse planting policy in extreme cases. Experimental results demonstrate that our approach can learn a more effective greenhouse planting policy with better robustness than existing methods.
Learnable Hypergraph Laplacian for Hypergraph Learning
Zhang, Jiying, Chen, Yuzhao, Xiao, Xi, Lu, Runiu, Xia, Shu-Tao
HyperGraph Convolutional Neural Networks (HGCNNs) have demonstrated their potential in modeling high-order relations preserved in graph structured data. However, most existing convolution filters are localized and determined by the pre-defined initial hypergraph topology, neglecting to explore implicit and long-ange relations in real-world data. In this paper, we propose the first learning-based method tailored for constructing adaptive hypergraph structure, termed HypERgrAph Laplacian aDaptor (HERALD), which serves as a generic plug-in-play module for improving the representational power of HGCNNs. Specifically, HERALD adaptively optimizes the adjacency relationship between hypernodes and hyperedges in an end-to-end manner and thus the task-aware hypergraph is learned. Furthermore, HERALD employs the self-attention mechanism to capture the non-local paired-nodes relation. Extensive experiments on various popular hypergraph datasets for node classification and graph classification tasks demonstrate that our approach obtains consistent and considerable performance enhancement, proving its effectiveness and generalization ability.
On Self-Distilling Graph Neural Network
Chen, Yuzhao, Bian, Yatao, Xiao, Xi, Rong, Yu, Xu, Tingyang, Huang, Junzhou
Recently, the teacher-student knowledge distillation framework has demonstrated its potential in training Graph Neural Networks (GNNs). However, due to the difficulty of training deep and wide GNN models, one can not always obtain a satisfactory teacher model for distillation. Furthermore, the inefficient training process of teacher-student knowledge distillation also impedes its applications in GNN models. In this paper, we propose the first teacher-free knowledge distillation framework for GNNs, termed GNN Self-Distillation (GNN-SD), that serves as a drop-in replacement for improving the training process of GNNs.We design three knowledge sources for GNN-SD: neighborhood discrepancy rate (NDR), compact graph embedding and intermediate logits. Notably, serving as a metric of the non-smoothness of the embedded graph, NDR empowers the transferability of knowledge that maintains high neighborhood discrepancy by enforcing consistency between consecutive GNN layers. We conduct exploring analysis to verify that our framework could improve the training dynamics and embedding quality of GNNs. Extensive experiments on various popular GNN models and datasets demonstrate that our approach obtains consistent and considerable performance enhancement, proving its effectiveness and generalization ability.
Inverse Graph Identification: Can We Identify Node Labels Given Graph Labels?
Bian, Tian, Xiao, Xi, Xu, Tingyang, Rong, Yu, Huang, Wenbing, Zhao, Peilin, Huang, Junzhou
Graph Identification (GI) has long been researched in graph learning and is essential in certain applications (e.g. social community detection). Specifically, GI requires to predict the label/score of a target graph given its collection of node features and edge connections. While this task is common, more complex cases arise in practice---we are supposed to do the inverse thing by, for example, grouping similar users in a social network given the labels of different communities. This triggers an interesting thought: can we identify nodes given the labels of the graphs they belong to? Therefore, this paper defines a novel problem dubbed Inverse Graph Identification (IGI), as opposed to GI. Upon a formal discussion of the variants of IGI, we choose a particular case study of node clustering by making use of the graph labels and node features, with an assistance of a hierarchical graph that further characterizes the connections between different graphs. To address this task, we propose Gaussian Mixture Graph Convolutional Network (GMGCN), a simple yet effective method that makes the node-level message passing process using Graph Attention Network (GAT) under the protocol of GI and then infers the category of each node via a Gaussian Mixture Layer (GML). The training of GMGCN is further boosted by a proposed consensus loss to take advantage of the structure of the hierarchical graph. Extensive experiments are conducted to test the rationality of the formulation of IGI. We verify the superiority of the proposed method compared to other baselines on several benchmarks we have built up. We will release our codes along with the benchmark data to facilitate more research attention to the IGI problem.
Self-Paced Probabilistic Principal Component Analysis for Data with Outliers
Zhao, Bowen, Xiao, Xi, Zhang, Wanpeng, Zhang, Bin, Xia, Shutao
Principal Component Analysis (PCA) is a popular tool for dimensionality reduction and feature extraction in data analysis. There is a probabilistic version of PCA, known as Probabilistic PCA (PPCA). However, standard PCA and PPCA are not robust, as they are sensitive to outliers. To alleviate this problem, this paper introduces the Self-Paced Learning mechanism into PPCA, and proposes a novel method called Self-Paced Probabilistic Principal Component Analysis (SP-PPCA). Furthermore, we design the corresponding optimization algorithm based on the alternative search strategy and the expectation-maximization algorithm. SP-PPCA looks for optimal projection vectors and filters out outliers iteratively. Experiments on both synthetic problems and real-world datasets clearly demonstrate that SP-PPCA is able to reduce or eliminate the impact of outliers.