Chen, Yanjie
Initializing Models with Larger Ones
Xu, Zhiqiu, Chen, Yanjie, Vishniakov, Kirill, Yin, Yida, Shen, Zhiqiang, Darrell, Trevor, Liu, Lingjie, Liu, Zhuang
Weight initialization plays an important role in neural network training. Widely used initialization methods were proposed and evaluated for networks that are trained from scratch. However, the growing number of pretrained models now offers new opportunities for tackling this classical problem of weight initialization. In this work, we introduce weight selection, a method for initializing smaller models by selecting a subset of weights from a pretrained larger model. This enables the transfer of knowledge from pretrained weights to smaller models. Our experiments demonstrate that weight selection can significantly enhance the performance of small models and reduce their training time. Notably, it can also be used together with knowledge distillation. Weight selection offers a new way to leverage the power of pretrained models in resource-constrained settings, and we hope it can be a useful tool for training small models in the large-model era. The initialization of neural network weights is crucial for their optimization. Proper initialization aids model convergence and prevents issues such as vanishing gradients. Two prominent initialization techniques, Xavier initialization (Glorot & Bengio, 2010) and Kaiming initialization (He et al., 2015), have played substantial roles in neural network training.
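The idea of initializing a smaller model from a subset of a larger model's weights can be illustrated with a minimal sketch. The abstract does not say which elements are selected, so the evenly spaced indexing used here is an assumption made for illustration, and the names `select_weights` and `init_from_larger` are hypothetical helpers, not the authors' code. NumPy arrays stand in for the weight tensors.

```python
import numpy as np


def select_weights(large, small_shape):
    """Pick a sub-tensor of shape `small_shape` from `large`.

    Assumption: elements are taken at evenly spaced indices along each
    axis. The abstract only says "a subset of weights" is selected.
    """
    if large.ndim != len(small_shape):
        raise ValueError("large tensor and target shape must have the same rank")
    selected = large
    for axis, (big, small) in enumerate(zip(large.shape, small_shape)):
        if small > big:
            raise ValueError("target dimension exceeds source dimension")
        # Evenly spaced indices: 0, big/small, 2*big/small, ... (floored).
        idx = np.arange(small) * big // small
        selected = np.take(selected, idx, axis=axis)
    return selected


def init_from_larger(large_state, small_shapes):
    """Build an initial state for the small model, one tensor per name.

    Assumes both models use the same parameter names (hypothetical setup).
    """
    return {
        name: select_weights(large_state[name], shape)
        for name, shape in small_shapes.items()
    }


# Toy usage: a 4x6 "pretrained" matrix initializes a 2x3 matrix.
large_state = {"fc.weight": np.arange(24, dtype=np.float32).reshape(4, 6)}
small_state = init_from_larger(large_state, {"fc.weight": (2, 3)})
```

The resulting `small_state` would then be loaded into the smaller model before ordinary training starts; the training loop itself is unchanged, which is consistent with the abstract's note that the method can be combined with knowledge distillation.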
Artificial Intelligence Enhanced Rapid and Efficient Diagnosis of Mycoplasma Pneumoniae Pneumonia in Children Patients
Pan, Chenglin, Yan, Kuan, Liu, Xiao, Chen, Yanjie, Luo, Yanyan, Li, Xiaoming, Nie, Zhenguo, Liu, Xinjun
Artificial intelligence methods are increasingly becoming a potentially powerful tool for the diagnosis and management of diseases. In this study, we used logistic regression (LR), decision tree (DT), gradient boosted decision tree (GBDT), support vector machine (SVM), and multilayer perceptron (MLP) models to rapidly diagnose Mycoplasma pneumoniae pneumonia (MPP) in pediatric patients. Classification was carried out after preprocessing the MPP dataset. GBDT gave the best performance, with an accuracy of 93.7%. In contrast to standard raw feature weighting, the feature importance takes the underlying correlation structure of the features into account. The most important feature for GBDT is "pulmonary infiltrates range" with a score of 0.5925, followed by "cough" (0.0953) and "pleural effusion" (0.0492). We publicly share our full implementation with the dataset and trained models at https://github.com/zhenguonie/2021_AI4MPP.
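The GBDT workflow described above (train a classifier, measure accuracy, read off feature importances) might look roughly like the following sketch with scikit-learn. This is not the authors' implementation, which is in the linked repository. The data here is synthetic and randomly generated purely so the example runs; only the three feature names are taken from the abstract, and the resulting numbers have no clinical meaning.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Feature names quoted from the abstract; the values below are synthetic.
feature_names = ["pulmonary infiltrates range", "cough", "pleural effusion"]

# Assumption: a random stand-in dataset, NOT the MPP dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, len(feature_names)))
# The synthetic label depends mostly on the first feature, by construction.
y = (X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=400) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a gradient boosted decision tree classifier with default settings.
model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

# Held-out accuracy and impurity-based feature importances.
accuracy = model.score(X_test, y_test)
importances = dict(zip(feature_names, model.feature_importances_))

for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.4f}")
print(f"accuracy: {accuracy:.3f}")
```

The `feature_importances_` attribute is normalized to sum to 1, which matches the scale of the scores reported in the abstract; whether the authors computed their scores this way is not stated there.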