Ding, Chris
K-means Derived Unsupervised Feature Selection using Improved ADMM
Sun, Ziheng, Ding, Chris, Fan, Jicong
JOURNAL OF L A T EX CLASS FILES, VOL. 18, NO. 9, SEPTEMBER 2020 1 K-means Derived Unsupervised Feature Selection using Improved ADMM Ziheng Sun, Chris Ding, and Jicong Fan Abstract --Feature selection is important for high-dimensional data analysis and is non-trivial in unsupervised learning problems such as dimensionality reduction and clustering. The goal of unsupervised feature selection is finding a subset of features such that the data points from different clusters are well separated. This paper presents a novel method called K-means Derived Unsupervised Feature Selection (K-means UFS). Unlike most existing spectral analysis based unsupervised feature selection methods, we select features using the objective of K-means. We develop an alternating direction method of multipliers (ADMM) to solve the NP-hard optimization problem of our K-means UFS model. Extensive experiments on real datasets show that our K-means UFS is more effective than the baselines in selecting features for clustering. I NTRODUCTION F EA TURE selection aims to select a subset among a large number of features and is particularly useful in dealing with high-dimensional data such as gene data in bioinformatics. The selected features should preserve the most important information of the data for downstream tasks such as classification and clustering. Many unsupervised feature selection methods have been proposed in the past decades.
Fuzzy K-Means Clustering without Cluster Centroids
Lu, Han, Li, Fangfang, Gao, Quanxue, Deng, Cheng, Ding, Chris, Wang, Qianqian
Fuzzy K-Means clustering is a critical technique in unsupervised data analysis. However, the performance of popular Fuzzy K-Means algorithms is sensitive to the selection of initial cluster centroids and is also affected by noise when updating mean cluster centroids. To address these challenges, this paper proposes a novel Fuzzy K-Means clustering algorithm that entirely eliminates the reliance on cluster centroids, obtaining membership matrices solely through distance matrix computation. This innovation enhances flexibility in distance measurement between sample points, thus improving the algorithm's performance and robustness. The paper also establishes theoretical connections between the proposed model and popular Fuzzy K-Means clustering techniques. Experimental results on several real datasets demonstrate the effectiveness of the algorithm.
Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning
Li, Longkang, Liang, Siyuan, Zhu, Zihao, Ding, Chris, Zha, Hongyuan, Wu, Baoyuan
The permutation flow shop scheduling (PFSS), aiming at finding the optimal permutation of jobs, is widely used in manufacturing systems. When solving large-scale PFSS problems, traditional optimization algorithms such as heuristics could hardly meet the demands of both solution accuracy and computational efficiency, thus learning-based methods have recently garnered more attention. Some work attempts to solve the problems by reinforcement learning methods, which suffer from slow convergence issues during training and are still not accurate enough regarding the solutions. To that end, we propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately. Moreover, in order to extract better feature representations of input jobs, we incorporate the graph structure as the encoder. The extensive experiments reveal that our proposed model obtains significant promotion and presents excellent generalizability in large-scale problems with up to 1000 jobs. Compared to the state-of-the-art reinforcement learning method, our model's network parameters are reduced to only 37\% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8\% to 1.3\% on average. The code is available at: \url{https://github.com/longkangli/PFSS-IL}.
Weighted Sparse Partial Least Squares for Joint Sample and Feature Selection
Min, Wenwen, Xu, Taosheng, Ding, Chris
Sparse Partial Least Squares (sPLS) is a common dimensionality reduction technique for data fusion, which projects data samples from two views by seeking linear combinations with a small number of variables with the maximum variance. However, sPLS extracts the combinations between two data sets with all data samples so that it cannot detect latent subsets of samples. To extend the application of sPLS by identifying a specific subset of samples and remove outliers, we propose an $\ell_\infty/\ell_0$-norm constrained weighted sparse PLS ($\ell_\infty/\ell_0$-wsPLS) method for joint sample and feature selection, where the $\ell_\infty/\ell_0$-norm constrains are used to select a subset of samples. We prove that the $\ell_\infty/\ell_0$-norm constrains have the Kurdyka-\L{ojasiewicz}~property so that a globally convergent algorithm is developed to solve it. Moreover, multi-view data with a same set of samples can be available in various real problems. To this end, we extend the $\ell_\infty/\ell_0$-wsPLS model and propose two multi-view wsPLS models for multi-view data fusion. We develop an efficient iterative algorithm for each multi-view wsPLS model and show its convergence property. As well as numerical and biomedical data experiments demonstrate the efficiency of the proposed methods.
MetaLR: Meta-tuning of Learning Rates for Transfer Learning in Medical Imaging
Chen, Yixiong, Liu, Li, Li, Jingxian, Jiang, Hua, Ding, Chris, Zhou, Zongwei
In medical image analysis, transfer learning is a powerful method for deep neural networks (DNNs) to generalize well on limited medical data. Prior efforts have focused on developing pre-training algorithms on domains such as lung ultrasound, chest X-ray, and liver CT to bridge domain gaps. However, we find that model fine-tuning also plays a crucial role in adapting medical knowledge to target tasks. The common fine-tuning method is manually picking transferable layers (e.g., the last few layers) to update, which is labor-expensive. In this work, we propose a meta-learning-based LR tuner, named MetaLR, to make different layers automatically co-adapt to downstream tasks based on their transferabilities across domains. MetaLR learns appropriate LRs for different layers in an online manner, preventing highly transferable layers from forgetting their medical representation abilities and driving less transferable layers to adapt actively to new domains. Extensive experiments on various medical applications show that MetaLR outperforms previous state-of-the-art (SOTA) fine-tuning strategies.
X-IQE: eXplainable Image Quality Evaluation for Text-to-Image Generation with Visual Large Language Models
Chen, Yixiong, Liu, Li, Ding, Chris
This paper introduces a novel explainable image quality evaluation approach called X-IQE, which leverages visual large language models (LLMs) to evaluate text-to-image generation methods by generating textual explanations. X-IQE utilizes a hierarchical Chain of Thought (CoT) to enable MiniGPT-4 to produce self-consistent, unbiased texts that are highly correlated with human evaluation. It offers several advantages, including the ability to distinguish between real and generated images, evaluate text-image alignment, and assess image aesthetics without requiring model training or fine-tuning. X-IQE is more cost-effective and efficient compared to human evaluation, while significantly enhancing the transparency and explainability of deep image quality evaluation models. We validate the effectiveness of our method as a benchmark using images generated by prevalent diffusion models. X-IQE demonstrates similar performance to state-of-the-art (SOTA) evaluation methods on COCO Caption, while overcoming the limitations of previous evaluation models on DrawBench, particularly in handling ambiguous generation prompts and text recognition in generated images. Project website: https://github.com/Schuture/Benchmarking-Awesome-Diffusion-Models
Rethinking Two Consensuses of the Transferability in Deep Learning
Chen, Yixiong, Li, Jingxian, Ding, Chris, Liu, Li
Deep transfer learning (DTL) has formed a long-term quest toward enabling deep neural networks (DNNs) to reuse historical experiences as efficiently as humans. This ability is named knowledge transferability. A commonly used paradigm for DTL is firstly learning general knowledge (pre-training) and then reusing (fine-tuning) them for a specific target task. There are two consensuses of transferability of pre-trained DNNs: (1) a larger domain gap between pre-training and downstream data brings lower transferability; (2) the transferability gradually decreases from lower layers (near input) to higher layers (near output). However, these consensuses were basically drawn from the experiments based on natural images, which limits their scope of application. This work aims to study and complement them from a broader perspective by proposing a method to measure the transferability of pre-trained DNN parameters. Our experiments on twelve diverse image classification datasets get similar conclusions to the previous consensuses. More importantly, two new findings are presented, i.e., (1) in addition to the domain gap, a larger data amount and huge dataset diversity of downstream target task also prohibit the transferability; (2) although the lower layers learn basic image features, they are usually not the most transferable layers due to their domain sensitivity.
Regularized Singular Value Decomposition and Application to Recommender System
Zheng, Shuai, Ding, Chris, Nie, Feiping
Singular value decomposition (SVD) is the mathematical basis of principal component analysis (PCA). Together, SVD and PCA are one of the most widely used mathematical formalism/decomposition in machine learning, data mining, pattern recognition, artificial intelligence, computer vision, signal processing, etc. In recent applications, regularization becomes an increasing trend. In this paper, we present a regularized SVD (RSVD), present an efficient computational algorithm, and provide several theoretical analysis. We show that although RSVD is non-convex, it has a closed-form global optimal solution. Finally, we apply RSVD to the application of recommender system and experimental result show that RSVD outperforms SVD significantly.
Minimal Support Vector Machine
Zheng, Shuai, Ding, Chris
Support Vector Machine (SVM) is an efficient classification approach, which finds a hyperplane to separate data from different classes. This hyperplane is determined by support vectors. In existing SVM formulations, the objective function uses L2 norm or L1 norm on slack variables. The number of support vectors is a measure of generalization errors. In this work, we propose a Minimal SVM, which uses L0.5 norm on slack variables. The result model further reduces the number of support vectors and increases the classification performance.
Exercise-Enhanced Sequential Modeling for Student Performance Prediction
Su, Yu (Anhui University) | Liu, Qingwen (iFLYTEK CO.,LTD. ) | Liu, Qi (iFLYTEK CO.,LTD.) | Huang, Zhenya (University of Science and Technology of China ) | Yin, Yu ( University of Science and Technology of China ) | Chen, Enhong ( University of Science and Technology of China ) | Ding, Chris ( University of Science and Technology of China ) | Wei, Si ( University of Science and Technology of China ) | Hu, Guoping (University of Texas at Arlington)
In online education systems, for offering proactive services to students (e.g., personalized exercise recommendation), a crucial demand is to predict student performance (e.g., scores) on future exercising activities. Existing prediction methods mainly exploit the historical exercising records of students, where each exercise is usually represented as the manually labeled knowledge concepts, and the richer information contained in the text description of exercises is still underexplored. In this paper, we propose a novel Exercise-Enhanced Recurrent Neural Network (EERNN) framework for student performance prediction by taking full advantage of both student exercising records and the text of each exercise. Specifically, for modeling the student exercising process, we first design a bidirectional LSTM to learn each exercise representation from its text description without any expertise and information loss. Then, we propose a new LSTM architecture to trace student states (i.e., knowledge states) in their sequential exercising process with the combination of exercise representations. For making final predictions, we design two strategies under EERNN, i.e., EERNNM with Markov property and EERNNA with Attention mechanism. Extensive experiments on large-scale real-world data clearly demonstrate the effectiveness of EERNN framework. Moreover, by incorporating the exercise correlations, EERNN can well deal with the cold start problems from both student and exercise perspectives.