parameter pair
Subspace-Distance-Enabled Active Learning for Efficient Data-Driven Model Reduction of Parametric Dynamical Systems
Kapadia, Harshit, Benner, Peter, Feng, Lihong
In situations where the solution of a high-fidelity dynamical system needs to be evaluated repeatedly, over a vast pool of parametric configurations and in absence of access to the underlying governing equations, data-driven model reduction techniques are preferable. We propose a novel active learning approach to build a parametric data-driven reduced-order model (ROM) by greedily picking the most important parameter samples from the parameter domain. As a result, during the ROM construction phase, the number of high-fidelity solutions dynamically grow in a principled fashion. The high-fidelity solution snapshots are expressed in several parameter-specific linear subspaces, with the help of proper orthogonal decomposition (POD), and the relative distance between these subspaces is used as a guiding mechanism to perform active learning. For successfully achieving this, we provide a distance measure to evaluate the similarity between pairs of linear subspaces with different dimensions, and also show that this distance measure is a metric. The usability of the proposed subspace-distance-enabled active learning (SDE-AL) framework is demonstrated by augmenting two existing non-intrusive reduced-order modeling approaches, and providing their active-learning-driven (ActLearn) extensions, namely, SDE-ActLearn-POD-KSNN, and SDE-ActLearn-POD-NN. Furthermore, we report positive results for two parametric physical models, highlighting the efficiency of the proposed SDE-AL approach.
AMD-DBSCAN: An Adaptive Multi-density DBSCAN for datasets of extremely variable density
Wang, Ziqing, Ye, Zhirong, Du, Yuyang, Mao, Yi, Liu, Yanying, Wu, Ziling, Wang, Jun
DBSCAN has been widely used in density-based clustering algorithms. However, with the increasing demand for Multi-density clustering, previous traditional DSBCAN can not have good clustering results on Multi-density datasets. In order to address this problem, an adaptive Multi-density DBSCAN algorithm (AMD-DBSCAN) is proposed in this paper. An improved parameter adaptation method is proposed in AMD-DBSCAN to search for multiple parameter pairs (i.e., Eps and MinPts), which are the key parameters to determine the clustering results and performance, therefore allowing the model to be applied to Multi-density datasets. Moreover, only one hyperparameter is required for AMD-DBSCAN to avoid the complicated repetitive initialization operations. Furthermore, the variance of the number of neighbors (VNN) is proposed to measure the difference in density between each cluster. The experimental results show that our AMD-DBSCAN reduces execution time by an average of 75% due to lower algorithm complexity compared with the traditional adaptive algorithm. In addition, AMD-DBSCAN improves accuracy by 24.7% on average over the state-of-the-art design on Multi-density datasets of extremely variable density, while having no performance loss in Single-density scenarios. Our code and datasets are available at https://github.com/AlexandreWANG915/AMD-DBSCAN.
Gu
Model selection is an important problem of cost-sensitive SVM (CS-SVM). Although using solution path to find global optimal parameters is a powerful method for model selection, it is a challenge to extend the framework to solve two regularization parameters of CS-SVM simultaneously. To overcome this challenge, we make three main steps in this paper.
Bi-Parameter Space Partition for Cost-Sensitive SVM
Gu, Bin (Nanjing University of Information Science and Technology) | Sheng, Victor S. (University of Central Arkansas) | Li, Shuo (GE HealthCare)
Model selection is an important problem of cost-sensitive SVM (CS-SVM). Although using solution path to find global optimal parameters is a powerful method for model selection, it is a challenge to extend the framework to solve two regularization parameters of CS-SVM simultaneously. To overcome this challenge, we make three main steps in this paper. (i) A critical-regions-based bi-parameter space partition algorithm is proposed to present all piecewise linearities of CS-SVM. (ii) An invariant-regions-based bi-parameter space partition algorithm is further proposed to compute empirical errors for all parameter pairs. (iii) The global optimal solutions for K-fold cross validation are computed by superposing K invariant region based bi-parameter space partitions into one. The three steps constitute the model selection of CS-SVM which can find global optimal parameter pairs in K-fold cross validation. Experimental results on seven normal datsets and four imbalanced datasets, show that our proposed method has better generalization ability and than various kinds of grid search methods, however, with less running time.