
 Wang, Chen


Memetic EDA-Based Approaches to Comprehensive Quality-Aware Automated Semantic Web Service Composition

arXiv.org Artificial Intelligence

Comprehensive quality-aware automated semantic web service composition is an NP-hard problem, where service composition workflows are unknown and comprehensive quality, i.e., Quality of Service (QoS) and Quality of Semantic Matchmaking (QoSM), is optimized simultaneously. The objective is to find a solution with optimized or near-optimized overall QoS and QoSM within polynomial time for a given service request. In this paper, we propose novel memetic EDA-based approaches to tackle this problem. The proposed method investigates the effectiveness of several neighborhood structures of composite services by introducing domain-dependent local search operators. In addition, a joint strategy for the local search procedure is proposed and integrated with a modified EDA to reduce the overall computation time of our memetic approach. To better demonstrate the effectiveness and scalability of our approach, we create a more challenging, augmented version of the service composition benchmark based on WSC-08 \cite{bansal2008wsc} and WSC-09 \cite{kona2009wsc}. Experimental results on this benchmark show that one of our proposed memetic EDA-based approaches (i.e., MEEDA-LOP) significantly outperforms existing state-of-the-art algorithms.
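The memetic ingredient here is a local search over composite-service orderings. As a minimal sketch of that idea, the following first-improvement local search explores the swap neighborhood of a permutation; the cost function is a hypothetical stand-in for the paper's QoS/QoSM evaluation, which is not specified in the abstract.

```python
import random

def local_search_swap(perm, cost, max_iters=200):
    """First-improvement local search over the swap neighborhood of a permutation.

    `cost` maps a permutation to a scalar to be minimized; a candidate is
    accepted only when it strictly improves on the incumbent.
    """
    best = list(perm)
    best_cost = cost(best)
    for _ in range(max_iters):
        i, j = random.sample(range(len(best)), 2)   # pick a random swap neighbor
        cand = list(best)
        cand[i], cand[j] = cand[j], cand[i]
        c = cost(cand)
        if c < best_cost:
            best, best_cost = cand, c
    return best, best_cost
```

In a full memetic EDA, a procedure like this would refine candidate permutations sampled from the learned probability model before they are used to update it.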


The Vulnerabilities of Graph Convolutional Networks: Stronger Attacks and Defensive Techniques

arXiv.org Machine Learning

Graph deep learning models, such as graph convolutional networks (GCNs), achieve remarkable performance on tasks over graph data. Like other types of deep models, graph deep learning models often suffer from adversarial attacks. However, compared with non-graph data, the discrete features, graph connections, and different definitions of imperceptible perturbations bring unique challenges and opportunities to adversarial attacks and defences on graph data. In this paper, we propose both attack and defence techniques. For the attack, we show that the discrete feature problem can be resolved by introducing integrated gradients, which accurately reflect the effect of perturbing certain features or edges while still benefiting from parallel computation. For the defence, we propose to partially learn the adjacency matrix to integrate the information of distant nodes, so that the prediction for a given target is supported by more global graph information rather than just a few neighbouring nodes. This makes attacks harder, since one needs to perturb more features/edges for an attack to succeed. Our experiments on a number of datasets show the effectiveness of the proposed methods.
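Integrated gradients attribute a model's output to its inputs by averaging gradients along a straight path from a baseline to the input, which is what makes them informative for discrete feature flips. A minimal sketch of the estimator, applied to a toy differentiable function rather than a GCN:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=256):
    """Riemann-sum (midpoint rule) approximation of integrated gradients.

    Averages grad_f along the straight-line path from `baseline` to `x`,
    then scales by the displacement, per the standard IG formula.
    """
    alphas = (np.arange(steps) + 0.5) / steps       # midpoints of the path
    total = np.zeros_like(x, dtype=float)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps
```

A useful sanity check is the completeness axiom: the attributions sum to f(x) - f(baseline), which a plain gradient at x does not satisfy for non-linear f.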


Evolutionary Multitasking for Semantic Web Service Composition

arXiv.org Artificial Intelligence

Web services are basic functions of a software system that support the concept of service-oriented architecture. They are often composed together to provide added value, a process known as web service composition. Due to the complexity of this problem, researchers often employ Evolutionary Computation (EC) techniques to efficiently construct composite services with near-optimized functional quality (i.e., Quality of Semantic Matchmaking), non-functional quality (i.e., Quality of Service), or both. With a significant increase in service composition requests, many requests have similar input and output requirements but vary due to different preferences from different user segments. This problem is often treated as multi-objective service composition so as to cope with the preferences of different user segments simultaneously. Rather than taking a multi-objective approach, which gives rise to a solution-selection challenge, in this work we perceive multiple similar service composition requests as jointly forming an evolutionary multi-tasking problem. We propose an effective permutation-based evolutionary multi-tasking approach that can simultaneously generate a set of solutions, one for each service request. We also introduce a neighborhood structure over multiple tasks that allows newly evolved solutions to be evaluated on related tasks. Our proposed method performs better, at only a fraction of the execution time, compared to one state-of-the-art single-tasking EC-based method. We also find that the use of a proper neighborhood structure can enhance the effectiveness of our approach.


WaveletFCNN: A Deep Time Series Classification Model for Wind Turbine Blade Icing Detection

arXiv.org Machine Learning

Wind power, as an alternative to burning fossil fuels, is plentiful and renewable. Data-driven approaches are increasingly popular for inspecting wind turbine failures. In this paper, we propose a novel classification-based anomaly detection system for icing detection on wind turbine blades. We combine deep neural networks and the wavelet transform to identify such failures sequentially over time. In the training phase, we present a wavelet-based fully convolutional neural network (FCNN), namely WaveletFCNN, for time series classification. We improve the original FCNN by augmenting features with wavelet coefficients. WaveletFCNN outperforms the state-of-the-art FCNN for univariate time series classification on the UCR time series archive benchmarks. In the detection phase, we combine sliding-window and majority-vote algorithms to provide timely monitoring of anomalies. The system has been successfully implemented on a real-world dataset from Goldwind Inc., where the classifier is trained on a multivariate time series dataset and the monitoring algorithm is deployed to capture abnormal conditions in signals from a wind farm.
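The feature augmentation step can be illustrated with a single level of the Haar wavelet; the paper does not specify the wavelet family or decomposition depth, so this is only a sketch of the idea of concatenating wavelet coefficients onto the raw series.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar discrete wavelet transform (even-length input).

    Returns the approximation (low-pass) and detail (high-pass) coefficients.
    """
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def augment_with_wavelets(x):
    """Concatenate the raw series with its Haar coefficients as extra features."""
    a, d = haar_dwt(x)
    return np.concatenate([np.asarray(x, dtype=float), a, d])
```

The transform is orthogonal, so the original samples are exactly recoverable from the two coefficient bands; the augmented input simply exposes multi-resolution structure to the convolutional layers.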


Vector Quantization as Sparse Least Square Optimization

arXiv.org Machine Learning

Vector quantization aims to form new vectors/matrices with shared values close to the originals. It can compress data with acceptable information loss, and is of great use in areas like Image Processing, Pattern Recognition, and Machine Learning. In recent years, the importance of quantization has been soaring, as huge potential has been discovered in deploying practical neural networks, one of the most popular research topics. Conventional vector quantization methods usually suffer from their own flaws: quantization by hand-coded domain rules can produce poor results on complex data, while clustering-based algorithms suffer from inexact solutions and high time consumption. In this paper, we explore the vector quantization problem from the new perspective of sparse least square optimization and design multiple algorithms with program implementations. Specifically, deriving from a sparse form of the coefficient matrix, three types of sparse least squares, with $l_0$, $l_1$, and generalized $l_1 + l_2$ penalizations, are designed and implemented respectively. In addition, to produce quantization results with a given number of quantized values (instead of a penalization coefficient $\lambda$), we propose a cluster-based least square quantization method, which can also be regarded as improving the information preservation of conventional clustering algorithms. The algorithms are tested on various data and tasks, and their computational properties are analyzed. The paper offers a new perspective on vector quantization, and the proposed algorithms provide more appropriate options for quantization tasks under different circumstances.
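The cluster-based variant can be sketched with 1-D Lloyd iterations, where the least-squares codebook update (each codeword is the mean of its assigned values) alternates with nearest-codeword assignment. This is a generic illustration of cluster-based least-square quantization, not the paper's exact algorithm; the quantile initialization is an assumption made here for determinism.

```python
import numpy as np

def quantize_kmeans_1d(x, k, iters=50):
    """Quantize a 1-D array to k shared values via Lloyd iterations.

    Alternates (1) assigning each value to its nearest codeword and
    (2) the least-squares codebook update: each codeword becomes the
    mean of the values assigned to it.
    """
    x = np.asarray(x, dtype=float)
    codebook = np.quantile(x, (np.arange(k) + 0.5) / k)   # spread initial codewords
    for _ in range(iters):
        assign = np.argmin(np.abs(x[:, None] - codebook[None, :]), axis=1)
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = x[assign == j].mean()
    return codebook[assign], codebook
```

The output shares at most k distinct values, which is the "given number of quantized values" regime the paper contrasts with penalization-controlled sparsity.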


Discriminative Semi-Supervised Feature Selection via Rescaled Least Squares Regression-Supplement

AAAI Conferences

In this paper, we propose a Discriminative Semi-Supervised Feature Selection (DSSFS) method. In this method, an ε-dragging technique is introduced into Rescaled Least Squares Regression in order to enlarge the distances between different classes. An iterative method is proposed to simultaneously learn the regression coefficients and the ε-dragging matrix, and to predict the unknown class labels. Experimental results show the superiority of DSSFS.
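The ε-dragging idea (from discriminative least squares regression) relaxes the binary label matrix Y into Y + B⊙M, where B holds the dragging directions (+1 for the true class, -1 otherwise) and M ≥ 0 is learned, so targets are pushed apart across classes. A minimal fully supervised sketch of the alternating updates, without the rescaling and semi-supervised parts of the paper's method:

```python
import numpy as np

def epsilon_dragging(X, Y, iters=10, lam=1e-3):
    """Alternating updates for epsilon-dragging least squares regression.

    X: (n, d) features; Y: (n, c) one-hot labels. Alternates between a
    ridge-regularized least-squares solve for W against the dragged
    targets T = Y + B*M, and the closed-form nonnegative update of M.
    """
    d = X.shape[1]
    B = np.where(Y > 0, 1.0, -1.0)           # dragging directions
    M = np.zeros_like(Y, dtype=float)        # learned slack, kept >= 0
    for _ in range(iters):
        T = Y + B * M
        W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ T)
        M = np.maximum(B * (X @ W - Y), 0.0)
    return W, M
```

Each step only drags a prediction further beyond its target in the class-consistent direction, which is what enlarges between-class distances.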


Kernel Cross-Correlator

AAAI Conferences

The cross-correlator plays a significant role in many visual perception tasks, such as object detection and tracking. Going beyond the linear cross-correlator, this paper proposes a kernel cross-correlator (KCC) that breaks traditional limitations. First, by introducing the kernel trick, the KCC extends linear cross-correlation to non-linear spaces, making it more robust to signal noise and distortion. Second, its connection to existing works shows that KCC provides a unified solution for correlation filters. Third, KCC is applicable to any kernel function and is not limited to circulant structure in the training data, so it is able to predict affine transformations with customized properties. Last, by leveraging the fast Fourier transform (FFT), KCC eliminates the direct calculation of kernel vectors, achieving better performance at a reasonable computational cost. Comprehensive experiments on visual tracking and human activity recognition using wearable devices demonstrate its robustness, flexibility, and efficiency. The source code for both experiments is released at https://github.com/wang-chen/KCC.
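The FFT trick mentioned above can be sketched for the Gaussian kernel on 1-D signals: the kernel value at every circular shift depends only on the linear cross-correlation, which the correlation theorem computes in one FFT pass instead of building kernel vectors shift by shift. This is a simplified single-signal illustration, not the full KCC formulation; the bandwidth and normalization below are choices made here.

```python
import numpy as np

def gaussian_kernel_correlation(x, z, sigma=0.5):
    """Gaussian-kernel correlation of two 1-D signals at all circular shifts.

    Uses the correlation theorem: corr[n] = sum_m x[m] * z[m+n] is obtained
    from ifft(conj(fft(x)) * fft(z)), so no explicit kernel vectors are built.
    """
    corr = np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(z)))
    # ||x - shifted z||^2 = ||x||^2 + ||z||^2 - 2 * corr, fed into the Gaussian kernel
    return np.exp(-(np.dot(x, x) + np.dot(z, z) - 2.0 * corr) / (sigma ** 2 * len(x)))
```

When z is a circularly shifted copy of x, the response peaks exactly at the shift, which is the property correlation-filter trackers exploit to localize a target.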


Robust Propensity Score Computation Method based on Machine Learning with Label-corrupted Data

arXiv.org Machine Learning

In biostatistics, the propensity score is a common approach to analyzing covariate imbalance and adjusting for confounding covariates in order to eliminate differences between groups. While there is an abundance of methods to compute propensity scores, a common issue is corrupted labels in the dataset. For example, data collected from patients may contain samples that were treated mistakenly, and the computing methods may incorporate them as misleading information. In this paper, we propose a machine learning-based method to handle this problem. Specifically, we utilize the fact that the majority of samples should be correctly labeled, and design an approach that first clusters the data with spectral clustering and then samples a new dataset from a distribution derived from the clustering results. The propensity score is computed with XGBoost, and a mathematical justification of our method is provided. The experimental results illustrate that XGBoost propensity scores computed on data processed by our method outperform the same method on the original data, and the advantage of our method increases as we add artificial corruption to the dataset. Meanwhile, the implementation of XGBoost to compute propensity scores for multiple treatments is also pioneering work in the area.
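A propensity score is simply an estimate of P(treated | covariates) from a probabilistic classifier. As a minimal sketch, the following fits that probability with logistic regression via gradient descent; logistic regression stands in for the XGBoost classifier used in the paper, and the learning-rate and iteration settings are illustrative assumptions.

```python
import numpy as np

def propensity_scores(X, t, lr=0.1, iters=2000):
    """Estimate propensity scores P(t=1 | x) with logistic regression.

    X: (n, d) covariates; t: (n,) binary treatment indicator.
    Fits the weights by full-batch gradient descent on the log-loss,
    then returns the fitted treatment probabilities.
    """
    X1 = np.hstack([X, np.ones((len(X), 1))])     # append intercept column
    w = np.zeros(X1.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X1 @ w))
        w -= lr * X1.T @ (p - t) / len(X)         # gradient of mean log-loss
    return 1.0 / (1.0 + np.exp(-X1 @ w))
```

In the paper's pipeline, this estimation step would run after the spectral-clustering-based resampling that filters out label-corrupted samples.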


Extracting Highly Effective Features for Supervised Learning via Simultaneous Tensor Factorization

AAAI Conferences

Real-world data is usually generated over multiple time periods and associated with multiple labels, which can be represented as multiple labeled tensor sequences. These sequences are linked together, sharing some common features while exhibiting their own unique features. Conventional tensor factorization techniques are limited to extracting either common or unique features, but not both simultaneously. However, both types of features are important in many machine learning systems, as they inherently affect the systems' performance. In this paper, we propose a novel supervised tensor factorization technique which simultaneously extracts ordered common and unique features. Classification using features extracted by our method on the CIFAR-10 dataset achieves significantly better performance than other factorization methods, illustrating the effectiveness of the proposed technique.