Xie, Ying
Improving Network Threat Detection by Knowledge Graph, Large Language Model, and Imbalanced Learning
Zhang, Lili, Zhu, Quanyan, Ray, Herman, Xie, Ying
Network threat detection has been challenging due to the complexities of attack activities and the limitation of historical threat data to learn from. To help enhance the existing practices of using analytics, machine learning, and artificial intelligence methods to detect the network threats, we propose an integrated modelling framework, where Knowledge Graph is used to analyze the users' activity patterns, Imbalanced Learning techniques are used to prune and weigh Knowledge Graph, and LLM is used to retrieve and interpret the users' activities from Knowledge Graph. The proposed framework is applied to Agile Threat Detection through Online Sequential Learning. The preliminary results show the improved threat capture rate by 3%-4% and the increased interpretabilities of risk predictions based on the users' activities.
Multi-Modality Transformer for E-Commerce: Inferring User Purchase Intention to Bridge the Query-Product Gap
Mallapragada, Srivatsa, Xie, Ying, Chawan, Varsha Rani, Hailat, Zeyad, Wang, Yuanbo
E-commerce click-stream data and product catalogs offer critical user behavior insights and product knowledge. This paper propose a multi-modal transformer termed as PINCER, that leverages the above data sources to transform initial user queries into pseudo-product representations. By tapping into these external data sources, our model can infer users' potential purchase intent from their limited queries and capture query relevant product features. We demonstrate our model's superior performance over state-of-the-art alternatives on e-commerce online retrieval in both controlled and real-world experiments. Our ablation studies confirm that the proposed transformer architecture and integrated learning strategies enable the mining of key data sources to infer purchase intent, extract product features, and enhance the transformation pipeline from queries to more accurate pseudo-product representations.
Text classification in shipping industry using unsupervised models and Transformer based supervised models
Xie, Ying, Song, Dongping
Obtaining labelled data in a particular context could be expensive and time consuming. Although different algorithms, including unsupervised learning, semi-supervised learning, self-learning have been adopted, the performance of text classification varies with context. Given the lack of labelled dataset, we proposed a novel and simple unsupervised text classification model to classify cargo content in international shipping industry using the Standard International Trade Classification (SITC) codes. Our method stems from representing words using pretrained Glove Word Embeddings and finding the most likely label using Cosine Similarity. To compare unsupervised text classification model with supervised classification, we also applied several Transformer models to classify cargo content. Due to lack of training data, the SITC numerical codes and the corresponding textual descriptions were used as training data. A small number of manually labelled cargo content data was used to evaluate the classification performances of the unsupervised classification and the Transformer based supervised classification. The comparison reveals that unsupervised classification significantly outperforms Transformer based supervised classification even after increasing the size of the training dataset by 30%. Lacking training data is a key bottleneck that prohibits deep learning models (such as Transformers) from successful practical applications. Unsupervised classification can provide an alternative efficient and effective method to classify text when there is scarce training data.
Deep Embedding Kernel
Le, Linh, Xie, Ying
In this paper, we propose a novel supervised learning method that is called Deep Embedding Kernel (DEK). DEK combines the advantages of deep learning and kernel methods in a unified framework. More specifically, DEK is a learnable kernel represented by a newly designed deep architecture. Compared with pre-defined kernels, this kernel can be explicitly trained to map data to an optimized high-level feature space where data may have favorable features toward the application. Compared with typical deep learning using SoftMax or logistic regression as the top layer, DEK is expected to be more generalizable to new data. Experimental results show that DEK has superior performance than typical machine learning methods in identity detection, classification, regression, dimension reduction, and transfer learning.
Rank Ordering Constraints Elimination with Application for Kernel Learning
Xie, Ying (Anhui University) | Ding, Chris H. Q. (University of Texas at Arlington) | Gong, Yihong (Xian Jiaotong University) | Wu, Zongze (Guangdong University of Technology)
A number of machine learning domains,such as information retrieval, recommender systems, kernel learning, neural network-biological systems etc,deal with importance scores. Very often, there existsome prior knowledge that could help improve the performance.In many cases, these prior knowledge manifest themselves in the rank ordering constraints.These inequality constraints are usually very difficult to deal with in optimization.In this paper, we provide a slack variable transformation methods, which effectively eliminatesthe rank ordering inequality constraints, and thus simplify the learning task significantly.We apply this transformation in kernel learning problem, and also provide an efficient algorithm tosolved the transformed system. On seven datasets,our approach reduces the computational time by orders of magnitudes as compared to the current standardquadratically constrained quadratic programming(QCQP) optimization approach.
Uncorrelated Lasso
Chen, Si-Bao (Anhui University) | Ding, Chris (University of Texas at Arlington) | Luo, Bin (Anhui University) | Xie, Ying (Anhui University)
In this paper, motivated by the previous sparse learning In many regression applications, there are too many unrelated based research, we propose to add variable correlation into predictors which may hide the relationship between the sparse-learning-based variable selection approach. We response and the most related predictors. A common way to note that in previous Lasso-type variable selection, variable resolve this problem is variable selection, that is to select a correlations are not taken into account, while in most subset of the most representative or discriminative predictors real-life data, predictors are often correlated. Strongly correlated from the input predictor set. The central requirement is that predictors share similar properties, and have some good predictor set contains predictors that are highly correlated overlapped information.