Yang, Chaofei
InterFormer: Towards Effective Heterogeneous Interaction Learning for Click-Through Rate Prediction
Zeng, Zhichen, Liu, Xiaolong, Hang, Mengyue, Liu, Xiaoyi, Zhou, Qinghai, Yang, Chaofei, Liu, Yiqun, Ruan, Yichen, Chen, Laming, Chen, Yuxin, Hao, Yujia, Xu, Jiaqi, Nie, Jade, Liu, Xi, Zhang, Buyun, Wen, Wei, Yuan, Siyang, Wang, Kai, Chen, Wen-Yen, Han, Yiping, Li, Huayu, Yang, Chunzhi, Long, Bo, Yu, Philip S., Tong, Hanghang, Yang, Jiyan
Click-through rate (CTR) prediction, which predicts the probability of a user clicking an ad, is a fundamental task in recommender systems. The emergence of heterogeneous information, such as user profile and behavior sequences, depicts user interests from different aspects. A mutually beneficial integration of heterogeneous information is the cornerstone towards the success of CTR prediction. However, most of the existing methods suffer from two fundamental limitations, including (1) insufficient inter-mode interaction due to the unidirectional information flow between modes, and (2) aggressive information aggregation caused by early summarization, resulting in excessive information loss. To address the above limitations, we propose a novel module named InterFormer to learn heterogeneous information interaction in an interleaving style. To achieve better interaction learning, InterFormer enables bidirectional information flow for mutually beneficial learning across different modes. To avoid aggressive information aggregation, we retain complete information in each data mode and use a separate bridging arch for effective information selection and summarization. Our proposed InterFormer achieves state-of-the-art performance on three public datasets and a large-scale industrial dataset.
A Collaborative Ensemble Framework for CTR Prediction
Liu, Xiaolong, Zeng, Zhichen, Liu, Xiaoyi, Yuan, Siyang, Song, Weinan, Hang, Mengyue, Liu, Yiqun, Yang, Chaofei, Kim, Donghyun, Chen, Wen-Yen, Yang, Jiyan, Han, Yiping, Jin, Rong, Long, Bo, Tong, Hanghang, Yu, Philip S.
Recent advances in foundation models have established scaling laws that enable the development of larger models to achieve enhanced performance, motivating extensive research into large-scale recommendation models. However, simply increasing the model size in recommendation systems, even with large amounts of data, does not always result in the expected performance improvements. In this paper, we propose a novel framework, Collaborative Ensemble Training Network (CETNet), to leverage multiple distinct models, each with its own embedding table, to capture unique feature interaction patterns. Unlike naive model scaling, our approach emphasizes diversity and collaboration through collaborative learning, where models iteratively refine their predictions. To dynamically balance contributions from each model, we introduce a confidence-based fusion mechanism using general softmax, where model confidence is computed via negation entropy. This design ensures that more confident models have a greater influence on the final prediction while benefiting from the complementary strengths of other models. We validate our framework on three public datasets (AmazonElectronics, TaobaoAds, and KuaiVideo) as well as a large-scale industrial dataset from Meta, demonstrating its superior performance over individual models and state-of-the-art baselines. Additionally, we conduct further experiments on the Criteo and Avazu datasets to compare our method with the multi-embedding paradigm. Our results show that our framework achieves comparable or better performance with smaller embedding sizes, offering a scalable and efficient solution for CTR prediction tasks.
Generative Poisoning Attack Method Against Neural Networks
Yang, Chaofei, Wu, Qing, Li, Hai, Chen, Yiran
Poisoning attack is identified as a severe security threat to machine learning algorithms. In many applications, for example, deep neural network (DNN) models collect public data as the inputs to perform re-training, where the input data can be poisoned. Although poisoning attack against support vector machines (SVM) has been extensively studied before, there is still very limited knowledge about how such attack can be implemented on neural networks (NN), especially DNNs. In this work, we first examine the possibility of applying traditional gradient-based method (named as the direct gradient method) to generate poisoned data against NNs by leveraging the gradient of the target model w.r.t. the normal data. We then propose a generative method to accelerate the generation rate of the poisoned data: an auto-encoder (generator) used to generate poisoned data is updated by a reward function of the loss, and the target NN model (discriminator) receives the poisoned data to calculate the loss w.r.t. the normal data. Our experiment results show that the generative method can speed up the poisoned data generation rate by up to 239.38x compared with the direct gradient method, with slightly lower model accuracy degradation. A countermeasure is also designed to detect such poisoning attack methods by checking the loss of the target model.