Goto

Collaborating Authors

 Liu, Jianyu


MIM: Multi-modal Content Interest Modeling Paradigm for User Behavior Modeling

arXiv.org Artificial Intelligence

Click-Through Rate (CTR) prediction is a crucial task in recommendation systems, online searches, and advertising platforms, where accurately capturing users' real interests in content is essential for performance. However, existing methods heavily rely on ID embeddings, which fail to reflect users' true preferences for content such as images and titles. This limitation becomes particularly evident in cold-start and long-tail scenarios, where traditional approaches struggle to deliver effective results. To address these challenges, we propose a novel Multi-modal Content Interest Modeling paradigm (MIM), which consists of three key stages: Pre-training, Content-Interest-Aware Supervised Fine-Tuning (C-SFT), and Content-Interest-Aware UBM (CiUBM). The pre-training stage adapts foundational models to domain-specific data, enabling the extraction of high-quality multi-modal embeddings. The C-SFT stage bridges the semantic gap between content and user interests by leveraging user behavior signals to guide the alignment of embeddings with user preferences. Finally, the CiUBM stage integrates multi-modal embeddings and ID-based collaborative filtering signals into a unified framework. Comprehensive offline experiments and online A/B tests conducted on the Taobao, one of the world's largest e-commerce platforms, demonstrated the effectiveness and efficiency of MIM method. The method has been successfully deployed online, achieving a significant increase of +14.14% in CTR and +4.12% in RPM, showcasing its industrial applicability and substantial impact on platform performance. To promote further research, we have publicly released the code and dataset at https://pan.quark.cn/s/8fc8ec3e74f3.


PRIMO: Progressive Induction for Multi-hop Open Rule Generation

arXiv.org Artificial Intelligence

Open rule refer to the implication from premise atoms to hypothesis atoms, which captures various relations between instances in the real world. Injecting open rule knowledge into the machine helps to improve the performance of downstream tasks such as dialogue and relation extraction. Existing approaches focus on single-hop open rule generation, ignoring multi-hop scenarios, leading to logical inconsistencies between premise and hypothesis atoms, as well as semantic duplication of generated rule atoms. To address these issues, we propose a progressive multi-stage open rule generation method called PRIMO. We introduce ontology information during the rule generation stage to reduce ambiguity and improve rule accuracy. PRIMO constructs a multi-stage structure consisting of generation, extraction, and ranking modules to fully leverage the latent knowledge within the language model across multiple dimensions. Furthermore, we employ reinforcement learning from human feedback to further optimize model, enhancing the model's understanding of commonsense knowledge. Experiments show that compared to baseline models, PRIMO significantly improves rule quality and diversity while reducing the repetition rate of rule atoms.


A Combination Model Based on Sequential General Variational Mode Decomposition Method for Time Series Prediction

arXiv.org Artificial Intelligence

For example, combining ARIMA with various decomposition algorithms such as Empirical Mode Decomposition (EMD) and Variational Mode Decomposition (VMD) for predicting complex time series; For example, using an improved ARMA model for stock market forecasting. However, the above models need to be built on the basis of stable sequence data, and usually require testing and preprocessing of the original data, which may lead to the loss of some hidden information, especially in big data samples, and this disadvantage is easily magnified. With the development of computer technology, intelligent models represented by artificial neural networks (ANNs) are gradually emerging. This type of model is good at handling incomplete, fuzzy, uncertain, or irregular data, and has a good fit to nonlinear relationships. Shallow neural networks represented by backpropagation neural networks (BPNN) and shallow machine learning represented by support vector machines (SVM) are also widely used in financial market prediction. However, shallow neural networks do not consider the temporal nature of data, and financial time series often have certain long-term dependencies. Therefore, recurrent neural networks (RNNs) with memory function have become the latest choice. The output of RNN at a certain moment can be used as input to feedback to neurons again, and this cascade structure is very suitable for time series data, which can preserve the dependency relationships in the data.


A Combination Model for Time Series Prediction using LSTM via Extracting Dynamic Features Based on Spatial Smoothing and Sequential General Variational Mode Decomposition

arXiv.org Artificial Intelligence

In order to solve the problems such as difficult to extract effective features and low accuracy of sales volume prediction caused by complex relationships such as market sales volume in time series prediction, we proposed a time series prediction method of market sales volume based on Sequential General VMD and spatial smoothing Long short-term memory neural network (SS-LSTM) combination model. Firstly, the spatial smoothing algorithm is used to decompose and calculate the sample data of related industry sectors affected by the linkage effect of market sectors, extracting modal features containing information via Sequential General VMD on overall market and specific price trends; Then, according to the background of different Market data sets, LSTM network is used to model and predict the price of fundamental data and modal characteristics. The experimental results of data prediction with seasonal and periodic trends show that this method can achieve higher price prediction accuracy and more accurate accuracy in specific market contexts compared to traditional prediction methods Describe the changes in market sales volume.


A Web Scale Entity Extraction System

arXiv.org Artificial Intelligence

Understanding the semantic meaning of content on the web through the lens of entities and concepts has many practical advantages. However, when building large-scale entity extraction systems, practitioners are facing unique challenges involving finding the best ways to leverage the scale and variety of data available on internet platforms. We present learnings from our efforts in building an entity extraction system for multiple document types at large scale using multi-modal Transformers. We empirically demonstrate the effectiveness of multi-lingual, multi-task and cross-document type learning. We also discuss the label collection schemes that help to minimize the amount of noise in the collected data.


Clustering as an Evaluation Protocol for Knowledge Embedding Representation of Categorised Multi-relational Data in the Clinical Domain

arXiv.org Artificial Intelligence

Learning knowledge representation is an increasingly important technology applicable in many domain-specific machine learning problems. We discuss the effectiveness of traditional Link Prediction or Knowledge Graph Completion evaluation protocol when embedding knowledge representation for categorised multi-relational data in the clinical domain. Link prediction uses to split the data into training and evaluation subsets, leading to loss of information along training and harming the knowledge representation model accuracy. We propose a Clustering Evaluation Protocol as a replacement alternative to the traditionally used evaluation tasks. We used embedding models trained by a knowledge embedding approach which has been evaluated with clinical datasets. Experimental results with Pearson and Spearman correlations show strong evidence that the novel proposed evaluation protocol is pottentially able to replace link prediction.