AITopics | Zhou, Guorui

Collaborating Authors

Zhou, Guorui

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CRM: Retrieval Model with Controllable Condition

Liu, Chi, Cao, Jiangxia, Huang, Rui, Cai, Kuo, Ding, Weifeng, Luo, Qiang, Gai, Kun, Zhou, Guorui

arXiv.org Artificial IntelligenceDec-18-2024

Recommendation systems (RecSys) are designed to connect users with relevant items from a vast pool of candidates while aligning with the business goals of the platform. A typical industrial RecSys is composed of two main stages, retrieval and ranking: (1) the retrieval stage aims at searching hundreds of item candidates satisfied user interests; (2) based on the retrieved items, the ranking stage aims at selecting the best dozen items by multiple targets estimation for each item candidate, including classification and regression targets. Compared with ranking model, the retrieval model absence of item candidate information during inference, therefore retrieval models are often trained by classification target only (e.g., click-through rate), but failed to incorporate regression target (e.g., the expected watch-time), which limit the effectiveness of retrieval. In this paper, we propose the Controllable Retrieval Model (CRM), which integrates regression information as conditional features into the two-tower retrieval paradigm. This modification enables the retrieval stage could fulfill the target gap with ranking model, enhancing the retrieval model ability to search item candidates satisfied the user interests and condition effectively. We validate the effectiveness of CRM through real-world A/B testing and demonstrate its successful deployment in Kuaishou short-video recommendation system, which serves over 400 million users.

artificial intelligence, machine learning, social media, (15 more...)

arXiv.org Artificial Intelligence

2412.13844

Country:

Asia > China (0.31)
North America > United States (0.30)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.97)

Add feedback

QARM: Quantitative Alignment Multi-Modal Recommendation at Kuaishou

Luo, Xinchen, Cao, Jiangxia, Sun, Tianyu, Yu, Jinkai, Huang, Rui, Yuan, Wei, Lin, Hezheng, Zheng, Yichen, Wang, Shiyao, Hu, Qigen, Qiu, Changqing, Zhang, Jiaqi, Zhang, Xu, Yan, Zhiheng, Zhang, Jingming, Zhang, Simin, Wen, Mingxing, Liu, Zhaojie, Gai, Kun, Zhou, Guorui

arXiv.org Artificial IntelligenceNov-18-2024

In recent years, with the significant evolution of multi-modal large models, many recommender researchers realized the potential of multi-modal information for user interest modeling. In industry, a wide-used modeling architecture is a cascading paradigm: (1) first pre-training a multi-modal model to provide omnipotent representations for downstream services; (2) The downstream recommendation model takes the multi-modal representation as additional input to fit real user-item behaviours. Although such paradigm achieves remarkable improvements, however, there still exist two problems that limit model performance: (1) Representation Unmatching: The pre-trained multi-modal model is always supervised by the classic NLP/CV tasks, while the recommendation models are supervised by real user-item interaction. As a result, the two fundamentally different tasks' goals were relatively separate, and there was a lack of consistent objective on their representations; (2) Representation Unlearning: The generated multi-modal representations are always stored in cache store and serve as extra fixed input of recommendation model, thus could not be updated by recommendation model gradient, further unfriendly for downstream training. Inspired by the two difficulties challenges in downstream tasks usage, we introduce a quantitative multi-modal framework to customize the specialized and trainable multi-modal information for different downstream models.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.11739

Country: North America > United States (0.30)

Genre: Research Report (0.82)

Industry: Information Technology > Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Communications > Social Media (0.85)

Add feedback

KuaiFormer: Transformer-Based Retrieval at Kuaishou

Liu, Chi, Cao, Jiangxia, Huang, Rui, Zheng, Kai, Luo, Qiang, Gai, Kun, Zhou, Guorui

arXiv.org Artificial IntelligenceNov-15-2024

In large-scale content recommendation systems, retrieval serves as the initial stage in the pipeline, responsible for selecting thousands of candidate items from billions of options to pass on to ranking modules. Traditionally, the dominant retrieval method has been Embedding-Based Retrieval (EBR) using a Deep Neural Network (DNN) dual-tower structure. However, applying transformer in retrieval tasks has been the focus of recent research, though real-world industrial deployment still presents significant challenges. In this paper, we introduce KuaiFormer, a novel transformer-based retrieval framework deployed in a large-scale content recommendation system. KuaiFormer fundamentally redefines the retrieval process by shifting from conventional score estimation tasks (such as click-through rate estimate) to a transformer-driven Next Action Prediction paradigm. This shift enables more effective real-time interest acquisition and multi-interest extraction, significantly enhancing retrieval performance. KuaiFormer has been successfully integrated into Kuaishou App's short-video recommendation system since May 2024, serving over 400 million daily active users and resulting in a marked increase in average daily usage time of Kuaishou users. We provide insights into both the technical and business aspects of deploying transformer in large-scale recommendation systems, addressing practical challenges encountered during industrial implementation. Our findings offer valuable guidance for engineers and researchers aiming to leverage transformer models to optimize large-scale content recommendation systems.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2411.10057

Country:

North America > United States (0.16)
Asia > China (0.15)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model

Du, Xinrun, Yu, Zhouliang, Gao, Songyang, Pan, Ding, Cheng, Yuyang, Ma, Ziyang, Yuan, Ruibin, Qu, Xingwei, Liu, Jiaheng, Zheng, Tianyu, Luo, Xinchen, Zhou, Guorui, Chen, Wenhu, Zhang, Ge

arXiv.org Artificial IntelligenceJul-10-2024

In this study, we introduce CT-LLM, a 2B large language model (LLM) that illustrates a pivotal shift towards prioritizing the Chinese language in developing LLMs. Uniquely initiated from scratch, CT-LLM diverges from the conventional methodology by primarily incorporating Chinese textual data, utilizing an extensive corpus of 1,200 billion tokens, including 800 billion Chinese tokens, 300 billion English tokens, and 100 billion code tokens. This strategic composition facilitates the model's exceptional proficiency in understanding and processing Chinese, a capability further enhanced through alignment techniques. Demonstrating remarkable performance on the CHC-Bench, CT-LLM excels in Chinese language tasks, and showcases its adeptness in English through SFT. This research challenges the prevailing paradigm of training LLMs predominantly on English corpora and then adapting them to other languages, broadening the horizons for LLM training methodologies. By open-sourcing the full process of training a Chinese LLM, including a detailed data processing procedure with the obtained Massive Appropriate Pretraining Chinese Corpus (MAP-CC), a well-chosen multidisciplinary Chinese Hard Case Benchmark (CHC-Bench), and the 2B-size Chinese Tiny LLM (CT-LLM), we aim to foster further exploration and innovation in both academia and industry, paving the way for more inclusive and versatile language models.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2404.04167

Country:

North America > United States (0.14)
Asia > Middle East (0.14)
Asia > China (0.14)
(3 more...)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

MMBee: Live Streaming Gift-Sending Recommendations via Multi-Modal Fusion and Behaviour Expansion

Deng, Jiaxin, Wang, Shiyao, Wang, Yuchen, Qi, Jiansong, Zhao, Liqin, Zhou, Guorui, Meng, Gaofeng

arXiv.org Artificial IntelligenceJun-15-2024

Live streaming services are becoming increasingly popular due to real-time interactions and entertainment. Viewers can chat and send comments or virtual gifts to express their preferences for the streamers. Accurately modeling the gifting interaction not only enhances users' experience but also increases streamers' revenue. Previous studies on live streaming gifting prediction treat this task as a conventional recommendation problem, and model users' preferences using categorical data and observed historical behaviors. However, it is challenging to precisely describe the real-time content changes in live streaming using limited categorical information. Moreover, due to the sparsity of gifting behaviors, capturing the preferences and intentions of users is quite difficult. In this work, we propose MMBee based on real-time Multi-Modal Fusion and Behaviour Expansion to address these issues. Specifically, we first present a Multi-modal Fusion Module with Learnable Query (MFQ) to perceive the dynamic content of streaming segments and process complex multi-modal interactions, including images, text comments and speech. To alleviate the sparsity issue of gifting behaviors, we present a novel Graph-guided Interest Expansion (GIE) approach that learns both user and streamer representations on large-scale gifting graphs with multi-modal attributes. Comprehensive experiment results show that MMBee achieves significant performance improvements on both public datasets and Kuaishou real-world streaming datasets and the effectiveness has been further validated through online A/B experiments. MMBee has been deployed and is serving hundreds of millions of users at Kuaishou.

artificial intelligence, graph, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2407.00056

Country:

Europe > Spain (0.16)
Asia > China (0.15)

Genre: Research Report (1.00)

Industry: Media (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Data Science (0.95)
(2 more...)

Add feedback

Ensure Timeliness and Accuracy: A Novel Sliding Window Data Stream Paradigm for Live Streaming Recommendation

Liang, Fengqi, Zheng, Baigong, Zhao, Liqin, Zhou, Guorui, Wang, Qian, Niu, Yanan

arXiv.org Artificial IntelligenceFeb-22-2024

Live streaming recommender system is specifically designed to recommend real-time live streaming of interest to users. Due to the dynamic changes of live content, improving the timeliness of the live streaming recommender system is a critical problem. Intuitively, the timeliness of the data determines the upper bound of the timeliness that models can learn. However, none of the previous works addresses the timeliness problem of the live streaming recommender system from the perspective of data stream design. Employing the conventional fixed window data stream paradigm introduces a trade-off dilemma between labeling accuracy and timeliness. In this paper, we propose a new data stream design paradigm, dubbed Sliver, that addresses the timeliness and accuracy problem of labels by reducing the window size and implementing a sliding window correspondingly. Meanwhile, we propose a time-sensitive re-reco strategy reducing the latency between request and impression to improve the timeliness of the recommendation service and features by periodically requesting the recommendation service. To demonstrate the effectiveness of our approach, we conduct offline experiments on a multi-task live streaming dataset with labeling timestamps collected from the Kuaishou live streaming platform. Experimental results demonstrate that Sliver outperforms two fixed-window data streams with varying window sizes across all targets in four typical multi-task recommendation models. Furthermore, we deployed Sliver on the Kuaishou live streaming platform. Results of the online A/B test show a significant improvement in click-through rate (CTR), and new follow number (NFN), further validating the effectiveness of Sliver.

artificial intelligence, data stream, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2402.14399

Country:

Asia > China (0.15)
North America > United States (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Education > Educational Setting > Online (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Multi-behavior Self-supervised Learning for Recommendation

Xu, Jingcao, Wang, Chaokun, Wu, Cheng, Song, Yang, Zheng, Kai, Wang, Xiaowei, Wang, Changping, Zhou, Guorui, Gai, Kun

arXiv.org Artificial IntelligenceMay-22-2023

Modern recommender systems often deal with a variety of user interactions, e.g., click, forward, purchase, etc., which requires the underlying recommender engines to fully understand and leverage multi-behavior data from users. Despite recent efforts towards making use of heterogeneous data, multi-behavior recommendation still faces great challenges. Firstly, sparse target signals and noisy auxiliary interactions remain an issue. Secondly, existing methods utilizing self-supervised learning (SSL) to tackle the data sparsity neglect the serious optimization imbalance between the SSL task and the target task. Hence, we propose a Multi-Behavior Self-Supervised Learning (MBSSL) framework together with an adaptive optimization method. Specifically, we devise a behavior-aware graph neural network incorporating the self-attention mechanism to capture behavior multiplicity and dependencies. To increase the robustness to data sparsity under the target behavior and noisy interactions from auxiliary behaviors, we propose a novel self-supervised learning paradigm to conduct node self-discrimination at both inter-behavior and intra-behavior levels. In addition, we develop a customized optimization strategy through hybrid manipulation on gradients to adaptively balance the self-supervised learning task and the main supervised recommendation task. Extensive experiments on five real-world datasets demonstrate the consistent improvements obtained by MBSSL over ten state-of-the art (SOTA) baselines. We release our model implementation at: https://github.com/Scofield666/MBSSL.git.

artificial intelligence, inductive learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3539618.3591734

2305.18238

Country:

Asia > Taiwan (0.16)
Asia > China (0.15)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Add feedback

Instant Representation Learning for Recommendation over Large Dynamic Graphs

Wu, Cheng, Wang, Chaokun, Xu, Jingcao, Fang, Ziwei, Gu, Tiankai, Wang, Changping, Song, Yang, Zheng, Kai, Wang, Xiaowei, Zhou, Guorui

arXiv.org Artificial IntelligenceMay-22-2023

Recommender systems are able to learn user preferences based on user and item representations via their historical behaviors. To improve representation learning, recent recommendation models start leveraging information from various behavior types exhibited by users. In real-world scenarios, the user behavioral graph is not only multiplex but also dynamic, i.e., the graph evolves rapidly over time, with various types of nodes and edges added or deleted, which causes the Neighborhood Disturbance. Nevertheless, most existing methods neglect such streaming dynamics and thus need to be retrained once the graph has significantly evolved, making them unsuitable in the online learning environment. Furthermore, the Neighborhood Disturbance existing in dynamic graphs deteriorates the performance of neighbor-aggregation based graph models. To this end, we propose SUPA, a novel graph neural network for dynamic multiplex heterogeneous graphs. Compared to neighbor-aggregation architecture, SUPA develops a sample-update-propagate architecture to alleviate neighborhood disturbance. Specifically, for each new edge, SUPA samples an influenced subgraph, updates the representations of the two interactive nodes, and propagates the interaction information to the sampled subgraph. Furthermore, to train SUPA incrementally online, we propose InsLearn, an efficient workflow for single-pass training of large dynamic graphs. Extensive experimental results on six real-world datasets show that SUPA has a good generalization ability and is superior to sixteen state-of-the-art baseline methods. The source code is available at https://github.com/shatter15/SUPA.

data mining, information, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2305.18622

Country:

Asia > China (0.14)
North America > United States > California (0.14)

Genre: Research Report (1.00)

Industry: Education (0.86)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Data Science > Data Mining (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)

Add feedback

An End-to-End Framework for Marketing Effectiveness Optimization under Budget Constraint

Yan, Ziang, Wang, Shusen, Zhou, Guorui, Lin, Jingjian, Jiang, Peng

arXiv.org Artificial IntelligenceFeb-9-2023

Online platforms often incentivize consumers to improve user engagement and platform revenue. Since different consumers might respond differently to incentives, individual-level budget allocation is an essential task in marketing campaigns. Recent advances in this field often address the budget allocation problem using a two-stage paradigm: the first stage estimates the individual-level treatment effects using causal inference algorithms, and the second stage invokes integer programming techniques to find the optimal budget allocation solution. Since the objectives of these two stages might not be perfectly aligned, such a two-stage paradigm could hurt the overall marketing effectiveness. In this paper, we propose a novel end-to-end framework to directly optimize the business goal under budget constraints. Our core idea is to construct a regularizer to represent the marketing goal and optimize it efficiently using gradient estimation techniques. As such, the obtained models can learn to maximize the marketing goal directly and precisely. We extensively evaluate our proposed method in both offline and online experiments, and experimental results demonstrate that our method outperforms current state-of-the-art methods. Our proposed method is currently deployed to allocate marketing budgets for hundreds of millions of users on a short video platform and achieves significant business goal improvements. Our code will be publicly available.

artificial intelligence, gradient, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2302.04477

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Information Technology (0.68)
Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Data Science (0.93)

Add feedback

CAN: Revisiting Feature Co-Action for Click-Through Rate Prediction

Zhou, Guorui, Bian, Weijie, Wu, Kailun, Ren, Lejian, Pi, Qi, Zhang, Yujing, Xiao, Can, Sheng, Xiang-Rong, Mou, Na, Luo, Xinchen, Zhang, Chi, Qiao, Xianjie, Xiang, Shiming, Gai, Kun, Zhu, Xiaoqiang, Xu, Jian

arXiv.org Machine LearningNov-11-2020

Inspired by the success of deep learning, recent industrial Click-Through Rate (CTR) prediction models have made the transition from traditional shallow approaches to deep approaches. Deep Neural Networks (DNNs) are known for its ability to learn non-linear interactions from raw feature automatically, however, the non-linear feature interaction is learned in an implicit manner. The non-linear interaction may be hard to capture and explicitly model the \textit{co-action} of raw feature is beneficial for CTR prediction. \textit{Co-action} refers to the collective effects of features toward final prediction. In this paper, we argue that current CTR models do not fully explore the potential of feature co-action. We conduct experiments and show that the effect of feature co-action is underestimated seriously. Motivated by our observation, we propose feature Co-Action Network (CAN) to explore the potential of feature co-action. The proposed model can efficiently and effectively capture the feature co-action, which improves the model performance while reduce the storage and computation consumption. Experiment results on public and industrial datasets show that CAN outperforms state-of-the-art CTR models by a large margin. Up to now, CAN has been deployed in the Alibaba display advertisement system, obtaining averaging 12\% improvement on CTR and 8\% on RPM.

coaction, deep learning, neural network, (20 more...)

arXiv.org Machine Learning

2011.05625

Country:

Europe (0.69)
North America > Canada (0.28)
North America > United States > Hawaii (0.14)

Genre: Research Report (0.82)

Industry: Marketing (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback