cltv
Enhancing Offline Reinforcement Learning with Curriculum Learning-Based Trajectory Valuation
Abolfazli, Amir, Song, Zekun, Anand, Avishek, Nejdl, Wolfgang
The success of deep reinforcement learning (DRL) relies on the availability and quality of training data, often requiring extensive interactions with specific environments. In many real-world scenarios, where data collection is costly and risky, offline reinforcement learning (RL) offers a solution by utilizing data collected by domain experts and searching for a batch-constrained optimal policy. This approach is further augmented by incorporating external data sources, expanding the range and diversity of data collection possibilities. However, existing offline RL methods often struggle with challenges posed by non-matching data from these external sources. In this work, we specifically address the problem of source-target domain mismatch in scenarios involving mixed datasets, characterized by a predominance of source data generated from random or suboptimal policies and a limited amount of target data generated from higher-quality policies. To tackle this problem, we introduce Transition Scoring (TS), a novel method that assigns scores to transitions based on their similarity to the target domain, and propose Curriculum Learning-Based Trajectory Valuation (CLTV), which effectively leverages these transition scores to identify and prioritize high-quality trajectories through a curriculum learning approach. Our extensive experiments across various offline RL methods and MuJoCo environments, complemented by rigorous theoretical analysis, demonstrate that CLTV enhances the overall performance and transferability of policies learned by offline RL algorithms.
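The idea of valuing trajectories by their transition scores and presenting them in a curriculum can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how scores are aggregated or how the curriculum is scheduled, so the mean aggregation, the best-first ordering, the equal-sized stages, and the names `trajectory_value` and `curriculum_stages` are all assumptions made for illustration.

```python
from statistics import mean


def trajectory_value(transition_scores):
    """Aggregate per-transition scores into one value (assumption: the mean)."""
    return mean(transition_scores)


def curriculum_stages(trajectories, num_stages):
    """Rank trajectories by value, best first, and split them into stages.

    `trajectories` maps a trajectory name to its list of transition scores.
    Returns a list of stages, each a list of trajectory names.
    """
    if not trajectories:
        return []
    ranked = sorted(
        trajectories,
        key=lambda name: trajectory_value(trajectories[name]),
        reverse=True,
    )
    stage_size = -(-len(ranked) // num_stages)  # ceiling division
    return [ranked[i:i + stage_size] for i in range(0, len(ranked), stage_size)]


# Hypothetical transition scores (higher = closer to the target domain).
toy_scores = {
    "expert_run": [0.9, 0.8, 0.95],
    "medium_run": [0.5, 0.6, 0.4],
    "random_run_a": [0.1, 0.2, 0.05],
    "random_run_b": [0.2, 0.1, 0.3],
}
stages = curriculum_stages(toy_scores, num_stages=2)
```

In this toy setup the first stage holds the two highest-valued trajectories, so a learner would see target-like data before the mostly random source data.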
OptDist: Learning Optimal Distribution for Customer Lifetime Value Prediction
Weng, Yunpeng, Tang, Xing, Xu, Zhenhao, Lyu, Fuyuan, Liu, Dugang, Sun, Zexu, He, Xiuqiang
Customer Lifetime Value (CLTV) prediction is a critical task in business applications. Accurately predicting CLTV is challenging in real-world business scenarios, as the distribution of CLTV is complex and mutable. First, a large number of users have no consumption at all, producing a long-tailed distribution that is difficult to fit. Second, the small set of high-value users spend orders of magnitude more than a typical user, leading to a wide CLTV range that is hard to capture with a single distribution. Existing approaches for CLTV estimation either assume a prior probability distribution and fit a single group of distribution-related parameters for all samples, or directly learn from the posterior distribution with manually predefined buckets in a heuristic manner. However, all these methods fail to handle complex and mutable distributions. In this paper, we propose a novel optimal distribution selection model, OptDist, for CLTV prediction, which utilizes an adaptive optimal sub-distribution selection mechanism to improve the accuracy of complex distribution modeling. Specifically, OptDist trains several candidate sub-distribution networks in the distribution learning module (DLM) for modeling the probability distribution of CLTV. Then, a distribution selection module (DSM) is proposed to select the sub-distribution for each sample, thus making the selection automatic and adaptive. In addition, we design an alignment mechanism that connects both modules and effectively guides the optimization. We conduct extensive experiments on two public datasets and one private dataset to verify that OptDist outperforms state-of-the-art baselines. Furthermore, OptDist has been deployed on a large-scale financial platform for customer acquisition marketing campaigns, and the online experiments also demonstrate the effectiveness of OptDist.
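Per-sample selection among several candidate sub-distributions can be sketched as follows. This is an illustration only, not OptDist itself: the abstract does not name the distribution family or describe how the DSM is trained, so the zero-inflated lognormal form, the fixed candidate parameters, the lowest-negative-log-likelihood rule, and the names `ziln_nll` and `select_candidate` are assumptions. In the paper the candidates are produced by networks and the selection is learned.

```python
import math


def ziln_nll(value, p_positive, mu, sigma):
    """Negative log-likelihood of `value` under a zero-inflated lognormal.

    p_positive: probability that the customer spends anything at all.
    mu, sigma:  mean and standard deviation of log(spend) given spend > 0.
    """
    if value == 0:
        return -math.log(1.0 - p_positive)
    log_value = math.log(value)
    log_pdf = (
        -log_value
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
        - (log_value - mu) ** 2 / (2.0 * sigma ** 2)
    )
    return -(math.log(p_positive) + log_pdf)


def select_candidate(value, candidates):
    """Return the index of the candidate with the lowest NLL for this sample."""
    losses = [ziln_nll(value, *params) for params in candidates]
    return min(range(len(losses)), key=losses.__getitem__)


# Hypothetical candidates as (p_positive, mu, sigma) tuples.
candidates = [
    (0.05, 1.0, 0.5),  # mostly non-spenders
    (0.60, 3.0, 0.7),  # typical spenders
    (0.90, 7.0, 1.0),  # high-value spenders
]
choices = [select_candidate(v, candidates) for v in (0.0, 20.0, 2000.0)]
```

A non-spender, a typical spender, and a high-value spender each land on a different candidate, which is the behaviour a single shared distribution cannot provide.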
ALICE: Combining Feature Selection and Inter-Rater Agreeability for Machine Learning Insights
Anasashvili, Bachana, Jeleskovic, Vahidin
The use of Machine Learning models for decision-making has become the new norm not only in tech but in any business field imaginable, covering tasks such as search engine recommendations, customer churn prediction, credit risk scoring, energy load forecasting, and the deployment of personalized AI assistants. This comes at a time when developing ML models has become increasingly easy with the rise of open-source, free and user-friendly Python libraries such as Keras, scikit-learn and PyTorch, and as generative AI-based conversational chatbots such as ChatGPT, Gemini and Claude, which can provide coding assistance (if not ready-made code for modeling), are evolving rapidly. Such developments yet again raise the question of interpretability in machine learning, which has been formulated in various ways in the literature and has been offered multiple solutions, such as exploring causality (see Section 2.1), explainability (see Section 2.2) or abandoning black box ML models altogether. But to make a philosophical argument, it is hard to see the benefits of highly model-specific or domain-specific, post-hoc, or complex solutions for obtaining insights into the inner workings of machine learning models when the modeling task itself is growing ever more accessible to laypeople. Common thought on categorizing ML models in this regard would argue that parametric models descending from the fields of statistics and econometrics, such as Linear or Logistic Regression, are by nature more interpretable than their data-driven and non-parametric counterparts, such as tree-based models or neural networks.
Integrated Approach of RFM, Clustering, CLTV & Machine Learning Algorithms for Forecasting
CLTV is a customer relationship management (CRM) issue with an enterprise approach to understanding and influencing customer behavior through meaningful communication to improve customer acquisition, customer retention, customer loyalty, and customer profitability. The core idea is that a business wants to predict the average amount of money customers will spend on the business over the entire life of the relationship. Although statistical methods can be very powerful, they make several stringent assumptions about the types of data and their distribution, and typically can handle only a limited number of variables. Regression-based methods are usually based on a fixed-form equation and assume a single best solution, which means that only a few alternative solutions can be compared manually. Further, when the models are applied to real data, the key assumptions of the methods are often violated.
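As a point of reference for "the average amount customers will spend over the entire life of the relationship", a common textbook heuristic multiplies average order value, purchase frequency, and expected lifetime. The sketch below is that heuristic, not the integrated RFM, clustering, and machine learning approach this paper proposes; the function name `simple_cltv` and the input figures are hypothetical.

```python
def simple_cltv(order_values, active_years, expected_lifetime_years):
    """Heuristic CLTV = average order value * orders per year * expected lifetime.

    order_values:            amounts of the customer's past orders
    active_years:            length of the observation window in years
    expected_lifetime_years: assumed length of the whole relationship
    """
    if not order_values or active_years <= 0:
        raise ValueError("need at least one order and a positive observation window")
    average_order_value = sum(order_values) / len(order_values)
    orders_per_year = len(order_values) / active_years
    return average_order_value * orders_per_year * expected_lifetime_years


# Four orders averaging 50 over two years, with an assumed five-year relationship.
estimate = simple_cltv([40, 60, 50, 50], active_years=2, expected_lifetime_years=5)
```

The heuristic assumes that spending stays constant over time, which is exactly the kind of stringent assumption the abstract argues is often violated on real data.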
Customer Lifetime Value Prediction Using Embeddings
Chamberlain, Benjamin Paul, Cardoso, Angelo, Liu, C. H. Bryan, Pagliari, Roberto, Deisenroth, Marc Peter
We describe the Customer LifeTime Value (CLTV) prediction system deployed at ASOS.com, a global online fashion retailer. CLTV prediction is an important problem in e-commerce, where an accurate estimate of future value allows retailers to effectively allocate marketing spend, identify and nurture high-value customers and mitigate exposure to losses. The system at ASOS provides daily estimates of the future value of every customer and is one of the cornerstones of the personalised shopping experience. The state of the art in this domain uses large numbers of handcrafted features and ensemble regressors to forecast value, predict churn and evaluate customer loyalty. Recently, domains including language, vision and speech have shown dramatic advances by replacing handcrafted features with features that are learned automatically from data. We detail the system deployed at ASOS and show that learning feature representations is a promising extension to the state of the art in CLTV modelling. We propose a novel way to generate embeddings of customers, which addresses the issue of the ever-changing product catalogue, and obtain a significant improvement over an exhaustive set of handcrafted features.