Goto

Collaborating Authors

 coupon allocation



BCORLE( \lambda ): An Offline Reinforcement Learning and Evaluation Framework for Coupons Allocation in E-commerce Market

Neural Information Processing Systems

Coupons allocation is an important tool for enterprises to increase the activity and loyalty of users on the e-commerce market. One fundamental problem related is how to allocate coupons within a fixed budget while maximizing users' retention on the e-commerce platform. The online e-commerce environment is complicated and ever changing, so it requires the coupons allocation policy learning can quickly adapt to the changes of the company's business strategy. Unfortunately, existing studies with a huge computation overhead can hardly satisfy the requirements of real-time and fast-response in the real world. Specifically, the problem of coupons allocation within a fixed budget is usually formulated as a Lagrangian problem.


BCORLE(λ): An Offline Reinforcement Learning and Evaluation Framework for Coupons Allocation in E-commerce Market Y ang Zhang

Neural Information Processing Systems

The online e-commerce environment is complicated and ever changing, so it requires the coupons allocation policy learning can quickly adapt to the changes of the company's business strategy. Unfortunately, existing studies with a huge computation overhead can hardly satisfy the requirements of real-time and fast-response in the real world.


Order Acquisition Under Competitive Pressure: A Rapidly Adaptive Reinforcement Learning Approach for Ride-Hailing Subsidy Strategies

Shi, Fangzhou, Ke, Xiaopeng, Xiong, Xinye, Meng, Kexin, Men, Chang, Zhu, Zhengdan

arXiv.org Artificial Intelligence

The proliferation of ride-hailing aggregator platforms presents significant growth opportunities for ride-service providers by increasing order volume and gross merchandise value (GMV). On most ride-hailing aggregator platforms, service providers that offer lower fares are ranked higher in listings and, consequently, are more likely to be selected by passengers. This competitive ranking mechanism creates a strong incentive for service providers to adopt coupon strategies that lower prices to secure a greater number of orders, as order volume directly influences their long-term viability and sustainability. Thus, designing an effective coupon strategy that can dynamically adapt to market fluctuations while optimizing order acquisition under budget constraints is a critical research challenge. However, existing studies in this area remain scarce. To bridge this gap, we propose FCA-RL, a novel reinforcement learning-based subsidy strategy framework designed to rapidly adapt to competitors' pricing adjustments. Our approach integrates two key techniques: Fast Competition Adaptation (FCA), which enables swift responses to dynamic price changes, and Reinforced Lagrangian Adjustment (RLA), which ensures adherence to budget constraints while optimizing coupon decisions on new price landscape. Furthermore, we introduce RideGym, the first dedicated simulation environment tailored for ride-hailing aggregators, facilitating comprehensive evaluation and benchmarking of different pricing strategies without compromising real-world operational efficiency. Experimental results demonstrate that our proposed method consistently outperforms baseline approaches across diverse market conditions, highlighting its effectiveness in subsidy optimization for ride-hailing service providers.


BCORLE( \lambda ): An Offline Reinforcement Learning and Evaluation Framework for Coupons Allocation in E-commerce Market

Neural Information Processing Systems

Coupons allocation is an important tool for enterprises to increase the activity and loyalty of users on the e-commerce market. One fundamental problem related is how to allocate coupons within a fixed budget while maximizing users' retention on the e-commerce platform. The online e-commerce environment is complicated and ever changing, so it requires the coupons allocation policy learning can quickly adapt to the changes of the company's business strategy. Unfortunately, existing studies with a huge computation overhead can hardly satisfy the requirements of real-time and fast-response in the real world. Specifically, the problem of coupons allocation within a fixed budget is usually formulated as a Lagrangian problem.


Balancing Immediate Revenue and Future Off-Policy Evaluation in Coupon Allocation

Nishimura, Naoki, Kobayashi, Ken, Nakata, Kazuhide

arXiv.org Artificial Intelligence

Coupon allocation drives customer purchases and boosts revenue. However, it presents a fundamental trade-off between exploiting the current optimal policy to maximize immediate revenue and exploring alternative policies to collect data for future policy improvement via off-policy evaluation (OPE). While online A/B testing can validate new policies, it risks compromising short-term revenue. Conversely, relying solely on an exploitative policy hinders the ability to reliably estimate and enhance future policies. To balance this trade-off, we propose a novel approach that combines a model-based revenue maximization policy and a randomized exploration policy for data collection. Our framework enables flexibly adjusting the mixture ratio between these two policies to optimize the balance between short-term revenue and future policy improvement. We formulate the problem of determining the optimal mixture ratio between a model-based revenue maximization policy and a randomized exploration policy for data collection. We empirically verified the effectiveness of the proposed mixed policy using both synthetic and real-world data. Our main contributions are: (1) Demonstrating a mixed policy combining deterministic and probabilistic policies, flexibly adjusting the data collection vs. revenue trade-off. (2) Formulating the optimal mixture ratio problem as multi-objective optimization, enabling quantitative evaluation of this trade-off. By optimizing the mixture ratio, businesses can maximize revenue while ensuring reliable future OPE and policy improvement. This framework is applicable in any context where the exploration-exploitation trade-off is relevant.