opportunity cost
Learning When to Quit in Sales Conversations
Manzoor, Emaad, Ascarza, Eva, Netzer, Oded
Salespeople frequently face the dynamic screening decision of whether to persist in a conversation or abandon it to pursue the next lead. Yet, little is known about how these decisions are made, whether they are efficient, or how to improve them. We study these decisions in the context of high-volume outbound sales where leads are ample, but time is scarce and failure is common. We formalize the dynamic screening decision as an optimal stopping problem and develop a generative language model-based sequential decision agent - a stopping agent - that learns whether and when to quit conversations by imitating a retrospectively-inferred optimal stopping policy. Our approach handles high-dimensional textual states, scales to large language models, and works with both open-source and proprietary language models. When applied to calls from a large European telecommunications firm, our stopping agent reduces the time spent on failed calls by 54% while preserving nearly all sales; reallocating the time saved increases expected sales by up to 37%. Upon examining the linguistic cues that drive salespeople's quitting decisions, we find that they tend to overweight a few salient expressions of consumer disinterest and mispredict call failure risk, suggesting cognitive bounds on their ability to make real-time conversational decisions. Our findings highlight the potential of artificial intelligence algorithms to correct cognitively-bounded human decisions and improve salesforce efficiency.
From Individual Learning to Market Equilibrium: Correcting Structural and Parametric Biases in RL Simulations of Economic Models
The application of Reinforcement Learning (RL) to economic modeling reveals a fundamental conflict between the assumptions of equilibrium theory and the emergent behavior of learning agents. While canonical economic models assume atomistic agents act as `takers' of aggregate market conditions, a naive single-agent RL simulation incentivizes the agent to become a `manipulator' of its environment. This paper first demonstrates this discrepancy within a search-and-matching model with concave production, showing that a standard RL agent learns a non-equilibrium, monopsonistic policy. Additionally, we identify a parametric bias arising from the mismatch between economic discounting and RL's treatment of intertemporal costs. To address both issues, we propose a calibrated Mean-Field Reinforcement Learning framework that embeds a representative agent in a fixed macroeconomic field and adjusts the cost function to reflect economic opportunity costs. Our iterative algorithm converges to a self-consistent fixed point where the agent's policy aligns with the competitive equilibrium. This approach provides a tractable and theoretically sound methodology for modeling learning agents in economic systems within the broader domain of computational social science.
Incentive-based Platoon Formation: Optimizing the Personal Benefit for Drivers
Heinovski, Julian, Ergenç, Doğanalp, Thommes, Kirsten, Dressler, Falko
Platooning or cooperative adaptive cruise control (CACC) has been investigated for decades, but debate about its lasting impact is still ongoing. Even though platooning benefits and platoon formation are rather well understood for trucks, this is less clear for passenger cars, which have a higher heterogeneity in trips and drivers' preferences. Most importantly, it remains unclear how to form platoons of passenger cars in order to optimize the personal benefit for the individual driver. To this end, in this paper, we propose a novel platoon formation algorithm that optimizes the personal benefit for drivers of individual passenger cars. For computing vehicle-to-platoon assignments, the algorithm utilizes a new metric that we propose to evaluate the personal benefits of various driving systems, including platooning. By combining fuel and travel time costs into a single monetary value, drivers can estimate overall trip costs according to a personal monetary value for time spent. This provides an intuitive way for drivers to understand and compare the benefits of driving systems like human driving, adaptive cruise control (ACC), and, of course, platooning. Unlike previous similarity-based methods, our proposed algorithm forms platoons only when beneficial for the driver, rather than for the sake of platooning only. Results of a large-scale simulation study demonstrate that our proposed algorithm outperforms normal ACC as well as previous similarity-based platooning approaches by balancing fuel savings and travel time, independent of traffic and drivers' time cost.
Dynamic Revenue Sharing
Santiago Balseiro, Max Lin, Vahab Mirrokni, Renato Leme, IIIS Song Zuo
Many online platforms act as intermediaries between a seller and a set of buyers. Examples of such settings include online retailers (such as Ebay) selling items on behalf of sellers to buyers, or advertising exchanges (such as AdX) selling pageviews on behalf of publishers to advertisers. In such settings, revenue sharing is a central part of running such a marketplace for the intermediary, and fixedpercentage revenue sharing schemes are often used to split the revenue among the platform and the sellers. In particular, such revenue sharing schemes require the platform to (i) take at most a constant fraction α of the revenue from auctions and (ii) pay the seller at least the seller declared opportunity cost c for each item sold. A straightforward way to satisfy the constraints is to set a reserve price at c/(1 α) for each item, but it is not the optimal solution on maximizing the profit of the intermediary.
MCTS Based Dispatch of Autonomous Vehicles under Operational Constraints for Continuous Transportation
Tomy, Milan, Seiler, Konstantin M., Hill, Andrew J.
Continuous transportation of material in the mining industry is achieved by the dispatch of autonomous haul-trucks with discrete haulage capacities. Recently, Monte Carlo Tree Search (MCTS) was successfully deployed in tackling challenges of long-run optimality, scalability and adaptability in haul-truck dispatch. Typically, operational constraints imposed on the mine site are satisfied by heuristic controllers or human operators independent of the dispatch planning. This article incorporates operational constraint satisfaction into the dispatch planning by utilising the MCTS based dispatch planner Flow-Achieving Scheduling Tree (FAST). Operational constraint violation and satisfaction are modelled as opportunity costs in the combinatorial optimisation problem of dispatch. Explicit cost formulations are avoided by utilising MCTS generator models to derive opportunity costs. Experimental studies with four types of operational constraints demonstrate the success of utilising opportunity costs for constraint satisfaction, and the effectiveness of integrating constraints into dispatch planning.
Constructing Data Transaction Chains Based on Opportunity Cost Exploration
Liu, Jie, Feng, Tao, Jiang, Yan, Wang, Peizheng, Wu, Chao
Data trading is increasingly gaining attention. However, the inherent replicability and privacy concerns of data make it challenging to directly apply traditional trading theories to data markets. This paper compares data trading markets with traditional ones, focusing particularly on how the replicability and privacy of data impact data markets. We discuss how data's replicability fundamentally alters the concept of opportunity cost in traditional microeconomics within the context of data markets. Additionally, we explore how to leverage this change to maximize benefits without compromising data privacy. This paper outlines the constraints for data circulation within the privacy domain chain and presents a model that maximizes data's value under these constraints. Specific application scenarios are provided, and experiments demonstrate the solvability of this model.
The New Era of Dynamic Pricing: Synergizing Supervised Learning and Quadratic Programming
Bramao, Gustavo, Tarygin, Ilia
Pricing strategy is a cornerstone for businesses across various sectors, profoundly influencing their success and market position. This strategy intricately balances multiple factors, including supply and demand dynamics, competitor pricing, brand positioning, perceived value, and overarching business strategies. Despite its critical importance, many companies still rely on traditional, manual approaches to pricing. These methods often depend on the intuition and experience of domain experts, supplemented to some extent by data-driven insights. However, a paradigm shift is emerging in this domain, led by more innovative companies. For instance, companies like Lyft have revolutionized their approach to pricing. By leveraging advanced reinforcement learning techniques, they have managed to automate their pricing policies effectively (Qin et al., 2022).
Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches
Liu, Yu, Wan, Runzhe, McQueen, James, Hains, Doug, Gu, Jinxiang, Song, Rui
The selection of the assumed effect size (AES) critically determines the duration of an experiment, and hence its accuracy and efficiency. Traditionally, experimenters determine AES based on domain knowledge. However, this method becomes impractical for online experimentation services managing numerous experiments, and a more automated approach is hence of great demand. We initiate the study of data-driven AES selection in for online experimentation services by introducing two solutions. The first employs a three-layer Gaussian Mixture Model considering the heteroskedasticity across experiments, and it seeks to estimate the true expected effect size among positive experiments. The second method, grounded in utility theory, aims to determine the optimal effect size by striking a balance between the experiment's cost and the precision of decision-making. Through comparisons with baseline methods using both simulated and real data, we showcase the superior performance of the proposed approaches.
Strategic Coalition for Data Pricing in IoT Data Markets
Pandey, Shashi Raj, Pinson, Pierre, Popovski, Petar
This paper considers a market for trading Internet of Things (IoT) data that is used to train machine learning models. The data, either raw or processed, is supplied to the market platform through a network and the price of such data is controlled based on the value it brings to the machine learning model. We explore the correlation property of data in a game-theoretical setting to eventually derive a simplified distributed solution for a data trading mechanism that emphasizes the mutual benefit of devices and the market. The key proposal is an efficient algorithm for markets that jointly addresses the challenges of availability and heterogeneity in participation, as well as the transfer of trust and the economic value of data exchange in IoT networks. The proposed approach establishes the data market by reinforcing collaboration opportunities between device with correlated data to avoid information leakage. Therein, we develop a network-wide optimization problem that maximizes the social value of coalition among the IoT devices of similar data types; at the same time, it minimizes the cost due to network externalities, i.e., the impact of information leakage due to data correlation, as well as the opportunity costs. Finally, we reveal the structure of the formulated problem as a distributed coalition game and solve it following the simplified split-and-merge algorithm. Simulation results show the efficacy of our proposed mechanism design toward a trusted IoT data market, with up to 32.72% gain in the average payoff for each seller.