reservation price
Pareto-Optimal Learning-Augmented Algorithms for Online Conversion Problems
This paper leverages machine-learned predictions to design competitive algorithms for online conversion problems with the goal of improving the competitive ratio when predictions are accurate (i.e., consistency), while also guaranteeing a worstcase competitive ratio regardless of the prediction quality (i.e., robustness). We unify the algorithmic design of both integral and fractional conversion problems, which are also known as the 1-max-search and one-way trading problems, into a class of online threshold-based algorithms (OTA). By incorporating predictions into design of OTA, we achieve the Pareto-optimal trade-off of consistency and robustness, i.e., no online algorithm can achieve a better consistency guarantee given for a robustness guarantee. We demonstrate the performance of OTA using numerical experiments on Bitcoin conversion.
A Fairness-Driven Method for Learning Human-Compatible Negotiation Strategies
Despite recent advancements in AI and NLP, negotiation remains a difficult domain for AI agents. Traditional game theoretic approaches that have worked well for two-player zero-sum games struggle in the context of negotiation due to their inability to learn human-compatible strategies. On the other hand, approaches that only use human data tend to be domain-specific and lack the theoretical guarantees provided by strategies grounded in game theory. Motivated by the notion of fairness as a criterion for optimality in general sum games, we propose a negotiation framework called FDHC which incorporates fairness into both the reward design and search to learn human-compatible negotiation strategies. Our method includes a novel, RL+search technique called LGM-Zero which leverages a pre-trained language model to retrieve human-compatible offers from large action spaces. Our results show that our method is able to achieve more egalitarian negotiation outcomes and improve negotiation quality.
Make me an Offer: Forward and Reverse Auctioning Problems in the Tourism Industry
Christou, Ioannis T., Doukas, Dimitris, Skouri, Konstantina, Meletiou, Gerasimos
Most tourist destinations are facing regular and consistent seasonality with significant economic and social impacts. This phenomenon is more pronounced in the post-covid era, where demand for travel has increased but unevenly among different geographic areas. To counter these problems that both customers and hoteliers are facing, we have developed two auctioning systems that allow hoteliers of lower popularity tier areas or during low season periods to auction their rooms in what we call a forward auction model, and also allows customers to initiate a bidding process whereby hoteliers in an area may make offers to the customer for their rooms, in what constitutes a reverse auction model initiated by the customer, similar to the bidding concept of priceline.com. We develop mathematical programming models that define explicitly both types of auctions, and show that in each type, there are significant benefits to be gained both on the side of the hotelier as well as on the side of the customer. We discuss algorithmic techniques for the approximate solution of these optimization problems, and present results using exact optimization solvers to solve them to guaranteed optimality. These techniques could be beneficial to both customer and hotelier reducing seasonality during middle and low season and providing the customer with attractive offers.
On learning agent-based models from data
Monti, Corrado, Pangallo, Marco, Morales, Gianmarco De Francisci, Bonchi, Francesco
Agent-Based Models (ABMs) are used in several fields to study the evolution of complex systems from micro-level assumptions. However, ABMs typically can not estimate agent-specific (or "micro") variables: this is a major limitation which prevents ABMs from harnessing micro-level data availability and which greatly limits their predictive power. In this paper, we propose a protocol to learn the latent micro-variables of an ABM from data. The first step of our protocol is to reduce an ABM to a probabilistic model, characterized by a computationally tractable likelihood. This reduction follows two general design principles: balance of stochasticity and data availability, and replacement of unobservable discrete choices with differentiable approximations. Then, our protocol proceeds by maximizing the likelihood of the latent variables via a gradient-based expectation maximization algorithm. We demonstrate our protocol by applying it to an ABM of the housing market, in which agents with different incomes bid higher prices to live in high-income neighborhoods. We demonstrate that the obtained model allows accurate estimates of the latent variables, while preserving the general behavior of the ABM. We also show that our estimates can be used for out-of-sample forecasting. Our protocol can be seen as an alternative to black-box data assimilation methods, that forces the modeler to lay bare the assumptions of the model, to think about the inferential process, and to spot potential identification problems.
Online Search With Best-Price and Query-Based Predictions
Angelopoulos, Spyros, Kamali, Shahin, Zhang, Dehou
In the online (time-series) search problem, a player is presented with a sequence of prices which are revealed in an online manner. In the standard definition of the problem, for each revealed price, the player must decide irrevocably whether to accept or reject it, without knowledge of future prices (other than an upper and a lower bound on their extreme values), and the objective is to minimize the competitive ratio, namely the worst-case ratio between the maximum price in the sequence and the one selected by the player. The problem formulates several applications of decision-making in the face of uncertainty on the revealed samples. Previous work on this problem has largely assumed extreme scenarios in which either the player has almost no information about the input, or the player is provided with some powerful, and error-free advice. In this work, we study learning-augmented algorithms, in which there is a potentially erroneous prediction concerning the input. Specifically, we consider two different settings: the setting in which the prediction is related to the maximum price in the sequence, as well as the setting in which the prediction is obtained as a response to a number of binary queries. For both settings, we provide tight, or near-tight upper and lower bounds on the worst-case performance of search algorithms as a function of the prediction error. We also provide experimental results on data obtained from stock exchange markets that confirm the theoretical analysis, and explain how our techniques can be applicable to other learning-augmented applications.
When is it permissible for artificial intelligence to lie? A trust-based approach
Kim, Tae Wan, Tong, null, Lu, null, Lee, Kyusong, Cheng, Zhaoqi, Tang, Yanhan, Hooker, John
Conversational Artificial Intelligence (AI) used in industry settings can be trained to closely mimic human behaviors, including lying and deception. However, lying is often a necessary part of negotiation. To address this, we develop a normative framework for when it is ethical or unethical for a conversational AI to lie to humans, based on whether there is what we call "invitation of trust" in a particular scenario. Importantly, cultural norms play an important role in determining whether there is invitation of trust across negotiation settings, and thus an AI trained in one culture may not be generalizable to others. Moreover, individuals may have different expectations regarding the invitation of trust and propensity to lie for human vs. AI negotiators, and these expectations may vary across cultures as well. Finally, we outline how a conversational chatbot can be trained to negotiate ethically by applying autoregressive models to large dialog and negotiations datasets.
Automated Strategies for Determining Rewards for Human Work
Azaria, Amos (Bar Ilan University) | Aumann, Yonatan (Bar Ilan University) | Kraus, Sarit (Bar Ilan University)
We consider the problem of designing automated strategies for interactions with human subjects, where the humans must be rewarded for performing certain tasks of interest. We focus on settings where there is a single task that must be performed many times by different humans (e.g. answering a questionnaire), and the humans require a fee for performing the task. In such settings, our objective is to minimize the average cost for effectuating the completion of the task. We present two automated strategies for designing efficient agents for the problem, based on two different models of human behavior. The first, the Reservation Price Based Agent (RPBA), is based on the concept of a reservation price, and the second, the No Bargaining Agent (NBA), uses principles from behavioral science. The performance of the agents has been tested in extensive experiments with real human subjects, where NBA outperforms both RPBA and strategies developed by human experts.