Goto

Collaborating Authors

 Optimization


Safe Online Bid Optimization with Return-On-Investment and Budget Constraints subject to Uncertainty

arXiv.org Artificial Intelligence

In online marketing, the advertisers' goal is usually a tradeoff between achieving high volumes and high profitability. The companies' business units customarily address this tradeoff by maximizing the volumes while guaranteeing a lower bound to the Return On Investment (ROI). This paper investigates combinatorial bandit algorithms for the bid optimization of advertising campaigns subject to uncertain budget and ROI constraints. We study the nature of both the optimization and learning problems. In particular, when focusing on the optimization problem without uncertainty, we show that it is inapproximable within any factor unless P=NP, and we provide a pseudo-polynomial-time algorithm that achieves an optimal solution. When considering uncertainty, we prove that no online learning algorithm can violate the (ROI or budget) constraints during the learning process a sublinear number of times while guaranteeing a sublinear pseudo-regret. Thus, we provide an algorithm, namely GCB, guaranteeing sublinear regret at the cost of a potentially linear number of constraints violations. We also design its safe version, namely GCB_{safe}, guaranteeing w.h.p. a constant upper bound on the number of constraints violations at the cost of a linear pseudo-regret. More interestingly, we provide an algorithm, namely GCB_{safe}(\psi,\phi), guaranteeing both sublinear pseudo-regret and safety w.h.p. at the cost of accepting tolerances \psi and \phi in the satisfaction of the ROI and budget constraints, respectively. This algorithm actually mitigates the risks due to the constraints violations without precluding the convergence to the optimal solution. Finally, we experimentally compare our algorithms in terms of pseudo-regret/constraint-violation tradeoff in settings generated from real-world data, showing the importance of adopting safety constraints in practice and the effectiveness of our algorithms.


Lifelong Dynamic Optimization for Self-Adaptive Systems: Fact or Fiction?

arXiv.org Artificial Intelligence

When faced with changing environment, highly configurable software systems need to dynamically search for promising adaptation plan that keeps the best possible performance, e.g., higher throughput or smaller latency -- a typical planning problem for self-adaptive systems (SASs). However, given the rugged and complex search landscape with multiple local optima, such a SAS planning is challenging especially in dynamic environments. In this paper, we propose LiDOS, a lifelong dynamic optimization framework for SAS planning. What makes LiDOS unique is that to handle the "dynamic", we formulate the SAS planning as a multi-modal optimization problem, aiming to preserve the useful information for better dealing with the local optima issue under dynamic environment changes. This differs from existing planners in that the "dynamic" is not explicitly handled during the search process in planning. As such, the search and planning in LiDOS run continuously over the lifetime of SAS, terminating only when it is taken offline or the search space has been covered under an environment. Experimental results on three real-world SASs show that the concept of explicitly handling dynamic as part of the search in the SAS planning is effective, as LiDOS outperforms its stationary counterpart overall with up to 10x improvement. It also achieves better results in general over state-of-the-art planners and with 1.4x to 10x speedup on generating promising adaptation plans.


Frequent Itemset-driven Search for Finding Minimum Node Separators in Complex Networks

arXiv.org Artificial Intelligence

Finding an optimal set of critical nodes in a complex network has been a long-standing problem in the fields of both artificial intelligence and operations research. Potential applications include epidemic control, network security, carbon emission monitoring, emergence response, drug design, and vulnerability assessment. In this work, we consider the problem of finding a minimal node separator whose removal separates a graph into multiple different connected components with fewer than a limited number of vertices in each component. To solve it, we propose a frequent itemset-driven search approach, which integrates the concept of frequent itemset mining in data mining into the well-known memetic search framework. Starting from a high-quality population built by the solution construction and population repair procedures, it iteratively employs the frequent itemset recombination operator (to generate promising offspring solution based on itemsets that frequently occur in high-quality solutions), tabu search-based simulated annealing (to find high-quality local optima), population repair procedure (to modify the population), and rank-based population management strategy (to guarantee a healthy population). Extensive evaluations on 50 widely used benchmark instances show that it significantly outperforms state-of-the-art algorithms. In particular, it discovers 29 new upper bounds and matches 18 previous best-known bounds. Finally, experimental analyses are performed to confirm the effectiveness of key algorithmic modules of the proposed method.


Low Regret Binary Sampling Method for Efficient Global Optimization of Univariate Functions

arXiv.org Machine Learning

In this work, we propose a computationally efficient algorithm for the problem of global optimization in univariate loss functions. For the performance evaluation, we study the cumulative regret of the algorithm instead of the simple regret between our best query and the optimal value of the objective function. Although our approach has similar regret results with the traditional lower-bounding algorithms such as the Piyavskii-Shubert method for the Lipschitz continuous or Lipschitz smooth functions, it has a major computational cost advantage. In Piyavskii-Shubert method, for certain types of functions, the query points may be hard to determine (as they are solutions to additional optimization problems). However, this issue is circumvented in our binary sampling approach, where the sampling set is predetermined irrespective of the function characteristics. For a search space of $[0,1]$, our approach has at most $L\log (3T)$ and $2.25H$ regret for $L$-Lipschitz continuous and $H$-Lipschitz smooth functions respectively. We also analytically extend our results for a broader class of functions that covers more complex regularity conditions.


Learn Quasi-stationary Distributions of Finite State Markov Chain

arXiv.org Artificial Intelligence

We propose a reinforcement learning (RL) approach to compute the expression of quasi-stationary distribution. Based on the fixed-point formulation of quasi-stationary distribution, we minimize the KL-divergence of two Markovian path distributions induced by the candidate distribution and the true target distribution. To solve this challenging minimization problem by gradient descent, we apply the reinforcement learning technique by introducing the reward and value functions. We derive the corresponding policy gradient theorem and design an actor-critic algorithm to learn the optimal solution and the value function. The numerical examples of finite state Markov chain are tested to demonstrate the new method.


Learning to Approximate: Auto Direction Vector Set Generation for Hypervolume Contribution Approximation

arXiv.org Artificial Intelligence

Hypervolume contribution is an important concept in evolutionary multi-objective optimization (EMO). It involves in hypervolume-based EMO algorithms and hypervolume subset selection algorithms. Its main drawback is that it is computationally expensive in high-dimensional spaces, which limits its applicability to many-objective optimization. Recently, an R2 indicator variant (i.e., $R_2^{\text{HVC}}$ indicator) is proposed to approximate the hypervolume contribution. The $R_2^{\text{HVC}}$ indicator uses line segments along a number of direction vectors for hypervolume contribution approximation. It has been shown that different direction vector sets lead to different approximation quality. In this paper, we propose \textit{Learning to Approximate (LtA)}, a direction vector set generation method for the $R_2^{\text{HVC}}$ indicator. The direction vector set is automatically learned from training data. The learned direction vector set can then be used in the $R_2^{\text{HVC}}$ indicator to improve its approximation quality. The usefulness of the proposed LtA method is examined by comparing it with other commonly-used direction vector set generation methods for the $R_2^{\text{HVC}}$ indicator. Experimental results suggest the superiority of LtA over the other methods for generating high quality direction vector sets.


Search and Score-Based Waterfall Auction Optimization

arXiv.org Artificial Intelligence

Online advertising is a major source of income for many online companies. One common approach is to sell online advertisements via waterfall auctions, where a publisher makes sequential price offers to ad networks. The publisher controls the order and prices of the waterfall and by that aims to maximize his revenue. In this work, we propose a methodology to learn a waterfall strategy from historical data by wisely searching in the space of possible waterfalls and selecting the one leading to the highest revenue. The contribution of this work is twofold; First, we propose a novel method to estimate the valuation distribution of each user with respect to each ad network. Second, we utilize the valuation matrix to score our candidate waterfalls as part of a procedure that iteratively searches in local neighborhoods. Our framework guarantees that the waterfall revenue improves between iterations until converging to a local optimum. Real-world demonstrations are provided to show that the proposed method improves the total revenue of real-world waterfalls compared to manual expert optimization. Finally, the code and the data are available here.


Railway Operation Rescheduling System via Dynamic Simulation and Reinforcement Learning

arXiv.org Artificial Intelligence

The number of railway service disruptions has been increasing owing to intensification of natural disasters. In addition, abrupt changes in social situations such as the COVID-19 pandemic require railway companies to modify the traffic schedule frequently. Therefore, automatic support for optimal scheduling is anticipated. In this study, an automatic railway scheduling system is presented. The system leverages reinforcement learning and a dynamic simulator that can simulate the railway traffic and passenger flow of a whole line. The proposed system enables rapid generation of the traffic schedule of a whole line because the optimization process is conducted in advance as the training. The system is evaluated using an interruption scenario, and the results demonstrate that the system can generate optimized schedules of the whole line in a few minutes.


Exploit Customer Life-time Value with Memoryless Experiments

arXiv.org Artificial Intelligence

As a measure of the long-term contribution produced by customers in a service or product relationship, life-time value, or LTV, can more comprehensively find the optimal strategy for service delivery. However, it is challenging to accurately abstract the LTV scene, model it reasonably, and find the optimal solution. The current theories either cannot precisely express LTV because of the single modeling structure, or there is no efficient solution. We propose a general LTV modeling method, which solves the problem that customers' long-term contribution is difficult to quantify while existing methods, such as modeling the click-through rate, only pursue the short-term contribution. At the same time, we also propose a fast dynamic programming solution based on a mutated bisection method and the memoryless repeated experiments assumption. The model and method can be applied to different service scenarios, such as the recommendation system. Experiments on real-world datasets confirm the effectiveness of the proposed model and optimization method. In addition, this whole LTV structure was deployed at a large E-commerce mobile phone application, where it managed to select optimal push message sending time and achieved a 10\% LTV improvement.


Adaptive Energy Management for Self-Sustainable Wearables in Mobile Health

arXiv.org Artificial Intelligence

Wearable devices that integrate multiple sensors, processors, and communication technologies have the potential to transform mobile health for remote monitoring of health parameters. However, the small form factor of the wearable devices limits the battery size and operating lifetime. As a result, the devices require frequent recharging, which has limited their widespread adoption. Energy harvesting has emerged as an effective method towards sustainable operation of wearable devices. Unfortunately, energy harvesting alone is not sufficient to fulfill the energy requirements of wearable devices. This paper studies the novel problem of adaptive energy management towards the goal of self-sustainable wearables by using harvested energy to supplement the battery energy and to reduce manual recharging by users. To solve this problem, we propose a principled algorithm referred as AdaEM. There are two key ideas behind AdaEM. First, it uses machine learning (ML) methods to learn predictive models of user activity and energy usage patterns. These models allow us to estimate the potential of energy harvesting in a day as a function of the user activities. Second, it reasons about the uncertainty in predictions and estimations from the ML models to optimize the energy management decisions using a dynamic robust optimization (DyRO) formulation. We propose a light-weight solution for DyRO to meet the practical needs of deployment. We validate the AdaEM approach on a wearable device prototype consisting of solar and motion energy harvesting using real-world data of user activities. Experiments show that AdaEM achieves solutions that are within 5% of the optimal with less than 0.005% execution time and energy overhead.