 ad impression


Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards

Neural Information Processing Systems

Incrementality, which measures the causal effect of showing an ad to a potential customer (e.g., a user on an internet platform) versus not showing it, is a central object for advertisers in online advertising platforms. This paper investigates how an advertiser can learn to optimize the bidding sequence in an online manner without knowing the incrementality parameters in advance. We formulate the offline version of this problem as a specially structured episodic Markov Decision Process (MDP) and then, for its online learning counterpart, propose a novel reinforcement learning (RL) algorithm with regret at most $\widetilde{O}(H^2\sqrt{T})$, which depends on the number of rounds $H$ and the number of episodes $T$, but does not depend on the number of actions (i.e., possible bids). A fundamental difference between our learning problem and standard RL problems is that the realized reward feedback from conversion incrementality is mixed and delayed. To handle this difficulty, we propose and analyze a novel pairwise moment-matching algorithm to learn the conversion incrementality, which we believe is of independent interest.
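To make the phrase "mixed and delayed" more concrete, here is a toy sketch. It is not the paper's model and not its moment-matching algorithm. It assumes, purely for illustration, that the ad shown in each round produces some incremental conversions and that these arrive spread over later rounds according to a hypothetical `delay_profile`. The names `observed_conversions`, `round_effects`, and `delay_profile` are made up for this example.

```python
def observed_conversions(round_effects, delay_profile):
    """Aggregate per-round incremental conversions into per-round observations.

    round_effects[s]   : incremental conversions caused by the ad shown in round s
                         (hypothetical quantity, unknown to the advertiser).
    delay_profile[lag] : assumed share of an effect that is realized `lag` rounds later.
    """
    horizon = len(round_effects) + len(delay_profile) - 1
    observed = [0.0] * horizon
    for shown_round, effect in enumerate(round_effects):
        for lag, share in enumerate(delay_profile):
            # Contributions from different rounds land in the same bucket,
            # so the advertiser only sees their sum.
            observed[shown_round + lag] += effect * share
    return observed


effects = [4.0, 0.0, 2.0]   # ads shown in rounds 0 and 2 only
profile = [0.5, 0.5]        # half of the effect now, half one round later
print(observed_conversions(effects, profile))
```

In this toy setting the conversions seen in round 1 are caused entirely by the round-0 ad even though no ad was shown in round 1, which is the kind of attribution difficulty the abstract refers to.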


Learning to Bid in Non-Stationary Repeated First-Price Auctions

arXiv.org Machine Learning

First-price auctions have recently gained significant traction in digital advertising markets, exemplified by Google's transition from second-price to first-price auctions. Unlike in second-price auctions, where bidding one's private valuation is a dominant strategy, determining an optimal bidding strategy in first-price auctions is more complex. From a learning perspective, the learner (a specific bidder) can interact with the environment (other bidders) sequentially to infer their behaviors. Existing research often assumes specific environmental conditions and benchmarks performance against the best fixed policy (static benchmark). While this approach ensures strong learning guarantees, the static benchmark can deviate significantly from the optimal strategy in environments with even mild non-stationarity. To address such scenarios, a dynamic benchmark, which represents the sum of the best possible rewards at each time step, offers a more suitable objective. However, achieving no-regret learning with respect to the dynamic benchmark requires additional constraints. By inspecting reward functions in online first-price auctions, we introduce two metrics to quantify the regularity of the bidding sequence, which serve as measures of non-stationarity. We provide a minimax-optimal characterization of the dynamic regret when either of these metrics is sub-linear in the time horizon.
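The gap between the static and dynamic benchmarks can be illustrated with a small sketch. The setup is an assumption made for this example and is not taken from the paper: a bidder with a fixed value faces a known sequence of highest competing bids, wins ties, and chooses bids from a finite grid. Amounts are in integer cents to avoid floating-point noise.

```python
def reward(value, bid, competing_bid):
    """First-price payoff: pay your own bid if you win (ties assumed to go to the learner)."""
    return value - bid if bid >= competing_bid else 0


def static_benchmark(value, competing_bids, grid):
    """Total reward of the single best fixed bid over all rounds."""
    return max(sum(reward(value, b, m) for m in competing_bids) for b in grid)


def dynamic_benchmark(value, competing_bids, grid):
    """Sum over rounds of the best achievable reward in each round."""
    return sum(max(reward(value, b, m) for b in grid) for m in competing_bids)


value = 100                          # bidder's value, in cents
competing = [20] * 6 + [60] * 4      # the environment shifts after round 6
grid = range(0, 101, 10)

print(static_benchmark(value, competing, grid))   # 480
print(dynamic_benchmark(value, competing, grid))  # 640
```

Here the best fixed bid (20) earns 480, while tracking the shift earns 640, so a learner that is optimal against the static benchmark can still fall well short of the dynamic one after a single change in the environment.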


Improving Real-Time Bidding in Online Advertising Using Markov Decision Processes and Machine Learning Techniques

arXiv.org Artificial Intelligence

Real-time bidding has emerged as an effective online advertising technique. With real-time bidding, advertisers can position ads per impression, enabling them to optimise ad campaigns by targeting specific audiences in real-time. This paper proposes a novel method for real-time bidding that combines deep learning and reinforcement learning techniques to enhance the efficiency and precision of the bidding process. In particular, the proposed method employs a deep neural network to predict auction details and market prices and a reinforcement learning algorithm to determine the optimal bid price. The model is trained using historical data from the iPinYou dataset and compared to cutting-edge real-time bidding algorithms. The outcomes demonstrate that the proposed method is preferable regarding cost-effectiveness and precision. In addition, the study investigates the influence of various model parameters on the performance of the proposed algorithm. It offers insights into the efficacy of the combined deep learning and reinforcement learning approach for real-time bidding. This study contributes to advancing techniques and offers a promising direction for future research.


How Does VeraViews Use Artificial Intelligence & Machine Learning?

#artificialintelligence

Ad fraud, which includes bot traffic, click fraud, and many other fraudulent activities, is a pervasive problem that has plagued the advertising industry for years. Online advertising fraud resulted in the loss of more than $80 billion in ad spend in 2022 alone, and it is projected to grow to over $100 billion in 2023. However, recent advancements in artificial intelligence (AI) and machine learning (ML) have spurred the development of essential tools for identifying and preventing advertising fraud in all its forms. These technologies enable advertisers to detect and prevent fraudulent activities in real time, ultimately saving them significant amounts of money and safeguarding their reputations in the process. By leveraging artificial intelligence and machine learning, businesses can take control of their ad spend, ensuring that it is spent effectively and that their ads reach real human audiences, not just bots.


Real-time Bidding Strategy in Display Advertising: An Empirical Analysis

arXiv.org Artificial Intelligence

As one of the most striking advances in online advertising, real-time bidding (RTB) [3] has received increasing attention because it improves the efficiency and transparency of the ad ecosystem [4]. In RTB, publishers sell ad impressions through public auctions, and advertisers bid on their targeted ad impressions in real time and pay for the impressions they win. The bidding agent must therefore make accurate user feedback predictions for each ad impression and determine a reasonable bidding price to maximize the long-term revenue [5] of the ad campaign. Figure 1 illustrates the entire process of an advertiser participating in bidding for an ad impression. Initially, the advertiser registers an ad campaign on the Demand Side Platform (DSP) and specifies the campaign's budget as well as targeting rules for each ad delivery period (usually a day). Bidding agents running on the DSP participate in RTB on behalf of advertisers.


Rise of the racist robots prompts artificial intelligence upgrades

#artificialintelligence

Businesses seeking to cut costs and satisfy consumer demands for lightning-fast service are increasingly replacing people with machines in every part of the supply chain. But regulators are becoming concerned that relying on artificial intelligence could undo decades of work put into making sure consumers are treated fairly regardless of their race or gender. Financial institutions have become the latest target in a wider crackdown on discrimination in artificial intelligence after the Bank of England launched a campaign to make sure new technologies don't restrict lending to ethnic minorities. The Bank's deputy governor David Ramsden held an industry roundtable with Mastercard, Visa, Capital One, Starling Bank, Experian, and others in October 2021 to discuss the ethics of artificial intelligence. Ramsden warned attendees about the elimination of "human judgement and oversight" in decisions.


Bid Optimization using Maximum Entropy Reinforcement Learning

arXiv.org Artificial Intelligence

Real-time bidding (RTB) has become a critical channel for online advertising. In RTB, an advertiser can bid for ad impressions to display its advertisements. The advertiser determines the bidding price of every impression according to its bidding strategy, so a good bidding strategy can help advertisers improve cost efficiency. This paper focuses on optimizing a single advertiser's bidding strategy using reinforcement learning (RL) in RTB. Unfortunately, it is challenging to optimize the bidding strategy through RL at the granularity of individual impressions because the RTB environment is highly dynamic. To avoid optimizing every impression's bidding price directly, we first use a widely accepted linear bidding function to compute each impression's base price and then optimize it with a mutable adjustment factor derived from the RTB auction environment. Specifically, we use the maximum entropy RL algorithm (Soft Actor-Critic) to optimize the adjustment-factor generation policy at the impression level. Finally, an empirical study on a public dataset demonstrates that the proposed bidding strategy outperforms the baselines.
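The idea of a linear base price combined with an adjustment factor can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the base price is taken to be proportional to the predicted click-through rate relative to its average, the adjustment is applied multiplicatively as `1 + adjustment`, and the result is clipped to a maximum bid. The function names, the multiplicative form, and `max_bid` are all hypothetical; in the paper the adjustment factor would come from the learned Soft Actor-Critic policy, which is not modeled here.

```python
def linear_base_price(pctr, avg_pctr, base_bid):
    """Linear bidding function: scale a base bid by the relative predicted CTR."""
    return base_bid * pctr / avg_pctr


def adjusted_bid(pctr, avg_pctr, base_bid, adjustment, max_bid):
    """Apply an adjustment factor (assumed multiplicative) and clip to [0, max_bid]."""
    price = linear_base_price(pctr, avg_pctr, base_bid) * (1.0 + adjustment)
    return min(max(price, 0.0), max_bid)


# An impression with twice the average predicted CTR, bid up by 20%.
bid = adjusted_bid(pctr=0.002, avg_pctr=0.001, base_bid=50.0,
                   adjustment=0.2, max_bid=300.0)
print(round(bid, 2))
```

With this split, the learned policy only has to output one scalar per decision, while the per-impression variation is carried by the linear function.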


How reinforcement learning chooses the ads you see

#artificialintelligence

Every day, digital advertisement agencies serve billions of ads on news websites, search engines, social media networks, video streaming websites, and other platforms. And they all want to answer the same question: Which of the many ads they have in their catalog is more likely to appeal to a certain viewer? Finding the right answer to this question can have a huge impact on revenue when you are dealing with hundreds of websites, thousands of ads, and millions of visitors. Fortunately (for the ad agencies, at least), reinforcement learning (RL), the branch of artificial intelligence that has become renowned for mastering board and video games, provides a solution. Reinforcement learning models seek to maximize rewards.


Unbiased Lift-based Bidding System

arXiv.org Machine Learning

Conventional bidding strategies for online display ad auctions rely heavily on observed performance indicators such as clicks or conversions. A bidding strategy that naively pursues these easily observable metrics, however, fails to optimize the profitability of the advertisers. Rather, the bidding strategy that leads to the maximum revenue is one that pursues the performance lift of showing ads to a specific user. Therefore, it is essential to predict, from observed log data, the lift effect of showing ads to each user on their target variables. However, predicting the lift effect is difficult because the training data gathered by a past bidding strategy may be strongly biased towards the winning impressions. In this study, we develop the Unbiased Lift-based Bidding System, which maximizes the advertisers' profit by accurately predicting the lift effect from biased log data. Our system is the first to enable a high-performing lift-based bidding strategy by theoretically alleviating the inherent bias in the log. Real-world, large-scale A/B testing successfully demonstrates the superiority and practicability of the proposed system.
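The contrast between a conventional bid and a lift-based bid can be illustrated with a short sketch. It assumes, for illustration only, that the two conversion probabilities (with and without the ad) are already available and that the advertiser bids its expected incremental value. The function names and the numbers are hypothetical, and the sketch does not address the bias-correction problem that the paper focuses on.

```python
def conventional_bid(value_per_conversion, p_conv_if_shown):
    """Bid on the expected value of a conversion after showing the ad."""
    return value_per_conversion * p_conv_if_shown


def lift_based_bid(value_per_conversion, p_conv_if_shown, p_conv_if_not_shown):
    """Bid only on the conversions the ad is expected to cause (the lift)."""
    lift = p_conv_if_shown - p_conv_if_not_shown
    return value_per_conversion * max(lift, 0.0)


# Two hypothetical users, with a conversion worth 100 units.
loyal = {"p_conv_if_shown": 0.10, "p_conv_if_not_shown": 0.09}        # converts anyway
persuadable = {"p_conv_if_shown": 0.05, "p_conv_if_not_shown": 0.01}  # the ad matters

for name, user in (("loyal", loyal), ("persuadable", persuadable)):
    print(name,
          round(conventional_bid(100.0, user["p_conv_if_shown"]), 2),
          round(lift_based_bid(100.0, **user), 2))
```

The conventional rule bids more for the loyal user, who would mostly convert without the ad, while the lift-based rule bids more for the persuadable user.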