We develop a practical optimization model for the management of a demand-side platform (DSP), which is applicable in static planning situations where the DSP acquires valuable impressions for high-volume advertiser clients in a real-time bidding environment. We propose a highly flexible model for the DSP to maximize its profit while maintaining acceptable levels of budget spending for its advertisers. Our model achieves flexibility and improved performance primarily through two different aspects: (i) we replace standard budget constraints with a more general budget utilization proxy function over budget spending levels, and (ii) we can accommodate arbitrary auction types by directly modeling the interactions between the DSP and the auctions. Our proposed formulation leads to a non-convex optimization problem due to the joint optimization over both impression allocation and bid price decisions. Using Fenchel duality theory, we obtain a convex dual problem that can be efficiently solved with subgradient based algorithms and from which a primal solution may be recovered efficiently. Under a natural and intuitive “increasing marginal cost” condition, as well as under a more general condition, we show that there is zero duality gap between the dual problem and the original non-convex primal problem. Under the same conditions, we also demonstrate convergence of our algorithm to an optimal solution of the non-convex formulation as the dual problem is solved to near optimality. We conduct experiments on both synthetic data as well as data from a real DSP, and our results demonstrate how our algorithm allows the DSP to better trade off between profitability and budget spending as compared to a widely used “greedy” heuristic approach.
Due to the tremendous growth of the internet and various digital mediums, online advertising is now a substantially sized industry, with an estimated $595 billion in global spending in 2023 (Statista, 2023). The online advertising environment offers advertising sellers the unique capability to deliver ads that are directly targeted to specific customer segments. Within targeted online advertising, a demand-side platform (DSP) serves as a type of intermediary between buyers and sellers of advertising inventory. A DSP typically manages the marketing campaigns of hundreds or thousands of advertisers, each of which simultaneously runs multiple campaigns. A campaign can be thought as of a plan for delivering advertisements, which defines a budget, pacing details, a target audience, and the ad to be shown. For example, Nike might run a campaign at half a million dollars per month that targets males between 18 and 35 years old who reside in California and visit sports websites. Advertisers measure the success of their campaigns in terms of the success of ad delivery across different market segments as well as the rate at which one or more desired actions occur, such as a click, conversion, or product purchase. In this article, we consider both standard pricing models in online advertising: (i) CPM (cost per thousand impressions), where advertisers pay for all impressions acquired, and (ii) CPA (cost per action), where advertisers pay the DSP only when this desired action occurs. This article is primarily concerned with the problem of optimally managing a portfolio of high-volume campaigns, under the CPM/CPA pricing models, across multiple advertisers within a DSP.
While DSPs interact with and manage the campaigns of advertisers interested in buying advertising inventory, those publishers interested in selling typically utilize a supply-side platform (SSP) to manage the sales of their inventory. An ad exchange then serves as a connection between DSPs and SSPs responsible for allocating impressions supplied by SSPs to campaigns managed by DSPs. An impression constitutes a single instance in which an ad is served—for example, a particular user browsing the homepage of ESPN—and contains information about the user as well as the publisher or domain in which the advertisement is to be shown. Once an ad exchange receives an impression opportunity, it initiates a real-time auction and all interested parties submit a single bid for the opportunity. Then, depending on the particular auction mechanism, the impression opportunity is allocated to a winner who is charged some amount possibly depending on all of the bids. This process is truly “real-time” and typically occurs in 100 milliseconds or less on average. Moreover, a large DSP may participate in hundreds of billions of auctions for different impressions opportunities per day.
As mentioned previously, when advertisers run campaigns they specify a budget that they desire to spend over a certain time horizon. In reality, a campaign will never exactly spend its budget amount and may underspend or overspend. Most existing algorithms in online advertising optimization, either through the use of budget constraints or heuristic approaches, heavily penalize overspending and place essentially zero penalty on underspending. In reality, underspending may be as undesirable or even more undesirable than overspending, and in this paper we consider a highly flexible budget utilization proxy function framework that allows campaigns to have greater control over their budget spending preferences.
Historically ad exchanges have usually used second-price auctions, also called Vickrey auctions, in which the highest bidder wins the auction and pays the amount of the second-highest bid submitted. Second-price auctions are truth revealing (Klemperer, 1999), that is, is optimal for a bidder to bid their truthful valuation, although this property does not hold in the case of repeated auctions or when bidders have limited budgets (Balseiro et al., 2015; Gummadi et al., 2012). As of the last few years, many ad exchanges including DoubleClick (Google), AppNexus, Rubicon, OpenX, and Rubicon have increasingly used first-price auctions to sell impression opportunities. In any case, whenever an incoming impression opportunity arrives, a DSP is tasked with determining how much to bid in the corresponding auction, and traditional approaches either bid truthfully or otherwise make an assumption that the auction is second-price. In contrast, in this article, we directly model the interactions with ad exchanges and thus we can accommodate arbitrary auction types.
In this article, we develop an optimization model and corresponding efficient algorithm to guide a DSP in determining how to bid for impressions and subsequently how to allocate acquired impressions to its portfolio of campaigns. Henceforth, we refer to “the DSP” as a generic demand-side platform implementing our proposed algorithm. (Our problem setup can also be applied for the case of one advertiser who manages several different campaigns with their own budgets across different ad-exchanges.) Our model is a static planning problem that produces a bidding and allocation policy that does not change dynamically over time throughout the course of the algorithm (it may change only when the model is re-solved). It is important to contrast such a static planning policy with a dynamic one that possibly adjusts its decision variables “on the fly” as the algorithm is implemented. Furthermore, our model assumes knowledge of the probability distributions governing future impression arrivals as well as the auction environment. Thus, our model is most advantageous when: (i) the distributions of supply and demand of impressions are stationary, that is, they are well-estimated based on historical data and applicable when projecting into the future, and (ii) a static bidding and allocation policy is adequate and there is not much value in adaptively modifying it based on actual realized outcomes. These two points may be considered implicit assumptions of our model, and we note that they are especially applicable when the DSP has a solid statistical model (referred to as a “bid landscape” model) of the ad exchange environment in which it operates, as well as a large enough pool of “high-volume” campaigns such that the DSP needs to make many decisions for each campaign.
We propose a formulation for the DSP to maximize its profit while ensuring that its advertisers are satisfied with the levels of budget spending across their campaigns. Due to the joint optimization over both bid price and impression allocation decisions, our model is non-convex. Nevertheless, we propose a dual formulation which is convex and can be efficiently solved with any subgradient based algorithm. Notably, under intuitive sufficient conditions, we demonstrate a theoretical result that is somewhat surprising given the generality of our model. Namely, we show that there is no duality gap between the original non-convex formulation and the convex dual problem, which provides stronger justification for recovering a primal strategy based on solving the convex dual. This strong theoretical guarantee is also validated by the empirical performance of our solution approach in the examples we consider.
Our contributions may be summarized as follows:
We propose a fixed time horizon static optimization model for simultaneously determining both bid price and impression allocation decisions. The decision variables imply a static solution strategy for the DSP that is not modified over time as new outcomes are realized. This static solution strategy is easily implementable and interpretable in practice. We assume a CPM/CPA pricing model, and its objective function allows the DSP to trade-off between maximizing its profit and maintaining acceptable levels of budget spending for its advertisers. Our model allows the DSP to participate in multiple ad-exchanges in order to acquire impressions for multiple advertiser clients across multiple campaigns.
We assume a stationary probabilistic model, referred to as the bid landscape model, of the uncertain dynamics of how the DSP interacts with the ad exchanges. The bid landscape model is not adjusted on the fly within our algorithm as the DSP gathers new information. Instead, the bid landscape model can be estimated directly from historical data (e.g., using machine learning). As a consequence, our modeling approach and results apply to arbitrary auction mechanisms, including both first and second-price auctions.
We propose the use of generic concave budget utilization proxy function over budget spending levels, such that the objective function of our optimization model is a combination of the profit of the DSP and the total budget utilization proxy values of its advertisers. This budget utilization proxy function approach is highly flexible, includes standard budget constraints as a special case, and allows each advertiser to specify their preferences on how both underspending and overspending should be penalized.
Our optimization model is non-convex due to the joint optimization over both bidding and allocation decisions. Using Fenchel duality theory (see, e.g., Boyd and Vandenberghe, 2004), we obtain a convex dual problem that can be efficiently solved with subgradient based algorithms—methods whose convergence rates and properties have been heavily studied (see, e.g., Nesterov, 2013; Parikh et al., 2014 and the references therein). We also show how to efficiently obtain a primal solution and a corresponding real-time policy for the DSP directly from the dual solution. Our overall primal-dual algorithm is very general, and makes no assumptions about the specific functional form of the budget utilization proxy function over budget spending levels or the bid landscape models.
Under an intuitive “increasing marginal cost” condition, as well as under a more general condition concerning the uniqueness of bid prices, we demonstrate that there is zero duality gap in our formulation. We also extend this result, under the same conditions, to show convergence of our primal-dual algorithm to an optimal solution of the original primal problem. These results hold under the broad assumptions of our model, in particular they hold with generic budget utilization proxy functions and with generic auctions/bid landscape models. They also highlight the primal-dual structure of our static optimization formulation, which may be useful for future extensions to dynamic setups and beyond.
We experimentally test our methodology on synthetic as well as a dataset from Criteo (Diemert et al., 2017) (a real DSP). Our experimental results demonstrate that our methodology allows the DSP to effectively manage the trade off between maximizing its profit and satisfying the budget spending preferences of its advertisers. The synthetic data experiment employs both first and second-price auctions, while the dataset from Criteo assumes that second-price auctions are used. We show that our methodology dominates a greedy heuristic that is popular in practice (and is optimal in the case of infinite budgets).
Related Literature
Optimization for online advertising platforms is a rich area in terms of the problems studied and methodologies utilized. Herein, we focus on reviewing those works that are closest to our DSP problem setting, our static optimization modeling approach, and the solution techniques that we utilize. Thus, this section is organized as follows. We first carefully review the literature on optimal bidding, including both single- and multiple-advertiser cases, as well as under different auction types. We then consider prior works on two-step bidding, or “pacing schemes,” which involve the use of duality to adjust bidding strategies. In reviewing these works, we distinguish between dynamic models that possibly update their decision strategies on the fly as new information is gathered and the outcomes of decisions are observed, and static models that do not. As we study a static optimization model for the DSP, in our expanded literature review (Appendix EC.1) we also review the rich literature on the broader use of static optimization modeling for online advertising planning. Additionally in our expanded literature review (Appendix EC.1), we discuss connections to recent works that consider other types of budget and return-on-investment (ROI) constraints.
It is worth emphasizing that our problem setting and proposed algorithm are situated in between two extremes. One extreme is where campaigns manage their own bidding manually, and the other, which is typically referred to as an autobidding system, is where the ad exchange also manages the bidding for the advertisers based on inputted budget values, targeting criteria, key performance indicator (KPI)/ROI goals, and so on (see, e.g., Aggarwal et al., 2019; Balseiro et al., 2021a; Deng et al., 2023 and the references therein). A DSP, on the other hand, helps advertisers manage their budgets, targeting considerations, and KPI goals. The DSP may be part of the same firm as the ad exchange or may be a third-party. In some instances, as studied here, the DSP also automates and manages the bidding for advertisers and, therefore, might be considered as an “intermediate autobidding system.” When considering an autobidding system (without intermediaries like a DSP), other considerations may come into play to optimize and manage the auctioneer platform. In fact, there is a body of work that studies how to optimally employ mechanism design techniques in order to improve welfare, revenue, and fairness results from an auctioneer point of view (Balseiro et al., 2021a, 2014) and/or prove market equilibrium results (Balseiro and Gur, 2019; Conitzer et al., 2022). These works may be distinguished by the types of assumptions made about the behavior and rationality of the bidders who participate in the different types of auctions. We revisit the relation to autobidding systems again when reviewing the connections to ROI constraints and other extensions.
Single advertiser bidding under second-price auctions
In general, on the bidder side, several works study how a single advertiser should bid in real-time auctions to maximize an objective (e.g., profit or a KPI metric like conversions or clicks), subject to budget and/or ROI constraints (Aggarwal et al., 2019; Balseiro et al., 2021a, 2022; Deng et al., 2023; Yang et al., 2019). Focusing first specifically on second-price auctions, Zhang et al. (2014) and Ren et al. (2018) both aim to solve the problem of maximizing the profit of an advertiser bidding in real-time auctions under a fixed budget. Both works propose a dual approach to tackle the problem, and Ren et al. (2018) offer an end-to-end system for how to bid and learn to estimate functions for the probability of winning and click under parametric assumptions. Balseiro and Gur (2019) also study the same scenario but in the context of repeated second-price auctions. They also derive a bidding strategy using duality techniques for the single advertiser case and prove that, in a market with many participants using these types of strategies, an approximate Nash equilibrium is achieved. In an online optimization setting, Balseiro et al. (2022) develop a broad class of mirror descent type algorithms for allocation problems and obtain regret bounds that apply to second-price auctions. An extension of the single-bidder second-price problem setup is a “multichannel” setup in which one advertiser is able to spend on several ad-exchanges (or channels). Along this research line, Avadhanula et al. (2021) propose a stochastic bandit with knapsacks model and establishes regret bounds. Using the truthfulness property of second-price auctions, some works derive optimal bidding strategies for the allocation problem. For example, Aggarwal et al. (2019) study different binary linear allocation problems representing typical advertisers’ preferences in the online advertising industry and derive optimal bidding strategies for those using Lagrangian duality. A similar setup is studied by Yang et al. (2019) who also uses duality for linear optimization to derive optimal bidding schemes. They propose to use their solution in a production environment by incorporating a proportional–integral–derivative (PID) controller.
Single advertiser bidding beyond second-price auctions
Bidding truthfully is optimal for second-price auctions (Vickrey, 1961). This truthfulness property allows many works (e.g., Aggarwal et al., 2019; Balseiro et al., 2022; Chen et al., 2011; Liu et al., 2017; Ren et al., 2018; Yang et al., 2019) to derive optimal bidding strategies without needing to estimate bid landscape functions. Bid landscape functions estimate the probability of winning an auction and the expected payment to the ad exchange depending on the submitted bid. Therefore, works assuming arbitrary auctions need to assume that the bid landscape functions were estimated previously in some fashion for static planning (e.g., Cui et al., 2011; Wang et al., 2016), or they need to assume either an explicit or implicit procedure to estimate these functions over time for dynamic strategies. As mentioned above, Zhang et al. (2014) develop a static planning procedure, based on functional optimization, for real-time bidding; they present results primarily in the second price case, although their methodology extends to bidding in arbitrary auctions as long as the appropriate bid landscape functions are defined. For first-price auctions, Han et al. (2025) establish regret bounds for the profit an advertiser can get in the infinite-budget case for both stationary and adversarial settings. They assume a setup in which only the highest bid is revealed to the bidders at each auction. Wang et al. (2023) extend the setup to include a maximum budget for the advertiser to spend on a finite amount of auctions and show regret results for both the budget spending and the profit obtained. The authors use a dual strategy in which the bidder performs a (dual) profit maximization step and a dual subgradient step at each iteration. This type of strategy was also utilized by Balseiro et al. (2020a) and Susan et al. (2023). Susan et al. (2023) study a multichannel problem in which channels may use first, second, or other auction types. They obtain optimality results for an “offline scenario” and regret bounds for their dynamic planning formulation. Micchi et al. (2020) also study the dynamic multi-channel problem, including a pacing procedure whose goal is to help the advertiser follow a spending “trajectory” set in advance (they do not include optimality results).
Two-step bidding schemes
In a manner similar to Wang et al. (2023), Balseiro et al. (2023), and Susan et al. (2023), our paper uses duality to derive a two-step solution. The first step can be thought of as a “pacing” scheme in which values are modified by a quantity depending on dual multipliers, typically to encourage “smoother” spreading of ads over time. In the second step, a solution to a best-response problem is found to make bidding decisions. The pacing part of the scheme adjusts valuations such that budgets are spent smoothly during the time horizon of bidding; early references discussing “pacing” for online advertising include Bhalgat et al. (2012), Lee et al. (2013), Xu et al. (2015), and the terminology is also employed by Balseiro and Gur (2019). The best response solution is needed to account for auction characteristics (payment rules and competition). Our work differentiates from these other works in three major ways. First, we allow for a flexible class of preferences over how advertisers’ budgets should be spent that relies on Fenchel duality. If budget utilization is valued high enough, our model even allows for the pacing part to increase the advertiser values which may lead to monetary losses for the DSP (which is not allowed in any of the other works). Second, we study a different problem in comparison to the other works; here a DSP is a centralized entity that aims to jointly maximize its own profit and the budget utilization proxy functions over the advertisers’ budget utilization. This centralized DSP setup requires to have both bidding and allocation variables, and not just bidding ones (allocation variables are not needed in a decentralized auctioneer setup). Third, similarly to Susan et al. (2023), we allow for generic auction types and obtain strong duality results. To elaborate, Susan et al. (2023) study what they refer to as “offline” and “dynamic” cases of their problem setting, and in the offline case specifically they prove optimality of their value-pacing strategy, which is more aligned with the zero-duality gap results we obtain in our static planning setting. As another contribution of our work, we demonstrate how to use our static planning optimization problem in a real operation by adding a primal-recovery step. When a Slater-type condition holds, this allows us to obtain algorithm convergence guarantees, which strengthen our theoretical results. Finally, we mention that Balseiro et al. (2023) also leverage the structure of two-step bidding schemes in a very different problem setting than ours. Namely, they examine repeated generic auctions under on average budget constraints and correlated advertisers’ values, and they demonstrate Bayes-Nash equilibrium and revenue equivalence results. Table 1 offers a detailed comparison of the setups, formulations and assumptions of various works using this two-step methodology; note that we distinguish between static planning settings like ours and dynamic settings where decisions can be adjusted over time throughout the course of implementation of the algorithm (e.g., by updating dual variables).
Comparison of papers using a “two-step” pacing and best response solution scheme.
Paper
Number of advertisers Problem setting Auction types Static or dynamic planning Theoretical results
Multiple advertisers Market equilibrium analysis of market with bidders using a pacing—best response scheme General auctions Dynamic Bayes-Nash equilibrium existence, price of anarchy, revenue equivalence.
This paper
Multiple advertisers Jointly maximize (DSP) profit and (advertiser) budget utilization proxy terms General auctions Static Strong duality results and algorithmic convergence guarantees.
Abbreviation: DSP = demand-side platform.
We use “general” when the paper includes at least first and second-price auctions.
Paper studies a general online resource allocation problem of which the problem mentioned above is one application.
Finally, there are a few related works studying a “centralized” DSP profit or value maximization subject to budget and/or ROI constraints for several advertisers. Chen et al. (2011) models the problem as a linear program in which they study the problem of allocating multiple impressions to multiple advertisers, each of which has constraints on their total budgets and valuations. They prove that the optimal bid for each advertiser corresponds to the dual of its budget constraint. Also, they show that their allocation procedure matches a first-price auction in which an impression is allocated only if it obtains a positive profit (the allocation scheme in our work follows this structure as well, but our optimal bids are the maximizers of more general dual profit expressions that depend on the auction mechanism used). They propose to use their strategy in a real operation by updating the advertisers’ bids using the ‘‘PI’’ component of a PID controller. The works of Grigas et al. (2017) and Liu et al. (2017) propose a more general setup with possibly arbitrary bid landscape functions that can represent non second-price auctions, although Grigas et al. (2017) do not discuss non-second-price auctions in their work. Liu et al. (2017) models the problem as a multidimensional knapsack problem allowing the advertiser to set budget and ROI constraints allowing arbitrary bid landscape functions as well. They prove strong duality results when restricted to second-price auctions.
Model Preliminaries
In this section, we describe the basic assumptions and notation of our model, as well as the definitions of decision variables as they relate to the operational decisions faced by the DSP.
Model Notation and Parameters
The set of campaigns managed by the DSP is denoted by , and the set of possible impression types is denoted by . For a given campaign , the set of impression types that campaign has elected to target is denoted by . Let denote the edges of an undirected bipartite graph between and , whereby there is an edge whenever campaign targets impression type , that is, . Let denote the set of campaigns that are interested in impressions of type . We sometimes slightly abuse the notation by referring to the campaign (resp. the impression type), whereby we implicitly assume a one-to-one correspondence between and . Our model depends on several different sets of parameters and functions, which are listed below.
Parameters and Functions:
= the maximum allowed bid for an impression of type (set either by the ad-exchange or the DSP).
= the target spending level (or budget) of campaign (set by the advertiser).
= the expected number of arrivals of impressions of type during the planning horizon.
= the expected revenue the DSP earns each time an ad of campaign is shown to an impression of type .
= the probability of winning an impression of type with a bid amount of .
= the expected amount that the DSP pays the ad exchange whenever the DSP wins an auction for an impression of type with a submitted bid of .
Some of these parameters are completely known to the DSP (such as the budget of each campaign), whereas others (such as the probability of winning an auction) may need to be estimated based on historical data or by other means. In Appendix EC.3, we provide more details on possible parameter estimation methods. Our model encompasses both CPM and CPA pricing models, and the interpretation of the quantity changes accordingly. In the CPM model, is a deterministic amount that campaign pays the DSP each time their ad is shown. In a CPA model, represents an expected payment value, depending proportionally on the probability that an action of interest (e.g., a click) occurs. For each , the functions and encompass all of the information needed for our model regarding the auctions for impressions of type . Note that takes as input a possible bid amount and returns a probability value, and takes as input a possible bid amount and returns a nonnegative expected payment amount. The functions and represent the DSP forecast regarding the bid landscape for impression type (see, e.g., Cui et al. 2011) and we refer to these functions as the bid landscape functions. Note that it is possible to have and/or . Although does not make much sense in practice, allows for the possibility that the DSP may lose the auction even if they bid the maximum amount . Examples 1 and 2 below discuss how two widely popular auction types – first and second-price auctions – affect the structure of the bid landscape functions.
(First-Price Auction)
In a first-price auction, the winner is the bidder who placed the highest bid and the winner is required to pay the bid that they submitted. This implies that the amount that the DSP is required to pay the ad exchange whenever the DSP wins an auction for an impression of type is deterministically equal to the amount that they bid, that is, . Moreover, whether the DSP wins the auction is completely determined by the highest competing bid, which we can model as a non-negative random variable . Using this notation, we have that (in the case of a tie, we assume that no one wins the auction). Note that, in the case of a first-price auction, there is no need to estimate as this function is completely determined and estimation of reduces to the problem of estimating the CDF of . First-price auctions have gained importance in practice as several prominent ad-exchanges including Google’s DoubleClick, OpenX, AppNexus, and Rubicon have adopted them.
(Second-Price Auction)
In a second-price auction, the winner is again the bidder who placed the highest bid, however now the winner is required to pay an amount equal to the bid of the second highest bidder. Again if is a random variable representing the highest competing bid, then we have that . Now the amount that the DSP is required to pay the ad exchange whenever the DSP wins an auction for an impression of type is a random variable with mean . Note that if is a continuous random variable with density , then it holds that . In the case of a second-price auction, the DSP needs to estimate both and , which both reduce to estimating properties of the distribution of . Second-price auctions are a common assumption in the literature on optimization related to DSPs and ad-exchanges, such as by Zhang et al. (2014) and Richardson et al. (2007), as they were the industry standard until 2017 when several ad-exchanges started to transition from second to first-price auctions. Second-price auctions have a desirable truth-revealing property that makes it optimal for advertisers to bid their real valuations. However, it is important to emphasize that this truth revealing property is no longer valid when bidders have budget constraints and/or in the case of repeated auctions.
Decision Variables and Additional Notation
In this section, we describe the decision variables of our model and describe some additional notation and quantities of interest. Let us first review the basic flow of events in the model. When an impression of type is submitted to the ad exchange, a real-time auction is held for which the DSP has an opportunity to bid. Thus the DSP has an opportunity to make two operational decisions related to each real-time auction: (i) how to select a campaign to bid on behalf of in the auction, and (ii) how to set the corresponding bid amount . If the DSP wins the auction on behalf of campaign , then the DSP must pay a certain amount to the ad-exchange, an ad from campaign is displayed, and in expectation the DSP earns revenue equal to . We define decision variable to model these two operational decisions as follows.
Decision Variables:
= the probability of bidding on behalf of campaign when an impression of type arrives.
= the corresponding bid amount submitted for an impression of type on behalf of campaign .
Interpreted differently, represents a proportional allocation, that is, the long run average fraction of impressions of type that are allocated to campaign . In this application domain, proportional allocation variables are a reasonable modeling decision, as opposed to integer variables, due to the vast number of impressions that arrive during the planning horizon. Note that represents the bid that the DSP submits to an auction for impression type conditional on the fact that the DSP has selected campaign for the auction. Related approaches (e.g., as by Chen et al., 2011) also use bids to rank advertisers—in our approach, the selection of which campaign to bid on behalf of is completely captured by the decision variables and thus the decision variables only determine the actual bid price decisions. Let denote vectors of these quantities, which will represent decision variables in our model. We defer the discussion of the static optimization problem (1) used to solve for and to Section 3, as developing and studying the properties of problem (1) constitutes a major part of our contribution. For convenience, we list all of the previously defined sets, parameters, functions, and decision variables in Table EC.1 of the E-Companion Appendix.
Given values of and , recall that is the probability that a new impression of type is allocated to campaign , is the probability that the auction is won, and is expected amount that campaign spends in the event that the auction is won. Therefore, for all , we define , which may be interpreted as the expected amount that campaign spends on impressions of type during the time horizon as a function of and . Furthermore, in the event that the auction is won, is the expected amount that the DSP pays to the ad exchange. Therefore, for all , we similarly define , which may be interpreted as the expected profit that the DSP earns from campaign for impressions of type . Indeed, note that is the expected profit the DSP earns when the DSP bids an amount for an impression of type on behalf of campaign . Presuming that the amount that the DSP earns per bid is independent of the number of bids, Wald’s equation implies that the total expected profit that the DSP earns from campaign for impressions of type is . A similar argument verifies the validity of the definition of .
To further simplify notation, define and as subvectors of the allocation vector and the bidding vector , respectively, consisting of the variables associated with impression type . Let denote a vector of the expected spending values associated with impression type , whereby the component of is equal to if and 0 otherwise. Also, let denote a vector of expected total spending values, whereby . Likewise, let denote the total expected profit earned from impressions of type and let denote the overall total expected profit.
We use boldfaced font, for example, , , and (which denotes the vector of all 0s) to denote vectors, regular font, for example, and to denote components of vectors, and calligraphic font, for example, and , to denote sets. For a given set and function , let denote the (possibly empty) set of maximizers of the function over the set ; likewise is defined similarly. If is an extended real valued function, then we let denote its domain. Note that when is convex then , and when is concave then . Furthermore, when is convex, for a given , denotes the set of subgradients of at , that is, the set of vectors such that for all . For any function , the convex conjugate of , denoted by , is defined by . For a given finite set , let denote its cardinality. For a given set , let denote its interior, let denote the closure of the convex hull of . For define , and for with define . Finally, we use to denote a scalar derivative in the right context.
Optimization Formulation: Primal and Dual Problems
In this section, we introduce our main optimization problem of interest, which models the DSP’s goal of maximizing profit while maintaining a desired spending level of its advertisers’ budgets, as well as our corresponding two phase primal-dual procedure. A central ingredient of our proposed formulation is the use of a budget utilization proxy function to model the budget spending preferences of the advertisers. Formally, the budget utilization proxy function is an extended real valued concave function that receives the vector of the expected spending of the campaigns as its input and outputs a value quantifying how satisfied the advertisers are with the spending levels .
Intuitively, the budget utilization proxy function can be thought of as a negative penalty function associated with deviations of away from the target spending levels across all campaigns. Indeed, each campaign would be “maximally satisfied” with spending an amount exactly equal to their target value, that is, . In this case, their expected budget utilization is exactly 100%. However, achieving exact target levels of spending, that is, 100% budget utilization across all campaigns, may not be practical or even feasible for the DSP. Instead, each campaign usually allows a degree of flexibility around their target value . Furthermore, each campaign might have their own predisposition regarding how to penalize deviations away from their spending level . For instance, some campaigns may view the target level as an upper bound and be equally satisfied with any performance level as long as their expected spending is below (Example 3). More realistically, a campaign with a hard upper bound may still want to achieve as large of a budget utilization as possible while not exceeding their upper bound . In this case, encouraging larger values of budget utilization can be achieved by penalizing deviations of expected spending levels away from using, for example, a quadratic penalty function (Example 5). Or, in an even simpler scenario, a campaign may simply view as a more ambiguous target value and may allow deviations in expected spending both above and below , as long as these deviations are encouraged to be “small enough” (Example 4). Finally, a campaign may be “equally maximally satisifed” with all expected spending levels on a bounded interval (Example 6). It is important to note that these are only a handful of examples, and our modeling framework is quite flexible, allowing the advertisers to specify their proxy function in a “hard” and/or “soft” constrained way, with generic shapes and rates of penalization away from the target spending levels . We also allow, potentially, for interactions across the contribution of different campaigns to the proxy function. In the most general case, we allow for an arbitrary budget utilization proxy function satisfying some mild technical assumptions, which are collected below in Assumption 1.
In the remainder of this section, we first present our optimization model that incorporates the budget utilization proxy function and balances it with expected profit of the DSP. After defining the model and introducing its dual, which utilizes the convex conjugate of the negative of the proxy function , we further developed the examples discussed above and give precise mathematical formulas of the proxy functions and their conjugates in Section 3.1. We then derive further properties of the dual function in Section 3.2 and present our two-phase primal-dual procedure in Section 3.3.
The advertisers’ budget utilization proxy function is a concave function satisfying:
is a non-empty closed set, and
is continuous on .
Notice that part (i) of Assumption 1 implies that is proper (meaning that for at least one ) and closed (meaning that the hypograph of is a closed set). In addition, throughout the rest of the paper we assume that the bid landscape functions are also continuous (Assumption 2).
For each , the bid landscape functions and are continuous on .
Notice that Assumption 2 implies that and are continuous in on the feasible domain of . Strictly speaking, Assumption 2 is not needed to develop the primal-dual procedure in this section. However, Assumption 2 simplifies the discussion and is needed for the theoretical results of Section 4.
Problem (1) presented below is our main optimization problem of interest, which maximizes the sum of the total expected profit of the DSP and the total budget utilization proxy values of the advertisers, subject to feasibility constraints on the allocation variables and the bidding variables . Thus problem (1) directly trades-off between profitability for the DSP and the budget spending preferences of the advertisers, and in Section 3.1, we present several examples of budget utilization proxy functions that highlight the modeling flexibility of our framework.
Our high level approach for solving problem (1) is based on a two-phase procedure. In the first phase, we construct a suitable dual of (1), which turns out to be a convex optimization problem that can be efficiently solved with subgradient based algorithms. A solution of the dual problem naturally suggests a way to set the bid prices . In the second phase, we set using the previously computed dual solution and then solve a convex optimization problem that results when is fixed in order to recover allocation probabilities .
Towards developing our two-phase procedure, let us start by deriving the dual problem of (1). For each , define the feasible set of by , the feasible set of by , and the feasible set of by . Then the feasible region of (1), denoted by , decomposes as , where is the feasible set of and is the feasible set of . In a slight abuse of notation, we may also remark that decomposes across the impression types as . Using the notation defined in Section 2.2, in particular the definition of the total profit and the vector of expected spending values , we can more compactly write problem (1) as follows:
Let denote the set of feasible expected spending values, and note that is a compact set since is compact and is a continuous function. We say that problem (2) is feasible whenever is non-empty. Proposition 1 justifies the use of the max (instead of sup) in (2) above by demonstrating that is finite and attained whenever problem (2) is feasible.
Suppose that problem (2) is feasible, that is, it holds that . Then, the optimal value of (2) is finite and attained.
The proof of Proposition 1 as well as all omitted proofs for this section are contained in Appendix EC.5.1. Notice that problem (1) is quite general and possibly non-convex due to the generic form of and as well as the multiplications of these functions with the variables . On the other hand, the concavity of the budget utilization proxy function enables us to construct a dual problem that is always convex and amenable to efficient solution methods. Since is a convex function, it is useful to consider its convex conjugate . Recall that is defined by for any . Since is a proper and closed convex function, the Fenchel-Moreau Theorem (see, e.g., Boyd and Vandenberghe, 2004: 91) ensures that and hence
Now we can simply substitute the above equality into (2) to obtain the following equation:
Let us define the function by the following equation:
The Weierstrass theorem ensures that and that the maximum is attained above. Note also that is a convex function since it is a maximum of linear functions of . Furthermore, is non-negative everywhere since , and and . Recalling the notation defined in Section 2.2, for each , let us also define the function by , and likewise note that , is convex, and . Then it holds that the following equation:
hence is decomposable across the impression types.
Furthermore, let denote the dual function, which is also convex. Then the dual problem of (1), which is obtained by exchanging the supremum and infimum in (3), is
which is a convex optimization problem. Notice that the dual objective function decomposes as the sum of two terms: (i) which depends on the particular form of the functions and other problem parameters but does not depend on the budget utilization proxy function , and (ii) which directly depends on the budget utilization proxy function as it is the convex conjugate of . Recall the definition of the primal function . Basic weak duality immediately yields as follows:
Also, since is proper by Assumption 1, we have that is proper (i.e., is non-empty) and, therefore, is finite (although possibly not attained) whenever (1) is feasible. Later, in Section 4, we study conditions under which there is zero duality gap above, that is, it holds that .
Examples of Budget Utilization Proxy Functions
Let us now give several examples of budget utilization proxy functions to demonstrate the modeling flexibility of our approach. For each example, we define the budget utilization proxy function and discuss the conjugate function that appears in the dual problem (6). All of our examples are separable across the different campaigns, that is, it holds that where, for each campaign , is a scalar concave function (that also satisfies Assumption 1) representing the budget utilization proxy value of campaign as a function of its expected spending level . This separable structure is quite natural as we do not expect the spending of other advertisers to affect a given advertiser’s budget utilization and it allows each campaign to use their own preferred type of budget utilization proxy function. However, this is not a requirement as it is also possible for our framework to accommodate more general concave budget utilization proxy functions that do incorporate such interactions. Note that the separable structure of also implies that is similarly separable, that is, it holds that where is the convex conjugate of . In the remainder of this section, we present examples of scalar budget utilization proxy functions , as well as each corresponding conjugate function . The derivations of the corresponding conjugate functions may be found in Appendix EC.6.1.
(Budget Constraint)
Consider the case where campaign is only concerned with ensuring that its expected spending level does not exceed the budget value . This is the well-studied case of a budget constraint, which has been examined in several different contexts in the literature including, for example, by Balseiro et al. (2014), Chen et al. (2011), Lee et al. (2013), Zhang et al. (2014), as well as Grigas et al. (2017). In this case, the budget utilization proxy function and conjugate function pair is:
(Target Spending)
Notice that the previous example does not directly encourage the DSP to spend the budget of campaign and may in fact lead to budget spending values that are considerably smaller than the target budget value . As mentioned previously, poor or insufficient budget spending may motivate advertising companies to stop working with the DSP and thus hurt the DSP’s business in the long run. We can address this issue in our model by considering a budget utilization proxy function that directly encourages the expected spending level of campaign to be close to the target budget value . One such budget utilization proxy function and conjugate function pair that uses a squared quadratic penalty, for a given parameter , is:
(Budget Constraint and Target Spending)
We can combine Examples 3 and 4 in order to encourage spending at, but not more than, the target value :
(Budget Constraint and Hard Minimum Spending)
A stricter way to guarantee a minimum budget spending is to require that campaign spends at least a fixed percentage of, and no more than, its target value . In this case, for a given parameter , the budget utilization proxy function and conjugate function pair is:
Note that can be equivalently expressed as .
If the budget utilization proxy functions for each campaign are constructed from Examples 3, 4, or 5, then problem (1) is guaranteed to be feasible since . If Example 6 is used for one of the campaigns, then problem (1) may or may not be feasible depending on the choices of and .
Further Properties of the Dual Function
Let us now describe some additional properties of the dual function that provide further intuition and will also be useful in our two phase primal-dual procedure described in Section 3.3. Herein we demonstrate how to efficiently compute the function value as well as a subgradient of at any given . It turns out that these computations rely on the ability to solve a simple one-dimensional optimization problem to obtain the bidding variables, which we formalize below as an assumption.
For each , define , which represents the expected profit that the DSP earns each time it bids an amount for an impression of type , given that the expected revenue earned whenever the corresponding ad is displayed is equal to . Then, for all values of , we can efficiently compute an optimal bid price .
It is important to note that Assumption 3 may not be applicable in all scenarios, for example, in a complex auction type where there is no closed form solution for the optimal bid price and where it may be computationally expensive to approximate it. On the other hand, if the structure of the auction is well understood, then can be simply determined in closed form, as demonstrated by the following two examples. In these cases, Assumption 3 is readily satisfied. Formal derivations of for these as well as additional examples are included in Appendix EC.6.2.
(Second-Price Auction cont.)
Consider the case of a second-price auction for impressions of type , as in Example 2. It is well-known that bidding truthfully is optimal for second-price auctions (Vickrey, 1961). In our context, truthful bidding implies that, for any , , where recall that . Importantly, this simple expression for does not depend on the specific forms of or due to the special properties of second-price auctions.
(First-Price Auction cont.)
Consider the case of a first-price auction for impressions of type , as in Example 1. In this case, we have and will depend on the functional form of . As a simple example, consider the case where the highest competing bid is equal to the maximum of i.i.d. random variables uniformly distributed on . Then, it holds that .
Given the separability property (5) of , whereby , in order to compute and a subgradient of at any given , by additivity of subgradients it suffices to demonstrate that we can efficiently compute as well as a subgradient of each of the individual functions at (which may be done in parallel). Recall that is defined by a certain maximization problem with respect to the variables , whereby . Hence, in order to compute a subgradient of at , it suffices to solve this maximization problem, which we now equivalently rewrite using the collection of functions parameterized by defined in Assumption 3. Indeed, observe that:
The above demonstrates that, for any given dual vector , computing is equivalent to finding the allocation and bidding vectors for impression type that maximize the profit of the DSP, but where the expected revenue quantities are each scaled by . Hence, the dual variables provide a natural mechanism to account for the budget spending preferences of the campaigns within the dual functions , whereby there is “extra spending” on campaign if , “normal spending” if , “reduced spending” if , and “no spending” if . The derivation in (7) also suggests a natural algorithm to compute and a corresponding subgradient: first maximize the scalar function with respect to for each , then chose the campaign with the largest corresponding optimal value. Algorithm 1 below formalizes this algorithm for computing solving the maximization problem in (7) as well as the corresponding subgradient . Proposition 2 formally demonstrates that the output of Algorithm 1 is valid. Note that the computational complexity of Algorithm 1 is if one presumes that the bid prices in Step (1.) can be computed in time.
For each and for all , the output and of Algorithm 1 satisfies and .
Two-Phase Primal-Dual Procedure
We now describe in detail the two-phase primal-dual procedure, in which we first solve the dual problem (6) to near optimality, and then use the dual variables to recover the allocation and bidding variables by solving an auxiliary convex optimization problem. Specifically, we first solve for (near) optimal dual variables , use these dual variables to construct biding variables via Assumption 3, and then solve for the allocation variables by solving a restricted version of problem (1) that results after fixing the bidding variables at . For a given fixed vector , this restricted problem that maximizes the objective function of (1) with respect to the allocation variables is presented by the following equation:
Note that (8) is a concave maximization problem (hence a convex optimization problem) since, for fixed , and are both linear functions in (which implies that is a concave function). Let denote the set of feasible allocation vectors in (8) and let denote the set of possible expected spending values given . The following proposition demonstrates that if problem (8) is feasible, then it is also finite and attained.
Suppose that problem (8) is feasible, i.e., it holds that . Then, the optimal value of (8) is finite and attained at some .
The proof of Proposition 3 follows that of Proposition 1 and is omitted for brevity. For many choices of budget utilization proxy functions, such as Examples 3–5 from Section 3.1, the feasibility condition is guaranteed to hold for all since, in these examples, we have that for all .
Algorithm 2 presents our two-phase primal-dual solution procedure. Note that both phases involve solving a convex optimization problem. Phase 1 involves solving the dual problem (6). Due to the generic form of the bid landscape functions and other problem parameters, we generally need to treat the dual problem in a black-box format that can be tackled efficiently with subgradient based algorithms, the simplest of which is the projected subgradient descent method (see Section 5.1 for details), using Algorithm 1 as a subroutine to compute subgradients of . On the other hand, problem (8), which is solved in Phase 2, is often highly structured depending on the choice of budget utilization proxy function. For example, any combination of the four examples given in Section 3.1 results in a convex problem with a quadratic objective and linear constraints (and a possibly a linear objective if only Examples 3 and 6 are used), which is a highly structured problem that can be tackled with commercial solvers such as Gurobi (Gurobi Optimization, 2019).
Zero Duality Gap Results
In Section 3, we introduced our primal optimization problem of interest (1) that balances profitability for the DSP with satisfying the advertisers’ budget spending goals, its dual problem (6), as well as a two-phase primal-dual approach for solving (1), Algorithm 2. Despite the non-convexity of problem (1), we demonstrate that certain conditions on the problem parameters (all defined in Section 2) imply that: (i) there is zero duality gap between the primal problem (1) and its dual (6), and (ii) Algorithm 2 recovers an optimal solution of the primal problem (1). In particular, we define a type of “increasing marginal cost” condition on the bid landscape functions and that, when satisfied by all impression types , provides a sufficient condition for the zero duality gap result. This condition is formally defined below in Definition 1.
(Increasing Marginal Cost (IMC) Condition)
Impression type satisfies the IMC condition if the bid landscape functions and are differentiable on and satisfy:
for all .
is strictly increasing on the interval , where is the expected cost associated with impression type .
The function value , defined as part of Definition 1, may be interpreted as the “expected marginal cost” of winning another auction for impressions of type when bidding an amount . Indeed, represents the rate of change of the expected cost function at , whereas , which is assumed positive, represents the rate of change of the probability of winning the auction at . Hence, the ratio may be interpreted as the “change in expected cost per bid” divided by the “change in quantity of auctions won per bid,” that is, the marginal cost of winning another auction. The IMC condition then simply states that this “expected marginal cost” function is increasing. In Section 4.1, we discuss several examples of popular and important cases, including first and second-price auctions, that satisfy the IMC condition. It also turns out that the IMC condition is a special case of a more general condition which states that, for all values of the expected revenue term , there is a unique optimal bid price that maximizes the expected profit function . This condition also implies the zero duality gap result, and is stated formally in Definition 2 below (Lemma EC.2 in the E-Companion Appendix proves that condition IMC is a special case of condition UBP defined below).
(Unique Bid Price (UBP) Condition)
Impression type satisfies the UBP condition if, for all , the expected profit function has a unique maximizer on , that is, it holds that .
Theorem 1 below presents our main zero duality gap result under either the IMC or UBC conditions. Recall that we defined feasibility of problem (1) by the condition , where is the set of feasible expected spending values. The statement of Theorem 1 strengthens this feasibility condition to a version of Slater’s condition for our problem, which states that and ensures the existence of a dual optimal solution . Indeed, this condition is correctly interpreted as a version of Slater’s condition since is equivalent to the existence of a feasible point such that . Note that Slater’s condition always holds for the budget utilization proxy functions in Examples 3, 4, and 5 from Section 3.1 since in all three cases we have . For Example 6, Slater’s condition may not always hold but such cases are inherently degenerate.
Suppose that problem (1) satisfies Slater’s condition and that, for all impression types , either the IMC condition (Definition 1) or the more general UBP condition (Definition 2) holds. Then, the following statements hold:
There exists an optimal solution of the dual problem (6).
There is zero duality gap between the primal problem (1) and its dual (6), that is, it holds that .
When both the Phase 1 and Phase 2 subproblems of Algorithm 2 are solved to exact optimality, then Algorithm 2 returns an optimal solution of the primal problem (1).
The proof of Theorem 1 is contained in Section EC.5.2 of the E-Companion Appendix. Appendix EC.6.3 also contains examples showing that the conclusions of Theorem 1 may not hold if one or more of the assumptions are violated. Notice that the IMC and UBP conditions depend on each impression type independently; in particular they only depend on the properties of the bid landscape functions , on the interval . Thus, Theorem 1 may hold even in cases when different auction types and/or structurally different bid landscape functions are used for different impression types. Under the stated conditions, in addition to demonstrating the existence of a dual optimal solution as well as zero duality gap, Theorem 1 demonstrates that an optimal solution of the original problem of interest (1) may be efficiently computed using Algorithm 2. In particular, as long as the stated conditions of Theorem 1 hold and as long as the optimization problems in the two phases of the algorithm are solved to exact optimality, then Algorithm 2 returns an optimal solution of the primal problem (1). It is worth pointing out that the proof of Theorem 1 reveals that, under the stated conditions, there also exists a particular optimal solution with the property that for all . Although this is the same computation that is performed by Algorithm 1 when given input , note that the output of Algorithm 1 is not in general unique. Therefore, Algorithm 1 with input is not guaranteed to return a primal optimal solution and the primal recovery phase of Algorithm 2 is necessary. In Section EC.4, we extend item (3.) of Theorem 1 to demonstrate that Algorithm 2 returns an approximately optimal primal solution when the optimization problems in the two phases of the algorithm are solved only to approximate optimality. An extension of the analysis in this subsection to demonstrate algorithmic convergence of our primal-dual algorithm to an optimal solution of the original primal problem is presented in Appendix EC.4.
Examples Satisfying the IMC Condition
In this section, we give several examples of bid landscape functions, arising from auction types that are widely used in practice, that satisfy the IMC condition. Hence, if any combination of these bid landscape functions are used, and if the mild Slater’s condition holds, then Theorem 1 may be applied. The proofs of Propositions 4 and 5 are contained in Appendix EC.5.3.
(Second-Price Auction cont.)
Suppose that impression type corresponds to a second-price auction, as in Examples 2 and 7, and let be a non-negative random variable representing the highest competing bid. If is a continuous random variable with probability density function satisfying for all , then and satisfy the IMC condition.
(First-Price Auction cont.)
Suppose that impression type corresponds to a first-price auction, as in Examples 1 and 8, with for all . If is differentiable, strictly increasing, and concave on , then and satisfy the IMC condition (several examples of functional forms of satisfying the previous conditions are given in Appendix EC.6.2). Alternatively, if is the CDF of the maximum of i.i.d. uniform random variables on the interval with , then and satisfy the IMC condition.
It is worth mentioning that the IMC condition, as well as results relying on the IMC condition such as Propositions 4 and 5, may be extended to allow for a fixed known reserve price . In such a case, a bid in an auction for impression type is only considered valid if it is above and the DSP has complete knowledge of the value . With a reserve price, the rules corresponding to first and second-price auctions remain the same but only valid bids above are considered. For simplicity, we assume that in Definition 1 and Propositions 4 and 5, but an easy modification allows for the case of .
Numerical Experiments
In this section, we present results wherein we applied Algorithm 2 (our two-phase solution procedure) on both synthetic and real data examples. For the real data case, we used bidding logs from Criteo (Diemert et al., 2017). Our synthetic experiments simulate scenarios using first and second-price auctions, while the Criteo experiment assumes that impressions are sold using second-price auctions. Our experiments demonstrate the following results. First, we can obtain a family of solutions that trade off DSP profitability for higher budget spending rates for the DSP campaigns. This trade-off is obtained by using different budget utilization proxy functions with different parameters. Second, our methodology outperforms a natural baseline that is optimal when budgets go to infinity as well as a natural modification of this baseline that also trades off between profitability and budget spending rates in a more heuristic manner.
The remainder of this section is organized as follows. In Section 5.1, we describe the implementation details and the evaluation procedure for our experiments. In Sections 5.2 and 5.3, we describe the experimental settings and results for the synthetic and real data experiments, respectively. In our experiments, we use the budget utilization proxy functions described in Examples 3–5. For ease of exposition, we will refer to them as UPF 3, UPF 4, and UPF 5, respectively, in this section. Finally, please refer to Appendix EC.7.4 for additional results for both the synthetic and real data based experiments.
Experimental Setup
At a high level, we first run Algorithm 2 during a training phase and then implement a testing phase to evaluate the performance of the real-time policy implied by our optimization method. In the training phase, to find an approximately optimal dual solution for Phase 1 of Algorithm 2 we use the basic projected subgradient descent method, which has the form and is the Euclidean projection operator to the domain of . This projection operator can be removed for UPF 4 and UPF 5, while for UPF 3 the operator function is simply , the element-wise maximum between the vector and zero. We use step sizes of the form , where the constant was selected by running Algorithm 2 for 5,000 iterations using and selecting the one with highest primal value . The candidate dual solution in Algorithm 2 is taken as .
The testing phase consists of simulating the operational environment of the DSP and evaluating the performance of one or more bidding and allocation policies. In particular, we are primarily interested in evaluating the performance of the real-time policy implied by the optimization solution generated by Algorithm 2 during the training phase. In these experiments, we consider the CPA model where the revenue amount earned from campaign is a random variable, which is equal to if the action of interest (such as a click) occurs and which happens with probability . In this case, campaign is charged an amount by the DSP, which results in the budget of campaign being depleted by while the DSP earns a revenue of . The full description of the CPA model and the real-time policy implied by our optimization solution is given in Appendix EC.3.2. Note that, in general, there may be a discrepancy between the parameters used in the training and testing phases. Additionally, the order in which different impression types arrive in the testing phase is not known during the training phase. In our synthetic experiments, we assume that there are no errors in parameter estimation, that is, all of the parameters are the same in both the training and testing phases. In the real data experiments, the conversion probabilities may be different in the training and testing phases, but all other parameters are the same. Full details of our simulation and testing procedures are given in Appendix EC.7.1.
A major characteristic of our proposed methodology is that it efficiently and effectively trades off between profitability for the DSP and the level of budget spending for its campaigns. Thus, we present empirical results that explore this trade off. In particular, we consider two performance metrics herein: (i) total profit, and (ii) budget utilization. The total profit is simply the sum of all profits earned by the DSP over the entire course of the simulation. Budget utilization is defined as the ratio of the total amount spent by all campaigns (empirically observed at the end of the simulation) divided by , the sum of all target budget values. For both synthetic and real data-based experiments, we present “Pareto frontier-like” graphs that show achievable ranges for these two performance metrics. The Pareto frontier-like analysis for the synthetic experiments is included in Appendix EC.7.4.1. In particular, for UPF 4 and 5, we obtain non-trivial graphs by varying the penalization parameter parameter used in the definition of these budget utilization proxy functions. We compare different configurations of our algorithm, that is, different budget utilization proxy function choices and varying values of , against a family of baseline policies referred to as the generalized greedy policy shown in Algorithm 3. For all policies considered, we run 100 simulations to achieve better statistical significance. To make the results more comparable, on a particular experiment, each of the 100 runs use the same data and random number seeds for all policies (i.e., all configurations of our algorithm and of Algorithm 3) considered.
Algorithm 3 is optimal in the case of with infinite budgets (in the model with budget constraints, that is, UPF 3) since it chooses the campaign and bid that maximizes the expected profit for each impression. Put another way, Algorithm 3 with is optimal when . In general, the “pure” greedy policy, that is, Algorithm 3 with , may not lead to adequate spending levels for campaigns. Thus, provides a simple mechanism to promote alternative rates of spending. In particular, choices of represent a simple and naive way of promoting under-spending and choices of represent a simple and naive way to spend aggressively. As we vary the parameter we also obtain non-trivial graphs in the space of observed total profit and budget utilization values, just as we do when we vary the penalization parameter for UPF 4 and UPF 5.
Before presenting our results, let us make a few more remarks with respect to our general experimental scheme. First, in all of our experiments, we assume that each campaign cannot spend more than its target budget . A campaign will stop spending once its remaining budget is less than (the price it would pay for the next click or action of interest). Finally, Algorithm 3 adjusts its strategy as campaigns completely deplete their budgets, which suggests it may have a possible advantage over our methodology which does not. The policy implied by our solution can be easily adapted to adjust its strategy in the same way. In order to be more faithful to the original interpretation of the variables , we do not perform this adjustment, however, we expect it to improve the performance of our methodology, and therefore the results presented herein represent a conservative lower bound on the performance of a more practical version.
Synthetic Experiments
In this section, we discuss the results of two synthetic experiments. In the first experiment, we test how our proposed methodology performs when we fix all parameters, except for the target budget values . In the second experiment, we consider the trade off between profitability and budget utilization by fixing all parameters except for the penalization parameter in UPF 4 and UPF 5 as well as the parameter in Algorithm 3. Both experiments use with for all and for all . Each of the of the 100 runs per configuration is composed of impression arrivals, and the type of each impression arrival is obtained by sampling uniformly from the set . The highest competing bid for each impression arrival of type is obtained by sampling from a maximum of i.i.d. random variables, where is a random integer value defined in Appendix EC.7.2. Note that, given the value of , in the cases of both first and second-price auctions, we obtain closed formulas for the bid landscape functions and the optimal bidding function as described in Appendix EC.6.2. For brevity, we defer the remainder of the discussion of how the impression-campaign graph, the bid landscapes, the conversion probabilities, and the revenue terms are generated to Appendix EC.7.2.
Sensitivity With Respect to Budgets
In this experiment, we assume that every campaign has an identical target budget value and we vary these values in the range . We compare our methodology using UPF 3, UPF 4, and UPF 5 against the “standard” greedy heuristic, that is, Algorithm 3 with . We used as the penalization parameters in UPF 4, and 5 for all . Any choice of could have been used, but the choice of places the budget utilization proxy function term in the objective on roughly the same scale as the DSP’s profit term. To verify that the range of budget values explored is adequate, we plot the win-rate of the DSP (percent of auctions won) versus the budget and observe diminishing returns, that is, that the market share of the DSP stabilizes and hence the budget values become non-binding as we approach a budget value of 300 per campaign. This plot is included in Appendix EC.7.4.
Figure 1 shows the observed budget utilization, profit, and relative profit values of our methodology as compared to the standard greedy heuristic (Algorithm 3 with ) for both first and second-price auctions. Note that the relative profit is calculated with respect to the standard greedy heuristic. The plots in Figure 1 demonstrate that, for small budget values, our methodology more than doubles the profit of the standard greedy heuristic without losing much in budget utilization. In fact, for second-price auctions UPF 4 strictly dominates the greedy heuristic (UPF 4 has both larger budget utilization and larger profit than the greedy heuristic), and UPF 5 does as well for budget values above 100 or so. In the case of first-price auctions, UPF 4 and 5 dominate the greedy heuristic for small to medium size budget values, while for high budget values they trade profitability for higher budget utilization. UPF 3 consistently obtains more profit than the greedy heuristic, but sacrifices budget utilization. Generally speaking, the budget utilization of the greedy heuristic and UPF 4 are the same for small budget values as they spend the advertisers’ budgets entirely. UPF 5 in general is not able to spend the advertisers budgets fully since using as hard a constraint does not give enough slack to account for the randomness of the actual impression arrivals.
Plots showing the observed profit and budget utilization values of our proposed formulations and the standard greedy heuristic (Algorithm 3 with ), versus the target budget values. The ‘‘Relative Profit’’ plots are with respect to the standard greedy heuristic. The first row shows the results assuming first-price auctions, and the second row shows the results assuming second-price auctions.
The plots in Figure 1 demonstrate that profit is typically a concave function with respect to the target budget values . They also show that the relative profit decreases faster for first-price auctions in comparison to second-price auctions, as is increased. This makes sense as the cost that the DSP pays for each impression is higher for first-price auctions. Also, notice that—again in the first-price case—for high budget values UPF 4 and UPF 5 obtain worse profitability than the standard greedy heuristic, but better budget utilization. This occurs since, when we increase the budget, the penalization part in the objective function of problem 2 becomes more important, which leads to the DSP bidding higher overall. In the case of first-price auctions, the cost of bidding higher is immediately reflected in all auctions won by the DSP. For second-price auctions, the cost of bidding higher is reflected only in those auctions with high market price which were won given the increase in the bid values.
Real Data Experiments
In this subsection, we present results of an experiment based on a real world dataset from the DSP Criteo (Diemert et al., 2017). The dataset was generated during one month of Criteo’s operation, and is composed of 16.5 M impression logs in which Criteo bid on behalf of 700 advertisers. This dataset was released to address the problem of conversion attribution modeling, while here adapt it for our purposes. In particular, we consider the “action of interest” in our model to be a click and ignore columns in the data corresponding to conversions. Criteo was bidding in a second-price market and the 16.5 M logs are all logs in which Criteo won the auction. Importantly, each impression log records the market price for that impression (the highest competing bid), nine categorical features of that impression, which campaign showed an ad, and if a click occurred or not. An ideal DSP dataset should also include the impressions lost by the DSP, but unfortunately a DSP has no access to the market price for those impressions. As is often standard in the literature (see Zhang et al., 2014 and others such as Ren et al., 2018), we assume that the distribution of winning logs is reasonably representative of the overall distribution of impression logs. In other words, we ignore the censored data problem. As long as Criteo’s bidding algorithm did a reasonably good job at acquiring impressions—which we believe it did—then the censored data problem is not a major issue for our purposes since our main interest is in showing the benefits of our framework for better trading off between profitability and budget utilization. Put another way, if we can demonstrate benefits of our methodology on Criteo’s winning logs, then such results provide evidence that our methodology would lead to similar benefits in a real operational environment for Criteo.
Let us now describe how we constructed the problem instance from the dataset. We split the data so that the first 3 weeks of data are used for training and the last week for testing. The (impression type and campaign) graph was constructed by first creating the impression types, then selecting the campaigns, and finally, choosing the edges. To create impression types, we used the CART algorithm (Breiman et al., 1984) with user clicks as labels and the Gini index as the impurity function. CART allows us to map each combination of the nine categorical features to a leaf in the CART tree, and each leaf corresponds to an impression type. Thus, CART provides a data-driven way to identify impression types such that impressions are grouped together in these impression types according to their click-through rates. Running CART requires tuning a “complexity parameter” which impacts the number of leaves of the tree. We used cross-validation to search for a complexity parameter with nearly optimal validation error while also producing a computationally treatable number of impression types (more about how CART was run in Appendix EC.7.3.1). We then filtered the campaigns to those that appeared in at least 200 impressions logs in both the train and test logs. We added an edge to only if there were at least 30 logs in which a selected campaign bid for an impression of type . Following the procedure described above, the constructed graph has 84 impression types, 649 campaigns, and 9,903 edges. We leave the details on how to obtain the campaigns’ budgets, bid landscapes, price paid per click, click-through rates, and other parameters to Appendix EC.7.3. In Appendix EC.7.3, we also explain how the simulator scheme shown in Appendix EC.7.1 was implemented for this experiment.
We run three experiments, each comparing our methodology with UPF 3, UPF 4, and UPF 5 to the generalized greedy policy described in Algorithm 3. In order to test robustness of our methodology, we tried three configurations for the price paid per click values . In the first experiment, for each campaign we multiplied the original values of by , in the second by , and in the third by . In all cases, we use the same budget values . Changing the price paid per click affects the spending of campaigns, thereby Algorithm 2 needs to re-calculate the allocation and bid vector to achieve desired trade off between profit and budget utilization. To obtain the curves shown in Figure 2, we tried different parameter values for UPF 4, UPF 5, and different values for Algorithm 3. Appendix EC.7.4 shows the confidence band plots showing results for all parameters tried.
Profit and budget utilization as we change the penalization parameter of UPF 4 and UPF 5, and the scaling parameter of Algorithm 3. The dots in the graph represent UPF 3, which has no parameters, and Algorithm 3 without scaling ().
Figure 2 again plots the total profit versus budget utilization for the four different methods. Figure 2 shows that UPF 4 and UPF 5 strictly dominate Algorithm 3 by achieving a higher profit at any budget utilization level. Furthermore, we see that this relationship holds across different budget utilization levels in the three plots. A primary reason why our methodology can dominate the generalized greedy policy is that it allows campaigns to wait for more favorable impressions types. The latter is especially important for campaigns with relative high revenue terms. When an impression arrives, our methodology can decide to allocate the impression to campaigns which do not maximize their profit, saving the budget for future opportunities.
As in the synthetic case, UPF 4 and UPF 5 perform differently from each other. Since UPF 4 only considers the target budget as a penalization term, it performs poorly for small penalization levels, which correspond to the left side of the curves for UPF 4 in the first two panels of Figure 2. In this case, the implied policies for UPF 4 overspend heavily for some campaigns, while insufficiently spending for many others. These policies obtain both low profit and low budget utilization levels as campaigns are not allowed to bid more than their target budgets. As the penalization value increases, UPF 4 improves in terms of both budget utilization and profit until the profit begins to go down, at which point the objective is simply placing too much weight on the penalty term. On the other hand, for small penalization values, UPF 5 behaves similarly to UPF 3 (they match when no penalization is present), and a clear trade off between profitability and budget utilization. Comparing UPF 4 and UPF 5 directly to each other, we see in the first two graphs that 5 is better at lower budget utilization levels and 4 is better at higher budget utilization levels. The fact that UPF 5 does not allow overspending, while UPF 4 does is the reason why the latter tends to work better for higher budget utilization levels. As in the synthetic case, employing different target budget values in the constraints than those used in the term in the objective should improve the performance of UPF 5.
Conclusion
We proposed an optimization formulation for the joint bidding and allocation problem faced by a DSP. Our approach allows the DSP to efficiently and effectively trade off between profitability and budget utilization for its advertisers and works for arbitrary auction types. Our optimization formulation is non-convex, but we employ a dual approach that leads to an efficient two-phase algorithm. We study general conditions under which there is zero duality gap and the algorithm converges to an optimal solution of the non-convex problem. Experimentally, we observe that our methodology allows a DSP to better trade off profitability and budget spending on both synthetic problems and on a problem using data from a real DSP.
Our paper focuses heavily on the properties of a static optimization model, which is most relevant for high volume advertisers and impression types, and we provide a provably efficient algorithm for solving this model. A natural extension of our model would allow for online adjustments as the environment changes, which is particularly important for new and smaller scale advertisers. Another related and important extension is to consider the effect of parameter learning during the real-time decision making process, and to develop an algorithm that balances exploitation (e.g., via our optimization model) with exploration.
Supplemental Material
sj-pdf-1-pao-10.1177_10591478251356002 - Supplemental material for Optimal Bidding, Allocation, and Budget Spending for a Demand-Side Platform With Generic Auctions
Supplemental material, sj-pdf-1-pao-10.1177_10591478251356002 for Optimal Bidding, Allocation, and Budget Spending for a Demand-Side Platform With Generic Auctions by Paul Grigas, Alfonso Lobos, ZhengWen and Kuang-Chih Lee in Production and Operations Management
Footnotes
Acknowledgments
The authors thank the editorial team for their thoughtful and constructive comments.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
PG acknowledges the support of the NSF AI Institute for Advances in Optimization, Award 2112533.
ORCID iDs
Paul Grigas
Zheng Wen
Supplemental Material
Supplemental material for this article is available online (doi: ).
How to cite this article
Grigas P, Lobos A, Wen Z and Lee K-C (2025) Optimal Bidding, Allocation, and Budget Spending for a Demand-Side Platform With Generic Auctions. Production and Operations Management xx(x): 1–19.
AggarwalGBadanidiyuruAMehtaA (2019) Autobidding with constraints. In: Web and Internet Economics: 15th International conference, WINE 2019, New York, NY, USA, December 10–12, 2019, Proceedings 15, pp.17–30. Springer.
4.
AvadhanulaVColini BaldeschiRLeonardiS, et al. (2021) Stochastic bandits for multi-platform budget optimization in online advertising. In: Proceedings of the web conference 2021, pp.2805–2817.
5.
BalseiroSDengYMaoJ, et al. (2021a) Robust auction design in the auto-bidding world. Advances in Neural Information Processing Systems34: 17777–17788.
6.
BalseiroSKroerCKumarR (2023) Contextual standard auctions with budgets: Revenue equivalence and efficiency guarantees. Management Science69(11): 6837–6854.
7.
BalseiroSLuHMirrokniV (2020a) Dual mirror descent for online allocation problems. In: International conference on machine learning, pp.613–628. PMLR.
8.
BalseiroSLuHMirrokniV (2020b) Dual mirror descent for online allocation problems. In: International conference on machine learning, pp.613–628. PMLR.
9.
BalseiroSLuHMirrokniV (2021b) Regularized online allocation problems: Fairness and beyond. In: International conference on machine learning, pp.630–639. PMLR.
10.
BalseiroSRBesbesOWeintraubGY (2015) Repeated auctions with budgets in ad exchanges: Approximations and design. Management Science61(4): 864–884.
11.
BalseiroSRFeldmanJMirrokniV, et al. (2014) Yield optimization of display advertising with ad exchange. Management Science60(12): 2886–2907.
12.
BalseiroSRGurY (2019) Learning in repeated auctions with budgets: Regret minimization and equilibrium. Management Science65(9): 3952–3968.
13.
BalseiroSRLuHMirrokniV (2022) The best of many worlds: Dual mirror descent for online allocation problems. Operations Research71(1): 101–119. DOI: https://doi.org/10.1287/opre.2021.2242.
14.
BhalgatAFeldmanJMirrokniV (2012) Online allocation of display ads with smooth delivery. In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pp.1213–1221.
15.
BoydSVandenbergheL (2004) Convex Optimization. New York: Cambridge University Press.
16.
BreimanLFriedmanJStoneCJ, et al. (1984) Classification and Regression Trees. CRC Press.
17.
ChenYBerkhinPAndersonB, et al. (2011) Real-time bidding algorithms for performance-based display ad allocation. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp.1307–1315. ACM.
18.
ConitzerVKroerCSodomkaE, et al. (2022) Multiplicative pacing equilibria in auction markets. Operations Research70(2): 963–989.
19.
CuiYZhangRLiW, et al. (2011) Bid landscape forecasting in online ad exchange marketplace. In: Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, pp.265–273. ACM.
20.
DengYGolrezaeiNJailletP, et al. (2023) Multi-channel autobidding with budget and roi constraints. In: International conference on machine learning, pp.7617–7644. PMLR.
21.
DiemertEMeynetJGallandP, et al. (2017) Attribution modeling increases efficiency of bidding in display advertising. In: Proceedings of the ADKDD’17, pp.1–6.
22.
GrigasPLobosAWenZ, et al. (2017) Profit maximization for online advertising demand-side platforms. In: Proceedings of the ADKDD’17, pp.11. ACM.
23.
GummadiRKeyPProutiereA (2012) Repeated auctions under budget constraints: Optimal bidding strategies and equilibria. The Eighth Ad Auction Workshop. Citeseer.
KlempererP (1999) Auction theory: A guide to the literature. Journal of Economic Surveys13(3): 227–286.
26.
LeeK-CJalaliADasdanA (2013) Real time bid optimization with smooth budget delivery in online advertising. In: Proceedings of the seventh international workshop on data mining for online advertising, pp.1. ACM.
27.
LiuHZhuMingruiMengXiaonan, et al. (2017) Dual based dsp bidding strategy and its application. arXiv preprint arXiv:1705.09416.
28.
MicchiGKhahSSTurnerJ (2020) A new optimization layer for real-time bidding advertising campaigns. Intelligent Data Analysis24(1): 199–224.
29.
NesterovY (2013) Introductory Lectures on Convex Optimization: A Basic Course. Switzerland: Springer Science & Business Media.
30.
ParikhNBoydS, et al. (2014) Proximal algorithms. Foundations and Trends in Optimization1(3): 127–239.
31.
RenKZhangWChangK, et al. (2018) Bidding machine: Learning to bid for directly optimizing profits in display advertising. IEEE Transactions on Knowledge and Data Engineering30(4): 645–659.
32.
RichardsonMDominowskaERagnoR (2007) Predicting clicks: Estimating the click-through rate for new ads. In: Proceedings of the 16th international conference on World Wide Web, pp.521–530. ACM.
33.
SusanFGolrezaeiNSchrijversO (2023) Multi-platform budget management in ad markets with non-ic auctions. arXiv preprint arXiv:2306.07352.
34.
VickreyW (1961) Counterspeculation, auctions, and competitive sealed tenders. The Journal of Finance16(1): 8–37.
35.
WangQYangZDengX, et al. (2023) Learning to bid in repeated first-price auctions with budgets. International conference on machine learning, pp.36494–36513. PMLR.
36.
WangYRenKZhangW, et al. (2016) Functional bid landscape forecasting for display advertising. In: Joint European conference on machine learning and knowledge discovery in databases, pp.115–131. Springer.
37.
XuJLeeK-CLiW, et al. (2015) Smart pacing for effective online ad campaign optimization. In: Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, pp.2217–2226. ACM.
38.
YangXLiYWangH, et al. (2019) Bid optimization by multivariable control in display advertising. In: Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pp.1966–1974.
39.
ZhangWYuanSWangJ (2014) Optimal real-time bidding for display advertising. In: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pp.1077–1086. ACM.
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.