Paper Title
Adaptive Risk-Aware Bidding with Budget Constraint in Display Advertising
Paper Authors
Paper Abstract
Real-time bidding (RTB) has become a major paradigm of display advertising. Each ad impression generated by a user visit is auctioned in real time, and a demand-side platform (DSP) automatically submits a bid price, usually by estimating the value of the ad impression and then determining the optimal bid price. However, current bidding strategies overlook the large randomness of user behavior (e.g., clicks) and the cost uncertainty caused by auction competition. In this work, we explicitly factor in the uncertainty of estimated ad impression values and model the risk preference of a DSP under a specific state and market environment via a sequential decision process. Specifically, we propose a novel adaptive risk-aware bidding algorithm with budget constraint via reinforcement learning, which is the first to simultaneously consider estimation uncertainty and the dynamic risk tendency of a DSP. We theoretically unveil the intrinsic relation between the uncertainty and the risk tendency based on value at risk (VaR). Consequently, we propose two instantiations to model risk tendency: an expert knowledge-based formulation embracing three essential properties, and an adaptive learning method based on self-supervised reinforcement learning. We conduct extensive experiments on public datasets and show that the proposed framework outperforms state-of-the-art methods in practical settings.
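To give a rough feel for how value at risk can tie estimation uncertainty to a risk tendency, the sketch below treats the estimated impression value as a Gaussian and bids on one of its quantiles. This is an illustration only, not the paper's algorithm: the Gaussian assumption, the linear value-per-click mapping, the bid cap, and the names `var_adjusted_value` and `risk_aware_bid` are all hypothetical, and the reinforcement-learning and budget-pacing components described in the abstract are omitted.

```python
from statistics import NormalDist


def var_adjusted_value(mean: float, std: float, confidence: float) -> float:
    """Return the `confidence` quantile of a Gaussian estimate N(mean, std^2).

    Assumption (not from the paper): the estimate is Gaussian, so the
    quantile is mean + z * std, where z is the standard normal quantile.
    """
    z = NormalDist().inv_cdf(confidence)  # requires 0 < confidence < 1
    return mean + z * std


def risk_aware_bid(mean_ctr: float, std_ctr: float, confidence: float,
                   value_per_click: float, max_bid: float) -> float:
    """Hypothetical bid rule: price the quantile-adjusted click rate.

    confidence < 0.5 discounts uncertain impressions (risk-averse);
    confidence > 0.5 rewards them (risk-seeking); 0.5 ignores uncertainty.
    """
    adjusted_ctr = max(0.0, var_adjusted_value(mean_ctr, std_ctr, confidence))
    return min(max_bid, adjusted_ctr * value_per_click)


if __name__ == "__main__":
    # Same estimated click rate and uncertainty, three risk tendencies.
    for level in (0.2, 0.5, 0.8):
        bid = risk_aware_bid(0.010, 0.004, level,
                             value_per_click=50.0, max_bid=5.0)
        print(f"confidence={level:.1f} bid={bid:.4f}")
```

In this toy setting the confidence level plays the role of the risk tendency: an adaptive method would vary it with the remaining budget and market state instead of fixing it by hand.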