JOURNAL ARTICLE

Non-stochastic Budgeted Online Pricing with Semi-Bandit Feedback

Xiang Liu, Hau Chan, Minming Li, Weiwei Wu, Long Tran-Thanh

Year: 2025 Journal: Proceedings of the AAAI Conference on Artificial Intelligence Vol: 39 (18) Pages: 18978-18986 Publisher: Association for the Advancement of Artificial Intelligence

Abstract

We consider a general non-stochastic online pricing bandit setting in a procurement scenario: a buyer with a budget procures items from a fixed set of sellers and seeks to maximize its reward by dynamically offering purchase prices to the sellers. The sellers' costs and values can change arbitrarily in each time period, and each seller decides whether to accept the offered price and sell its item. This setting models online pricing scenarios for procuring resources or services in multi-agent systems. We first consider the offline setting, in which the sellers' costs and values are known in advance, and investigate the best fixed-price policy in hindsight. We show that it has a tight approximation guarantee with respect to the offline optimal solution. In the general online setting, we propose an online pricing policy, Granularity-based Pricing (GAP), which exploits the underlying side information from the feedback graph when the budget is given as input. We show that GAP achieves an upper bound of $O\left(n \frac{v_{\max}}{c_{\min}} \sqrt{B/c_{\min}} \ln B\right)$ on the $\alpha$-regret, where $n$ is the number of sellers, $v_{\max}$ the maximum value, $c_{\min}$ the minimum cost, and $B$ the budget. We then extend it to the unknown-budget case by developing a variant of GAP, namely Doubling-GAP, and show that its $\alpha$-regret is at most $O\left(n \frac{v_{\max}}{c_{\min}} \sqrt{B/c_{\min}} \ln^2 B\right)$. We also provide an $\alpha$-regret lower bound of $\Omega\left(v_{\max} \sqrt{Bn/c_{\min}}\right)$ for any online policy, which is tight up to sub-linear terms. We conduct simulation experiments to show that the proposed policy outperforms the baseline algorithms.
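To make the offline benchmark concrete, the following is a minimal sketch of a best-fixed-price search in hindsight. It is not the authors' procedure: it assumes a single common price offered to every seller in every round, that a seller accepts whenever its cost is at most the offered price, that the buyer pays the offered price, and that procurement stops once the remaining budget cannot cover an accepted offer. The function names `fixed_price_reward` and `best_fixed_price` are hypothetical.

```python
from typing import Sequence, Tuple


def fixed_price_reward(price: float,
                       costs: Sequence[Sequence[float]],
                       values: Sequence[Sequence[float]],
                       budget: float) -> float:
    """Total value collected by offering `price` to every seller in every round.

    costs[t][i] and values[t][i] are seller i's cost and value in round t.
    Assumed rules (not taken from the paper): a seller accepts iff
    cost <= price, the buyer pays `price`, and the process stops when
    the remaining budget cannot cover an accepted offer.
    """
    remaining = budget
    reward = 0.0
    for round_costs, round_values in zip(costs, values):
        for cost, value in zip(round_costs, round_values):
            if cost <= price:
                if remaining < price:
                    return reward  # budget exhausted
                remaining -= price
                reward += value
    return reward


def best_fixed_price(costs: Sequence[Sequence[float]],
                     values: Sequence[Sequence[float]],
                     budget: float) -> Tuple[float, float]:
    """Return (price, reward) for the best single fixed price in hindsight.

    Only observed costs are tried: between two consecutive observed costs
    the set of accepting sellers is unchanged, so a higher price in that
    gap only spends the budget faster.
    """
    candidates = sorted({c for round_costs in costs for c in round_costs})
    best = max(candidates,
               key=lambda p: fixed_price_reward(p, costs, values, budget))
    return best, fixed_price_reward(best, costs, values, budget)
```

Under these assumptions, with two rounds and two sellers, `costs = [[1.0, 3.0], [2.0, 1.0]]`, `values = [[2.0, 5.0], [4.0, 1.0]]`, and a budget of 4.0, the search selects the price 2.0: a lower price misses the high-value purchase in the second round, and a higher price exhausts the budget after one purchase.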

Keywords:
Computer science, Economics, Econometrics, Mathematical optimization, Mathematics

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.24

Topics

Advanced Bandit Algorithms Research
Social Sciences →  Decision Sciences →  Management Science and Operations Research
Auction Theory and Applications
Social Sciences →  Decision Sciences →  Management Science and Operations Research
Smart Grid Energy Management
Physical Sciences →  Engineering →  Electrical and Electronic Engineering

Related Documents

JOURNAL ARTICLE

Online Influence Maximization With Semi-Bandit Feedback Under Corruptions

Xiaotong Cheng, Behzad Nourani-Koliji, Setareh Maghsudi

Journal: IEEE Transactions on Network Science and Engineering Year: 2025 Vol: 12 (3) Pages: 2308-2321
JOURNAL ARTICLE

Distributed Online Stochastic-Constrained Convex Optimization With Bandit Feedback

Cong Wang, Shengyuan Xu, Deming Yuan

Journal: IEEE Transactions on Cybernetics Year: 2022 Vol: 54 (1) Pages: 63-75
JOURNAL ARTICLE

Online Second Price Auction with Semi-Bandit Feedback under the Non-Stationary Setting

Haoyu Zhao, Wei Chen

Journal: Proceedings of the AAAI Conference on Artificial Intelligence Year: 2020 Vol: 34 (04) Pages: 6893-6900
JOURNAL ARTICLE

Stochastic Convex Optimization with Bandit Feedback

Alekh Agarwal, Dean P. Foster, Daniel Hsu, Sham M. Kakade, Alexander Rakhlin

Journal: SIAM Journal on Optimization Year: 2013 Vol: 23 (1) Pages: 213-240