
Abstract

Product returns are prevalent in practice. Many retailers offer lenient free return policies but specify a return window within which customers may return products. Motivated by this phenomenon, we consider a single-product online learning and pricing problem with stochastic product returns. A salient feature is that the demand function, which depends on both the price and the return window, is initially unknown and must be learned on the fly. The retailer thus faces the classic exploration–exploitation trade-off. Moreover, we impose an inventory constraint, which introduces an additional trade-off between earning revenue and managing inventory. We propose a modeling framework that integrates pricing and return window decisions, and develop a deterministic fluid model that serves as the full-information benchmark. To tackle the learning problem, we design a novel nonparametric learning algorithm that integrates inverse stochastic gradient descent (SGD) and Upper Confidence Bound (UCB) methods. Under mild assumptions on the demand and revenue functions, we establish a regret upper bound of $O(\sqrt{WT} \log T)$ for our learning algorithm, where $W$ denotes the number of return window candidates and $T$ denotes the time horizon. This result aligns with the lower bounds established in both the online pricing and the multi-armed bandit (MAB) literature. Numerical experiments verify the effectiveness and robustness of our algorithm across various environments. From an operational standpoint, retailers can use our learning framework as a decision-support tool to identify the optimal price and return window.
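To make the idea of pairing a bandit index over return windows with a gradient-based price update concrete, the following is a minimal Python sketch under strong simplifying assumptions. It is not the algorithm of the article: it ignores the inventory constraint, uses a plain two-point finite-difference gradient ascent on price in place of inverse SGD, and uses the textbook UCB1 index. The environment `toy_revenue`, the window candidates (7, 30, 60 days), the `appeal` values, and all step-size constants are hypothetical and chosen only for illustration.

```python
import math
import random


def toy_revenue(price, window, rng):
    """Hypothetical noisy net revenue per period; not taken from the article."""
    # Assumed appeal of each return window (in days), net of return losses.
    appeal = {7: 0.3, 30: 0.9, 60: 0.5}[window]
    return 4.0 * appeal * price * (1.0 - price) + rng.gauss(0.0, 0.02)


def ucb_index(mean, pulls, t):
    """UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return float("inf")  # force one initial pull of every window
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def learn(windows, revenue_fn, epochs, p_lo=0.0, p_hi=1.0, p_start=0.2, seed=0):
    """Pick a window by UCB each epoch, then take one gradient step on its price.

    Each epoch uses two selling periods: one at price + delta, one at price - delta.
    """
    rng = random.Random(seed)
    n_arms = len(windows)
    prices = [p_start] * n_arms  # one running price per window candidate
    pulls = [0] * n_arms
    means = [0.0] * n_arms       # running mean revenue per window
    for t in range(1, epochs + 1):
        k = max(range(n_arms), key=lambda i: ucb_index(means[i], pulls[i], t))
        pulls[k] += 1
        n = pulls[k]
        delta = 0.1 * (p_hi - p_lo) * n ** -0.25  # shrinking probe width (assumed)
        step = 0.2 / n                            # shrinking step size (assumed)
        # Keep both probe prices inside the feasible price interval.
        p = min(max(prices[k], p_lo + delta), p_hi - delta)
        r_up = revenue_fn(p + delta, windows[k], rng)
        r_down = revenue_fn(p - delta, windows[k], rng)
        grad = (r_up - r_down) / (2.0 * delta)    # finite-difference gradient estimate
        prices[k] = min(max(p + step * grad, p_lo), p_hi)
        means[k] += (0.5 * (r_up + r_down) - means[k]) / n
    return prices, pulls
```

Calling `learn([7, 30, 60], toy_revenue, epochs=2000)` returns the final price for each window and the number of epochs spent on each. In this toy environment the 30-day window has the highest achievable revenue, so the UCB step should concentrate most epochs on it while the gradient step moves its price toward the revenue-maximizing value of 0.5.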

Keywords

Dynamic Pricing Product Returns Demand Learning Return Window Revenue Management




Revenue Management With Nonparametric Demand Learning and Product Returns
