CMDRL: A cavity-aware deep reinforcement learning framework with spatiotemporal attention for 3D bin packing

Abstract

The three-dimensional bin packing problem (3D-BPP) is a classic optimization challenge in logistics. It aims to maximize space utilization and transportation efficiency by optimizing how items are arranged within containers. Conventional approaches typically rely on manually crafted heuristic rules; however, such rules often struggle to capture the rich three-dimensional spatial relationships between items and the container, leading to suboptimal packing solutions. Motivated by the success of deep reinforcement learning (DRL) in complex sequential decision-making, this paper proposes Cavity-Map-based Deep Reinforcement Learning (CMDRL) for 3D-BPP. First, we introduce a cavity-map representation of the packing state. By incorporating features such as the number of faces and proximity, the cavity map more precisely encodes 3D geometric relationships among items and between items and the container, addressing limitations in existing geometric representations. We then develop an enhanced spatiotemporal attention mechanism that dynamically fuses the temporal sequence of arriving items with the container’s evolving spatial layout, thereby improving both item selection and placement decisions. Experimental results demonstrate that our method consistently reduces gap ratios across diverse packing scenarios, while outperforming state-of-the-art DRL baselines in both efficiency and scalability. Ablation studies further confirm the contributions of the cavity map and the enhanced spatiotemporal attention mechanism, showing that their combination yields substantial improvements in packing performance. Overall, this work advances research on 3D packing and offers a practical solution for logistics and other complex spatial optimization problems.
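The abstract reports results as gap ratios but does not define the metric. As a minimal sketch, assuming the definition common in the DRL packing literature (the fraction of the occupied bin volume, measured up to the stacked height, that is not filled by items), it could be computed as below. The function name `gap_ratio` and its parameters are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Tuple

# One packed item as (length, width, height); hypothetical representation.
Box = Tuple[float, float, float]


def gap_ratio(items: Iterable[Box], bin_length: float, bin_width: float,
              stacked_height: float) -> float:
    """Return 1 - (total item volume) / (bin footprint * stacked height).

    Assumed definition, not confirmed by the source: `stacked_height` is the
    highest point reached by any packed item, so a lower value means a
    tighter packing.
    """
    if bin_length <= 0 or bin_width <= 0 or stacked_height <= 0:
        raise ValueError("bin dimensions and stacked height must be positive")
    # Sum the volume of every packed item.
    item_volume = sum(l * w * h for l, w, h in items)
    # Volume of the bin actually in use, up to the stacked height.
    used_volume = bin_length * bin_width * stacked_height
    return 1.0 - item_volume / used_volume
```

Under this assumed definition, a packing that fills the used volume completely has a gap ratio of 0, and a single unit cube in a 2 x 1 footprint stacked to height 1 has a gap ratio of 0.5.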

Keywords

3D-BPP, deep reinforcement learning, cavity map, spatiotemporal attention mechanism, gap ratio

