Abstract
We study a production planning problem in which the relationship between production plans and actual production quantities may be linear or nonlinear, and deterministic or stochastic. We first introduce a stochastic dynamic programming formulation of the problem under a Markovian assumption on the production relationships. Under specific conditions, we establish the convexity of the optimal cost-to-go function and derive closed-form optimal policies. To solve the original problem in the general case, we propose a solution framework based on sequential policy optimization and deep reinforcement learning. We discuss the theoretical properties of the framework and evaluate its numerical performance with linear and nonlinear production relationships. In the linear case, our framework performs in line with state-of-the-art optimization-based methods while improving computational efficiency. In the nonlinear case, it achieves an optimality gap of roughly 10%–20%. We also illustrate that the proposed methodology can be extended to joint production planning and scheduling.
