Abstract
Heavy-oil reservoirs operating under solution-gas drive may exhibit foamy-oil flow behavior, in which dispersed gas and delayed gas mobility enhance oil recovery beyond conventional expectations. However, predicting foamy-oil production remains challenging because of complex multiphase transport processes and strong sensitivity to operational conditions, particularly pressure depletion rate. To address this challenge, this study develops a simulation-informed machine-learning surrogate framework for rapid and interpretable prediction of foamy-oil production under controlled pressure depletion conditions. A calibrated thermal–compositional model was constructed in CMG-STARS using laboratory depletion experiments conducted in a 2-m sand-pack system. A simulation-based design-of-experiments (DOE) approach was then employed to generate datasets spanning realistic ranges of fluid properties, relative permeability characteristics, and foamy-oil kinetic parameters. Gradient-boosting machine-learning models were trained to reproduce key production responses, including oil rate, gas rate, gas–oil ratio, and cumulative recovery. The resulting surrogate models achieved high predictive accuracy, with coefficients of determination exceeding 0.95 and average prediction errors below 5%, while reducing computational time by several orders of magnitude compared with full-physics simulations. Explainable machine-learning analysis was further applied to quantify the relative importance of governing parameters. The results indicate that pressure depletion rate is the dominant control on production behavior, followed by gas liberation kinetics and critical gas saturation. The proposed framework demonstrates how simulation-informed surrogate modeling combined with explainable machine learning can provide both rapid prediction capability and transparent sensitivity analysis for complex foamy-oil production systems. 
The workflow therefore enables efficient scenario evaluation and provides a practical decision-support tool for forecasting and optimizing foamy-oil production strategies.
Keywords
Introduction
Heavy oil and extra-heavy oil reservoirs, commonly characterized by viscosities that can reach 10^5–10^6 cP, often exhibit production behavior that departs from conventional solution–gas-drive expectations (Sheng et al., 1999). In many such systems, gas does not immediately segregate into a continuous free-gas phase when pressure falls below the bubble point. Instead, gas may remain dispersed within the oil, forming what is widely termed foamy oil (Claridge and Prats, 1995; Maini, 2001). This dispersed-gas state is frequently associated with suppressed gas–oil ratios (GOR), delayed gas mobility, sustained oil production rates, and higher primary recovery factors than predicted by equilibrium two-phase flow theory (Smith, 1988; Maini and Busahmin, 2010; Claridge and Prats, 1995). Reviews and field observations have highlighted the prevalence of foamy-oil-like behavior in heavy-oil developments, particularly in Canadian operations, and emphasized that its governing mechanisms remain only partially resolved due to strong coupling among phase behavior, nucleation, coalescence, and porous-medium transport processes (Sheng et al., 1999; Claridge and Prats, 1995). A persistent challenge in foamy-oil research is the difficulty of translating qualitative descriptions—such as delayed gas breakout and dispersed gas flow—into quantitative predictive models that remain robust across operating conditions (Smith, 1988). Laboratory depletion tests using sand-pack systems provide a controllable environment to investigate this behavior because initial conditions, boundary conditions, and pressure depletion rates can be prescribed, allowing mechanistic hypotheses to be evaluated against measured oil and gas production responses (Busahmin et al., 2017a; Sheng et al., 1999). 
However, even in controlled experiments, the production response remains highly sensitive to both fluid properties and operational strategies, particularly the pressure decline (drawdown) rate, which can alter the balance between gas nucleation and retention versus the formation of a connected gas phase (Busahmin et al., 2017a). Physics-based numerical simulation therefore plays a critical role in integrating these coupled processes into a consistent predictive framework. Commercial reservoir simulators are commonly employed to history match depletion behavior by incorporating heavy-oil PVT descriptions, relative permeability formulations, and empirical or semi-empirical representations of dispersed gas and delayed gas mobility (Sheng et al., 1999). A representative example is the modeling of a long sand-pack heavy crude oil depletion experiment using methane gas, where a CMG-STARS-based workflow was applied to reproduce measured depletion responses and evaluate the impact of governing parameters (Busahmin et al., 2017b). Nevertheless, simulator-based history matching is typically computationally intensive and may suffer from parameter nonuniqueness. Two fluid and transport aspects are repeatedly implicated in foamy-oil behavior and therefore warrant explicit consideration in modeling studies: interfacial or surface tension effects and compressibility-related non-Darcy flow behavior. Surface and interfacial tension influence gas bubble nucleation, growth, and stability, thereby affecting the persistence of dispersed gas within the oil phase during depletion (Firoozabadi and Katz, 1979). Laboratory measurements conducted on mineral and crude oil systems under foamy conditions have demonstrated that surface-tension differences can be significant and should therefore be treated as important inputs in modeling and interpretation of foamy-oil flow behavior (Busahmin and Maini, 2019). 
In addition, foamy-oil systems may exhibit transport behavior that deviates from standard Darcy flow assumptions, where the compressibility of the dispersed gas–oil mixture contributes to anomalous flow responses and recovery trends. The compressibility-related parameter proposed for non-Darcy two-phase flow behavior provides a useful framework for interpreting prolonged foamy states and altered depletion dynamics under specific pressure-decline regimes (Busahmin and Maini, 2018). In this study, machine learning is not used as a replacement for physics-based simulation but rather as a simulation-informed surrogate modeling approach. The surrogate models are trained on datasets generated from calibrated numerical simulations that encode the relevant multiphase flow physics and gas-transition mechanisms. Consequently, the predictive capability of the surrogate model remains grounded in the physical assumptions embedded within the simulator while enabling significantly faster evaluation of alternative depletion scenarios. A particularly practical approach for reservoir and porous-media applications is the development of surrogate (emulator) models trained on design-of-experiments (DOE) datasets generated from validated numerical simulations. Such simulation-trained surrogate models have been widely applied in reservoir engineering to accelerate computational workflows while maintaining consistency with physics-based simulation results (Razavi et al., 2012). Once trained, these surrogate models can predict production responses—including oil rate, gas rate, cumulative recovery, GOR evolution, and pressure behavior—several orders of magnitude faster than full-physics numerical simulation. This capability enables rapid screening of pressure-depletion strategies and facilitates systematic uncertainty and sensitivity analyses. 
Despite extensive research on foamy-oil behavior in heavy-oil reservoirs, most previous studies have focused either on laboratory depletion experiments or on physics-based numerical simulations aimed at reproducing observed production responses. While these approaches have substantially improved the mechanistic understanding of dispersed-gas flow and delayed gas mobility, they remain computationally intensive and often provide limited transparency regarding the relative influence of governing parameters on production behavior. Consequently, there remains a need for computationally efficient approaches that preserve the physical fidelity of reservoir simulation while enabling rapid evaluation of production responses and parameter sensitivities. The present study addresses this gap by developing a simulation-informed machine-learning surrogate framework for predicting foamy-oil production behavior under controlled pressure depletion conditions. By combining calibrated numerical simulations with gradient-boosted surrogate models and explainable machine-learning analysis, the proposed methodology enables rapid prediction of production responses while simultaneously identifying the dominant operational and flow parameters controlling foamy-oil recovery. This integrated approach provides both computational efficiency and improved interpretability, offering a practical decision-support tool for evaluating pressure depletion strategies in heavy-oil reservoirs. However, a major barrier to the adoption of machine-learning models in reservoir engineering remains the perception of such models as “black boxes,” which can be problematic in development decision-making contexts that require physical justification and interpretability. This motivates the use of explainable machine-learning approaches that explicitly quantify the contribution of individual input parameters to model predictions (Lundberg and Lee, 2017). 
Among these approaches, Shapley Additive Explanations (SHAP) provides a theoretically grounded framework for attributing feature importance to predicted outputs by decomposing model predictions into additive contributions from each input variable. This enables parameter sensitivity to be evaluated both globally across the dataset and locally for specific operating conditions. Accordingly, this study proposes an integrated workflow for rapid prediction and parameter sensitivity analysis of foamy-oil production under controlled pressure depletion conditions. Building upon calibrated CMG-STARS modeling of long sand-pack depletion experiments (Busahmin et al., 2017b) and supported by laboratory evidence on surface-tension effects (Busahmin and Maini, 2019) and compressibility-related transport behavior (Busahmin and Maini, 2018), the proposed framework integrates physics-based simulation with machine-learning surrogate modeling and explainable machine-learning analysis. Specifically, the workflow (a) generates a structured simulation dataset spanning pressure depletion rates and key uncertain parameters, (b) trains a high-fidelity gradient-boosting surrogate model capable of reproducing simulator outputs with substantially reduced computational cost, and (c) applies explainable machine-learning techniques to identify and rank the dominant operational and physical parameters governing foamy-oil recovery. By enabling both rapid prediction and transparent interpretation of parameter influence, the proposed methodology provides an efficient decision-support tool for evaluating depletion strategies in heavy-oil reservoirs. Surrogate modeling and proxy-based approaches have been widely used in reservoir engineering to accelerate computationally intensive simulation workflows and support uncertainty analysis and optimization studies (Razavi et al., 2012; Mohaghegh, 2011; Chen and Guestrin, 2016; Friedman, 2001; Breiman, 2001; Hastie et al., 2009). 
These approaches have been successfully used to approximate complex multiphase flow simulations, enabling rapid evaluation of production strategies and sensitivity analysis while preserving consistency with physics-based models (Breiman, 2001; Hastie et al., 2009; Ribeiro et al., 2016). However, relatively few studies have applied simulation-informed machine-learning surrogates to analyze foamy-oil production behavior or to explicitly quantify parameter influence under different pressure depletion regimes. Pressure depletion strategy represents a controllable operational variable with direct implications for recovery efficiency and production planning in heavy-oil systems. By integrating physics-based simulation with explainable surrogate modeling, the proposed approach enhances prediction efficiency, parameter transparency, and decision support for energy production under solution–gas-drive conditions.
Methodology
Experimental basis and modeling strategy
The methodology adopted in this study integrates physics-based numerical simulation with simulation-informed machine-learning surrogate modeling to analyze foamy-oil production behavior under controlled pressure depletion rates. The workflow is grounded in long sand-pack depletion experiments representative of unconsolidated heavy-oil systems, in which pressure drawdown rate is treated as a key operational control variable. Physics-based numerical modeling is first employed to reproduce experimentally observed production responses using a calibrated reservoir simulation model. The resulting simulation outputs are then used to generate structured datasets through a design-of-experiments approach, which serve as training data for machine-learning surrogate models. These surrogate models enable rapid prediction of production responses and facilitate systematic parameter sensitivity analysis. Finally, explainable machine-learning techniques are applied to quantify the relative influence of governing parameters on production behavior and to improve model interpretability. Although the surrogate models are trained using sand-pack-based simulations, the proposed workflow itself is general and can be extended to field-scale applications provided that representative physics-based simulations are available for training.
Numerical model description
A one-dimensional numerical model representing a 2-m-long sand-pack was constructed using the CMG-STARS simulator. The model geometry, grid resolution, and boundary conditions were selected to replicate laboratory depletion tests conducted under solution-gas-drive conditions. The sand-pack was initialized at connate water saturation, with the remaining pore space saturated with heavy crude oil containing dissolved methane. Production was simulated by imposing a controlled pressure decline at the outlet while maintaining no-flow conditions at the inlet, consistent with depletion-test procedures. The numerical formulation accounts for multiphase flow of oil, gas, and water, incorporating heavy-oil PVT behavior and solution gas liberation. Foamy-oil behavior was represented using a dispersed-gas approach, in which liberated gas initially exists as a dispersed phase within the oil before transitioning to a free gas phase, with both transitions controlled through kinetic reaction terms.
Relative permeability and saturation functions
Three-phase relative permeability relationships were defined using Corey-type formulations. End-point relative permeabilities, critical saturations, and Corey exponents were selected based on experimental calibration and adjusted for each depletion rate to capture observed production trends. Table 1 summarizes the key relative permeability parameters and saturation endpoints used in the model. Connate water saturation was fixed at 0.08 for all cases, while critical water saturation and irreducible oil saturation were maintained at 0.10 and 0.30, respectively. Oil relative permeability at connate water decreased systematically with declining depletion rate, reflecting increased flow resistance and enhanced gas retention under slower drawdown conditions. Gas relative permeability at residual oil was kept low, consistent with delayed gas mobility typically observed in foamy oil systems. The Corey exponents for water and oil phases were fixed at 2, while the gas exponent was increased to 3 for intermediate depletion rates to better reproduce delayed gas breakthrough behavior.
Table 1. Input parameters for the crude oil system used in the numerical model under different pressure depletion rates.
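The Corey-type formulation described above can be illustrated with a minimal sketch. The saturation endpoints (0.08, 0.10, 0.30) and exponents (2 for water and oil, 3 for gas) are taken from the text; the endpoint relative permeabilities and the critical gas saturation are hypothetical placeholders, since the calibrated Table 1 values are not reproduced here, and the function names are illustrative only.

```python
def corey_kr(s, s_crit, s_max, kr_end, exponent):
    """Corey-type relative permeability of one phase.

    s       : phase saturation
    s_crit  : saturation below which the phase is immobile
    s_max   : saturation at which kr reaches its endpoint value
    kr_end  : endpoint relative permeability
    exponent: Corey exponent
    """
    if s_max <= s_crit:
        raise ValueError("s_max must exceed s_crit")
    s_norm = (s - s_crit) / (s_max - s_crit)
    s_norm = min(1.0, max(0.0, s_norm))  # clamp to the mobile range
    return kr_end * s_norm ** exponent


# Values stated in the text
SWC = 0.08      # connate water saturation
SWCRIT = 0.10   # critical water saturation
SOR = 0.30      # irreducible oil saturation
N_W = 2.0       # Corey exponent, water
N_O = 2.0       # Corey exponent, oil
N_G = 3.0       # Corey exponent, gas (intermediate depletion rates)

# Hypothetical placeholders (assumptions, not the calibrated values)
KRW_END = 0.10
KRO_END = 0.80
KRG_END = 0.05
SGC = 0.05      # critical gas saturation

SG_MAX = 1.0 - SWC - SOR  # largest gas saturation reachable in this sketch


def krw(sw):
    return corey_kr(sw, SWCRIT, 1.0 - SOR, KRW_END, N_W)


def kro(so):
    return corey_kr(so, SOR, 1.0 - SWC, KRO_END, N_O)


def krg(sg):
    return corey_kr(sg, SGC, SG_MAX, KRG_END, N_G)
```

In this sketch, raising `SGC` or `N_G` keeps `krg` close to zero over a wider saturation range, which is the mechanism by which these parameters represent delayed gas mobility.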
Foamy oil representation and kinetic parameters
Foamy oil behavior was modeled using a kinetic transition mechanism between solution gas, dispersed gas, and free gas phases. Two reaction factors were introduced: reaction factor 1, governing the conversion of solution gas into dispersed gas, and reaction factor 2, governing the conversion of dispersed gas into free gas. These parameters effectively control the persistence of dispersed gas within the oil phase and the timing of gas coalescence. Higher reaction factor values promote faster gas-phase transition, while lower values enhance foamy oil stability. As shown in Table 1, reaction factors were varied systematically with depletion rate. Faster depletion rates required higher solution-to-dispersed gas conversion factors to match rapid gas nucleation, while slower depletion rates exhibited reduced transition rates, consistent with prolonged gas dispersion and suppressed GOR observed experimentally.
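The role of the two reaction factors can be illustrated with a simplified sketch that treats both transitions as sequential first-order reactions (solution gas to dispersed gas at rate `k1`, dispersed gas to free gas at rate `k2`). This is an assumption made for illustration: it ignores the pressure-dependent equilibrium that limits how much gas can leave solution, and it is not the reaction formulation used internally by the simulator. The rate values in the usage note are hypothetical.

```python
import math


def gas_fractions(t, k1, k2, s0=1.0):
    """Fractions of gas in solution, dispersed, and free form at time t.

    Assumes two sequential first-order reactions with constant rates:
        solution --k1--> dispersed --k2--> free
    and that all gas (amount s0) starts in solution.
    """
    solution = s0 * math.exp(-k1 * t)
    if math.isclose(k1, k2):
        # Limiting form of the two-step solution when the rates coincide
        dispersed = s0 * k1 * t * math.exp(-k1 * t)
    else:
        dispersed = s0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    free = s0 - solution - dispersed  # mass balance
    return solution, dispersed, free
```

For example, comparing `gas_fractions(10.0, 0.5, 0.01)` with `gas_fractions(10.0, 0.5, 0.1)` shows that a lower `k2` leaves a larger dispersed fraction at the same time, which corresponds to the more persistent foamy state described above.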
Pressure depletion scenarios
Four pressure depletion rates—0.434, 0.226, 0.048, and 0.023 psi/min—were simulated to span rapid to quasi-static depletion regimes. These rates were selected to investigate the sensitivity of foamy oil behavior to operational drawdown strategy. Each depletion scenario was simulated independently, and production responses including oil rate, gas rate, cumulative recovery, GOR evolution, and pressure response were recorded. History matching was performed by adjusting relative permeability endpoints and kinetic reaction factors within physically reasonable bounds until satisfactory agreement with experimental production data was achieved. The objective function minimized the weighted error between simulated and measured oil rate, gas rate, and pressure profiles.
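A history-matching objective of the kind described above might be sketched as follows. The text states only that a weighted error between simulated and measured oil rate, gas rate, and pressure was minimized; the normalized mean-squared form, the dictionary keys, and the weights used here are assumptions.

```python
def weighted_misfit(simulated, measured, weights):
    """Weighted sum of normalized mean-squared errors over several responses.

    simulated, measured: dicts mapping a response name to a list of values
                         sampled at the same times
    weights            : dict mapping a response name to its weight
    Each response is scaled by its largest measured magnitude so that oil
    rate, gas rate, and pressure contribute on comparable scales.
    """
    total = 0.0
    for name, weight in weights.items():
        sim, obs = simulated[name], measured[name]
        if len(sim) != len(obs) or not obs:
            raise ValueError(f"length mismatch or empty series for {name!r}")
        scale = max(abs(v) for v in obs) or 1.0
        mse = sum(((s - o) / scale) ** 2 for s, o in zip(sim, obs)) / len(obs)
        total += weight * mse
    return total
```

A call such as `weighted_misfit(sim, obs, {"oil_rate": 1.0, "gas_rate": 1.0, "pressure": 0.5})` returns zero for a perfect match and grows as any of the three profiles departs from the measurements.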
Simulation dataset generation for machine learning
Following successful history matching, a design-of-experiments (DOE) approach was implemented to generate a structured simulation dataset for machine learning training. Key uncertain parameters—including depletion rate, oil and gas relative permeability endpoints, critical gas saturation, gas Corey exponent, and kinetic reaction factors—were sampled within bounded ranges derived from the calibrated models. Each DOE realization was simulated using CMG-STARS, producing a database of input–output pairs. Inputs consisted of operational and model parameters, while outputs included time-dependent oil and gas production responses and cumulative recovery metrics.
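The sampling step can be sketched with a simple Latin hypercube design. The text does not specify which DOE scheme was used, so the choice of Latin hypercube sampling is an assumption. The depletion-rate bounds follow the four rates simulated in this study; all other bounds and parameter names are hypothetical placeholders.

```python
import random


def latin_hypercube(bounds, n_samples, seed=0):
    """Draw a Latin hypercube sample.

    bounds   : dict mapping a parameter name to (low, high)
    n_samples: number of DOE realizations
    Each parameter range is divided into n_samples equal strata, one point
    is drawn per stratum, and the strata are shuffled independently for
    each parameter.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        points = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(points)
        columns[name] = [low + p * (high - low) for p in points]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


# Depletion-rate range spans the four simulated rates; other bounds are
# hypothetical placeholders, not the ranges used in the study.
BOUNDS = {
    "depletion_rate_psi_min": (0.023, 0.434),
    "kro_endpoint": (0.5, 0.9),
    "krg_endpoint": (0.01, 0.10),
    "critical_gas_saturation": (0.02, 0.15),
    "gas_corey_exponent": (2.0, 3.0),
    "reaction_factor_1": (0.001, 0.1),
    "reaction_factor_2": (0.0001, 0.01),
}

design = latin_hypercube(BOUNDS, n_samples=50)
```

Each dictionary in `design` would correspond to one simulator run, and the resulting outputs would be paired with these inputs to form the training database.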
Machine learning surrogate modeling
The generated simulation dataset was used to train machine-learning surrogate models capable of reproducing CMG-STARS outputs at a fraction of the computational cost. Gradient-boosted decision tree algorithms were adopted due to their robustness in capturing nonlinear relationships and handling mixed-scale input parameters. The generated dataset was randomly divided into training, validation, and testing subsets to ensure reliable model development and evaluation. Approximately 70% of the simulation cases were used for model training, 15% for validation during hyperparameter tuning, and the remaining 15% were reserved as an independent testing dataset. The validation dataset was used to monitor model performance during training and guide hyperparameter selection, while the testing dataset was used exclusively for final model performance assessment. The gradient-boosted decision tree models were implemented within a supervised learning framework in which hyperparameters were optimized using grid-search cross-validation. Key hyperparameters, including learning rate, maximum tree depth, number of boosting iterations, and subsampling ratio, were systematically varied to identify configurations that minimized prediction error on the validation dataset. This procedure ensured that the surrogate models captured the nonlinear relationships present in the simulation dataset while avoiding excessive model complexity. Several strategies were adopted to minimize the risk of model overfitting. Early stopping criteria were applied during training by monitoring validation error, and model complexity was controlled through limits on tree depth and learning rate. Model performance was then evaluated using the independent testing dataset to verify that predictive accuracy was maintained outside the training subset. 
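The 70/15/15 partition and the hyperparameter grid can be sketched as below. The split fractions and the four tuned hyperparameters come from the text; the candidate values in the grid, the random seed, and the function names are hypothetical.

```python
import random
from itertools import product


def split_cases(cases, train_frac=0.70, val_frac=0.15, seed=42):
    """Randomly partition simulation cases into train, validation, and test sets."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder is held out for final assessment
    return train, val, test


# Hypothetical candidate values; the grid used in the study is not reported.
HYPERPARAMETER_GRID = {
    "learning_rate": [0.01, 0.05, 0.1],
    "max_depth": [3, 5],
    "n_estimators": [200, 500],
    "subsample": [0.7, 1.0],
}


def grid_configurations(grid):
    """Enumerate every combination of hyperparameter values in the grid."""
    names = list(grid)
    return [dict(zip(names, values)) for values in product(*(grid[n] for n in names))]
```

Each configuration returned by `grid_configurations` would be fitted on the training subset and scored on the validation subset, with the test subset left untouched until the final evaluation.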
The close agreement between surrogate predictions and numerical simulation results across the evaluated depletion scenarios indicates that the trained models generalize well within the parameter ranges represented in the simulation dataset. Separate surrogate models were developed for key production responses, including cumulative oil recovery, peak oil rate, gas breakthrough time, and GOR evolution. Model performance was quantified using the coefficient of determination (R²) and root-mean-square error (RMSE). These metrics allow direct comparison between surrogate predictions and numerical simulation outputs.
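For reference, the two performance metrics can be computed as in the following sketch, which uses the standard definitions; the function names are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between simulator outputs and surrogate predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

Here `y_true` would hold the simulator outputs for the held-out test cases and `y_pred` the corresponding surrogate predictions.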
Unlike reduced-order proxy models that rely on fixed functional forms or simplified parameterizations, the proposed surrogate framework captures nonlinear parameter interactions while enabling explicit sensitivity ranking through explainable machine-learning analysis. It should be noted that the machine-learning models used in this study do not explicitly embed physical governing equations within the learning algorithm. Instead, physical consistency is preserved indirectly because the training dataset is generated from a calibrated physics-based reservoir simulator. Consequently, the surrogate model inherits the physical relationships encoded in the simulator while providing significantly faster predictions once trained. Input features used for surrogate training were selected based on their physical relevance to foamy-oil production behavior and their role in the calibrated numerical model. These features include operational parameters (pressure depletion rate), relative permeability endpoints, critical gas saturation, Corey exponents, and kinetic gas-transition parameters. Feature importance was subsequently evaluated using SHAP-based sensitivity analysis, allowing redundant or low-impact parameters to be identified during model interpretation.
Explainable machine learning and sensitivity analysis
To ensure interpretability and engineering relevance, explainable machine-learning techniques were applied to the trained surrogate models. SHAP analysis was employed to quantify the contribution of each input parameter to model predictions, both globally across the dataset and locally for individual pressure depletion scenarios. This approach enabled identification of dominant controls on foamy-oil recovery and clarified how parameter importance evolves with depletion rate. In particular, the relative influence of kinetic gas-transition parameters versus relative permeability characteristics was examined to distinguish mechanisms governing early-time production enhancement from those controlling long-term recovery behavior. SHAP values were computed for each output variable and aggregated across depletion-rate scenarios to quantify both global parameter importance and depletion-rate-dependent sensitivity trends.
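The additive attribution underlying SHAP can be illustrated with a brute-force Shapley computation on a toy model. This is a sketch only: the text does not state which SHAP implementation was used, the enumeration below is practical only for a handful of features, and a single baseline point stands in for the background dataset that SHAP normally averages over. The toy surrogate and its coefficients are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    model   : callable taking a dict of feature values
    x       : dict of feature values for the case being explained
    baseline: dict of reference feature values
    Features outside a coalition are set to their baseline values.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        point = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(point)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


def toy_surrogate(p):
    """Hypothetical stand-in for a trained surrogate (not the study's model)."""
    return 2.0 * p["a"] + 3.0 * p["b"] + p["a"] * p["b"]


phi = shapley_values(toy_surrogate, {"a": 1.0, "b": 1.0}, {"a": 0.0, "b": 0.0})
```

The contributions in `phi` sum to the difference between the prediction at the explained point and the prediction at the baseline, and the interaction term is shared equally between the two features.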
Workflow summary
The overall methodology follows a hierarchical and integrated framework in which physics-based numerical simulation is first employed to reproduce experimentally observed foamy-oil behavior under controlled pressure depletion conditions. A structured parameter sampling strategy is then used to generate a simulation-derived dataset that captures the combined effects of operational, fluid, and flow-related uncertainties. Based on this dataset, surrogate machine-learning models are trained to enable rapid prediction of production responses across a wide range of depletion scenarios. Finally, explainable machine-learning techniques are applied to quantify parameter sensitivity and to identify the dominant controls governing foamy-oil production behavior. This integrated workflow preserves the physical rigor required for reservoir development applications while enabling fast and interpretable evaluation of foamy-oil production under controlled pressure depletion strategies. In this framework, the surrogate models act as simulation-trained proxies that approximate the response of the physics-based simulator rather than replacing the underlying physical model.
Results and discussion
Effect of pressure depletion rate on foamy oil production behavior
The numerical simulations demonstrate a strong dependence of foamy-oil production behavior on the imposed pressure depletion rate (Figures 1–4). At the highest depletion rate of 0.434 psi/min, oil production is initially high due to rapid pressure reduction; however, the accelerated pressure decline promotes early gas liberation and a faster transition from dispersed gas to a continuous free-gas phase. This behavior is reflected by an earlier increase in GOR and a faster decline in oil production rate. Although dispersed gas formation is initiated, the high drawdown rate limits the residence time of gas bubbles within the oil phase, reducing the persistence of the foamy-oil state. When the depletion rate is reduced to 0.226 psi/min, a noticeable delay in gas breakthrough occurs. Oil production remains sustained over a longer period and is accompanied by suppressed GOR values, indicating enhanced stability of dispersed gas within the oil phase. Under these conditions, gas nucleation and dispersion continue while the transition to a continuous gas phase is delayed, allowing the oil phase to retain entrained gas and maintain effective mobility. At the lowest depletion rates of 0.048 and 0.023 psi/min, foamy-oil behavior becomes more pronounced. Gas liberation occurs gradually, and the dispersed gas phase persists over a longer depletion interval. The simulations show delayed gas production, sustained oil rates, and consistently lower GOR throughout most of the depletion process. These results indicate that slower pressure drawdown favors the development and maintenance of foamy-oil flow by limiting bubble coalescence and delaying the formation of a continuous gas pathway. Overall, the results highlight the fundamental role of depletion rate in controlling the balance between gas nucleation, dispersion, and coalescence during solution-gas-drive production. 
Under slow pressure depletion conditions, the dispersed gas phase remains stable for longer periods, allowing gas to remain entrained within the oil phase and delaying the formation of a continuous gas pathway. This mechanism explains the observed suppression of GOR and the sustained oil production associated with foamy-oil flow. The simulated trends are consistent with experimental observations reported in heavy-oil sand-pack depletion studies, where slower pressure decline rates typically result in improved recovery efficiency during primary production.

Figure 1. Pressure depletion profiles used in the numerical simulations for four controlled pressure decline rates.

Figure 2. Oil production rate profiles under different controlled pressure depletion rates obtained from the numerical simulations.

Figure 3. Gas production rate profiles under different controlled pressure depletion rates obtained from the numerical simulations.

Figure 4. Gas–oil ratio evolution under different controlled pressure depletion rates obtained from the numerical simulations.
The corresponding oil production responses for the different depletion rates are shown in Figure 2, where slower pressure drawdown results in sustained oil rates over extended depletion periods.
The corresponding gas production responses are shown in Figure 3, where faster pressure depletion results in early gas breakthrough, while slower drawdown significantly delays gas production due to sustained dispersed gas retention within the oil phase.
The evolution of GOR shown in Figure 4 further confirms that slower pressure drawdown suppresses gas mobility and delays gas breakthrough, which is a defining characteristic of foamy oil flow.
The cumulative oil recovery trends shown in Figure 5 indicate that slower pressure depletion enhances ultimate recovery by maintaining foamy oil flow over an extended depletion interval.

Figure 5. Cumulative oil production profiles under different controlled pressure depletion rates obtained from the numerical simulations.
Role of relative permeability and saturation parameters
The relative sensitivity of key model parameters influencing foamy-oil production behavior is summarized in Figure 6. The results indicate that pressure depletion rate is the dominant controlling factor, exerting the strongest influence on oil recovery and gas production response under solution-gas drive. This finding highlights the importance of operational strategy, as pressure drawdown directly affects the kinetics of gas liberation, bubble growth, and coalescence, thereby controlling the transition from dispersed gas to free gas and the onset of gas mobility.
The SHAP-based sensitivity analysis provides both global and scenario-dependent interpretation of parameter influence. The global ranking shown in Figure 6 represents the average contribution of each parameter across the entire simulation dataset, allowing dominant controls on production behavior to be identified. However, the relative importance of parameters varies across depletion regimes. Under rapid depletion conditions, pressure drawdown rate dominates the production response due to accelerated gas liberation. In contrast, under slower depletion conditions, the influence of kinetic gas-transition parameters becomes more pronounced because the persistence of dispersed gas governs long-term recovery performance.
Among the model parameters, the kinetic gas-transition terms exhibit the next highest sensitivity after depletion rate. Reaction factor 1, which governs the conversion of solution gas to dispersed gas, primarily influences early-time production behavior by controlling the rate at which gas nucleates within the oil phase. Higher values of this parameter accelerate gas liberation and promote earlier gas mobility, whereas lower values favor gradual gas release and prolonged dispersed-gas retention. Reaction factor 2, associated with the conversion of dispersed gas to free gas, mainly affects late-time production behavior: lower values delay gas coalescence and suppress free gas flow. The sensitivity ranking therefore confirms that these kinetic parameters are essential for reproducing key characteristics of foamy-oil behavior, including delayed gas breakthrough and suppressed GOR.
Critical gas saturation also emerges as an important parameter, particularly under slower depletion conditions. An increase in critical gas saturation raises the threshold required for gas mobility, effectively extending the foamy-oil regime and delaying gas production. This parameter acts as a macroscopic representation of gas dispersion and connectivity effects within the porous medium and reinforces the nonequilibrium nature of foamy-oil flow. In contrast, relative permeability parameters, including oil relative permeability at connate water and the gas relative permeability exponent, exhibit comparatively lower sensitivity. While these parameters influence phase mobility and flow resistance, their impact is secondary to depletion rate and gas-transition kinetics within the range of conditions examined.
The sensitivity analysis also reveals a shift in controlling mechanisms during depletion. Early-time production behavior is primarily governed by pressure depletion rate and solution-to-dispersed gas conversion kinetics, whereas late-time recovery becomes increasingly influenced by parameters controlling gas mobility and coalescence. This transition helps explain why conventional history-matching approaches often encounter parameter nonuniqueness, as multiple parameter combinations may reproduce similar production profiles without uniquely representing the underlying physical processes.
From a reservoir development perspective, the sensitivity results emphasize that optimization of pressure depletion represents the most effective operational lever for improving foamy-oil performance. While accurate representation of gas kinetics and saturation thresholds remains important for reliable forecasting, the dominant role of depletion rate indicates that development decisions related to drawdown control can significantly influence recovery efficiency. The explainable machine-learning framework therefore provides both mechanistic insight and practical guidance for evaluating pressure depletion strategies in heavy-oil reservoirs operating under solution-gas drive.

Figure 6. Global sensitivity ranking of model parameters based on normalized SHAP importance values derived from the machine-learning surrogate model.
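As an illustration of how a normalized global ranking of this kind can be obtained, the following minimal sketch computes exact Shapley values by enumerating feature coalitions and then normalizes the mean absolute attribution per parameter. It is not the workflow used in the study: the response function `toy_response`, its coefficients, the baseline point, and the sample points are all hypothetical placeholders standing in for the trained surrogate and the simulation dataset, and brute-force enumeration is used here only because it is transparent for three inputs.

```python
from itertools import combinations
from math import factorial

# Hypothetical parameter names, scaled to [0, 1] for illustration only.
FEATURES = ["depletion_rate", "reaction_factor_1", "critical_gas_saturation"]

def toy_response(x):
    """Hypothetical stand-in for a surrogate prediction (not the study's model)."""
    rate, rf1, sgc = x
    return -4.0 * rate - 1.5 * rf1 + 1.0 * sgc - 0.5 * rate * rf1

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a single baseline point."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features in the coalition take their value from x, the rest from the baseline.
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi

def normalized_importance(f, samples, baseline):
    """Mean absolute Shapley value per feature, scaled so the values sum to 1."""
    totals = [0.0] * len(baseline)
    for x in samples:
        for i, p in enumerate(shapley_values(f, x, baseline)):
            totals[i] += abs(p)
    grand = sum(totals)
    return [t / grand for t in totals]

# Placeholder inputs chosen only to exercise the functions.
baseline = [0.5, 0.5, 0.5]
samples = [[0.1, 0.9, 0.3], [0.9, 0.2, 0.7], [0.6, 0.6, 0.1], [0.3, 0.4, 0.9]]

importance = normalized_importance(toy_response, samples, baseline)
ranking = sorted(zip(FEATURES, importance), key=lambda kv: kv[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

For a real surrogate with many inputs, coalition enumeration scales exponentially, so a tree-specific SHAP implementation would normally replace `shapley_values`; the normalization step would stay the same.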
The variation of parameter sensitivity with depletion rate is illustrated in Figure 7, which shows that the relative importance of model parameters changes across depletion regimes. Under slower pressure depletion conditions, gas transition kinetics become increasingly dominant because the persistence of dispersed gas governs long-term recovery behavior. As depletion rate increases, the influence of kinetic parameters gradually decreases, while the relative permeability parameters and critical gas saturation become slightly more influential due to earlier gas liberation and increased gas mobility. These results indicate that the controlling mechanisms of foamy-oil production evolve during depletion, with kinetic gas-transition processes governing slow depletion scenarios and mobility-related parameters becoming more relevant under faster drawdown conditions.

Figure 7. Variation of normalized sensitivity indices for key model parameters as a function of pressure depletion rate based on SHAP analysis of the machine-learning surrogate model.
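A regime-dependent view of this kind can be sketched by grouping per-run attributions by depletion rate and normalizing within each group. The example below is only a sketch under stated assumptions: the two-regime split, the threshold, the parameter names, and every number in `records` are made-up placeholders used to exercise the function, not values from the study.

```python
from collections import defaultdict

def regime_importance(records, rate_threshold):
    """Normalized mean |attribution| per parameter for slow and fast depletion runs.

    records: list of (depletion_rate, {parameter_name: attribution}) pairs.
    Runs with depletion_rate below rate_threshold are labeled "slow", others "fast".
    """
    sums = defaultdict(lambda: defaultdict(float))
    for rate, attributions in records:
        regime = "slow" if rate < rate_threshold else "fast"
        for name, value in attributions.items():
            sums[regime][name] += abs(value)
    result = {}
    for regime, totals in sums.items():
        grand = sum(totals.values())
        if grand == 0.0:
            continue  # no attribution signal in this regime, nothing to normalize
        result[regime] = {name: total / grand for name, total in totals.items()}
    return result

# Placeholder attributions (hypothetical, dimensionless, not results of the study).
records = [
    (0.2, {"depletion_rate": 0.30, "reaction_factor_1": -0.45, "critical_gas_saturation": 0.25}),
    (0.3, {"depletion_rate": -0.35, "reaction_factor_1": 0.40, "critical_gas_saturation": -0.25}),
    (0.8, {"depletion_rate": 0.70, "reaction_factor_1": -0.20, "critical_gas_saturation": 0.10}),
    (0.9, {"depletion_rate": -0.60, "reaction_factor_1": 0.25, "critical_gas_saturation": -0.15}),
]

by_regime = regime_importance(records, rate_threshold=0.5)
for regime, shares in by_regime.items():
    print(regime, {name: round(share, 3) for name, share in shares.items()})
```

Because every run carries the same set of parameters, dividing summed absolute attributions by their total is equivalent to normalizing the per-parameter means.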
Kinetic gas-transition behavior and its impact on foamy-oil flow
The kinetic transition of gas between solution, dispersed, and free phases plays a central role in governing foamy-oil behavior under solution-gas drive. In the present study, this behavior is represented through two reaction parameters controlling the conversion of solution gas to dispersed gas and the subsequent transition of dispersed gas to free gas. The calibrated values of these parameters, summarized in Table 1, vary systematically with pressure depletion rate and provide quantitative insight into the stability of foamy-oil flow across different depletion regimes.
At high pressure depletion rates, rapid pressure reduction promotes accelerated gas nucleation and bubble growth within the oil phase. This behavior is reflected by relatively higher values of the solution-to-dispersed gas reaction factor, indicating faster liberation of gas from solution. However, the accompanying increase in the dispersed-to-free gas reaction factor leads to rapid gas coalescence and early formation of a connected gas phase. As a result, the dispersed gas state is short-lived, gas mobility increases early, and the foamy-oil regime collapses quickly. This kinetic response explains the early gas breakthrough and elevated GOR observed under rapid drawdown conditions.
As the depletion rate is reduced, both kinetic reaction factors decrease, indicating a slower gas evolution process. The reduced solution-to-dispersed gas conversion rate limits rapid bubble formation, while the lower dispersed-to-free gas conversion rate delays bubble coalescence and suppresses the development of a continuous gas phase. This combination favors prolonged gas dispersion within the oil phase and enhances the persistence of foamy-oil flow. The sensitivity trends shown in Figures 6 and 7 confirm that the influence of kinetic parameters becomes increasingly dominant at lower depletion rates, underscoring the importance of gas-transition kinetics in controlling recovery performance under slow drawdown.
At the lowest depletion rates examined, the kinetic parameters reach values that promote gradual gas release and extended retention of dispersed gas throughout most of the depletion process. Under these conditions, gas remains largely immobile despite ongoing pressure reduction, allowing the oil phase to retain dissolved and dispersed gas and maintain effective mobility. The delayed transition to free gas flow results in sustained oil production, suppressed GOR, and enhanced cumulative recovery. These observations are consistent with experimental sand-pack studies and provide a mechanistic explanation for the improved recovery efficiency associated with slow pressure depletion in heavy-oil systems.
The kinetic gas-transition behavior also interacts with saturation-related parameters, particularly critical gas saturation. As the transition from dispersed to free gas is delayed, higher gas saturations are required before gas becomes mobile, effectively shifting the onset of two-phase flow. This interaction reinforces the nonequilibrium nature of foamy-oil systems and highlights the limitations of conventional equilibrium-based flow models when applied to heavy-oil reservoirs under solution-gas drive.
From a modeling perspective, the results demonstrate that kinetic gas-transition parameters cannot be treated as fixed properties but must be adjusted in accordance with operating conditions, especially depletion rate. Failure to account for this dependence may lead to inaccurate predictions of gas breakthrough timing and recovery efficiency. From a reservoir development standpoint, the findings emphasize that managing pressure depletion rate provides indirect control over gas kinetics and foamy-oil stability, offering a practical lever to enhance primary recovery in heavy-oil reservoirs.
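The qualitative effect of the two reaction factors can be illustrated with a deliberately simplified two-step chain, solution gas S to dispersed gas D to free gas F, written as dS/dt = -k1 S, dD/dt = k1 S - k2 D, dF/dt = k2 D. This first-order, constant-coefficient form is an assumption made only for illustration; it is not the reaction formulation of the simulator used in the study, and the rate constants `k1` and `k2` and their numerical values below are hypothetical. The sketch uses the closed-form solution to show that lowering both constants postpones the peak in dispersed gas and the build-up of free gas.

```python
from math import exp, log

def gas_fractions(t, k1, k2, s0=1.0):
    """Return (solution, dispersed, free) gas at time t for the chain S -> D -> F.

    Assumes first-order kinetics with constant rate constants k1 (S -> D)
    and k2 (D -> F), and all gas initially in solution (amount s0).
    """
    solution = s0 * exp(-k1 * t)
    if abs(k1 - k2) < 1e-12:
        # Degenerate case of equal rate constants.
        dispersed = s0 * k1 * t * exp(-k1 * t)
    else:
        dispersed = s0 * k1 / (k2 - k1) * (exp(-k1 * t) - exp(-k2 * t))
    free = s0 - solution - dispersed  # mass balance closes the system
    return solution, dispersed, free

def dispersed_peak_time(k1, k2):
    """Time at which the dispersed-gas amount reaches its maximum."""
    if abs(k1 - k2) < 1e-12:
        return 1.0 / k1
    return log(k2 / k1) / (k2 - k1)

# Hypothetical rate constants (arbitrary time units): "fast" vs "slow" kinetics.
fast = {"k1": 2.0, "k2": 1.0}
slow = {"k1": 0.2, "k2": 0.1}

for label, k in (("fast", fast), ("slow", slow)):
    s, d, f = gas_fractions(1.0, **k)
    print(f"{label}: peak at t={dispersed_peak_time(**k):.2f}, "
          f"at t=1 solution={s:.3f} dispersed={d:.3f} free={f:.3f}")
```

In this idealized picture, scaling both constants down by a common factor stretches the whole gas-evolution history by the same factor, which is the simplest way to see why slower kinetics keep gas dispersed for longer.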
Surrogate model performance and predictive capability
The performance of the surrogate modeling framework was evaluated by comparing machine-learning-based predictions against the corresponding CMG-STARS simulation results across the full range of pressure depletion rates and parameter variations considered in this study. The surrogate models demonstrate strong predictive capability for key production metrics, including oil production rate, gas production rate, cumulative oil recovery, gas breakthrough time, and GOR evolution. Across all depletion scenarios, the predicted responses closely match the numerical simulation outputs, indicating that the surrogate models successfully capture the nonlinear relationships governing foamy-oil production behavior under solution-gas drive.
The gradient-boosted decision tree framework is particularly well suited for this application because it can represent complex interactions among operational parameters, relative permeability characteristics, and gas-transition kinetics. These interactions are difficult to capture using simpler regression-based surrogate models. The surrogate models achieved strong predictive performance, with coefficients of determination (R2) exceeding 0.95 and RMSE remaining below 5% of the corresponding simulation outputs across the investigated depletion scenarios. These results confirm that the machine-learning framework accurately reproduces the response of the physics-based simulator while maintaining high predictive reliability across both rapid and slow depletion regimes.
A key advantage of the surrogate framework is the substantial reduction in computational cost relative to full numerical simulation. While individual CMG-STARS simulations require significant computational time to resolve coupled multiphase flow and gas-transition kinetics, the trained surrogate models generate production forecasts almost instantaneously. This speed improvement enables rapid evaluation of multiple depletion strategies and parameter combinations, facilitating sensitivity analysis, scenario screening, and uncertainty exploration within the parameter ranges represented in the simulation dataset. Although the surrogate models generate deterministic predictions, the structured design-of-experiments dataset used for training spans a wide range of operational and flow parameters, so production outcomes can be explored across the investigated parameter space. The predictive capability of the surrogate model is, however, limited to the parameter ranges represented in the training dataset, and extrapolation beyond these bounds would require additional simulation-based training data.
From a reservoir development perspective, the surrogate modeling capability provides a practical decision-support tool. The ability to rapidly predict production responses under varying depletion strategies allows engineers to evaluate trade-offs between early-time production rates and long-term recovery efficiency while maintaining consistency with physics-based simulation results. When combined with explainable machine-learning analysis, the surrogate framework also improves transparency by linking production predictions directly to the governing parameters controlling foamy-oil behavior. Future work could extend the framework by incorporating probabilistic sampling or Bayesian machine-learning techniques to explicitly quantify prediction uncertainty and generate confidence intervals for production forecasts.
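The two accuracy measures quoted in this section, together with the training-range caveat, can be expressed compactly. The sketch below assumes that the percentage RMSE is normalized by the mean of the simulated values, which the text does not specify, and the function names, the `bounds` dictionary, and all sample numbers are hypothetical.

```python
from math import sqrt

def r_squared(simulated, predicted):
    """Coefficient of determination of predictions against simulator outputs."""
    mean_sim = sum(simulated) / len(simulated)
    ss_res = sum((s - p) ** 2 for s, p in zip(simulated, predicted))
    ss_tot = sum((s - mean_sim) ** 2 for s in simulated)
    return 1.0 - ss_res / ss_tot

def rmse_percent(simulated, predicted):
    """RMSE as a percentage of the mean simulated value (assumed normalization)."""
    n = len(simulated)
    rmse = sqrt(sum((s - p) ** 2 for s, p in zip(simulated, predicted)) / n)
    mean_sim = sum(simulated) / n
    return 100.0 * rmse / abs(mean_sim)

def within_training_range(inputs, bounds):
    """True if every input lies inside the range covered by the training dataset."""
    return all(bounds[name][0] <= value <= bounds[name][1]
               for name, value in inputs.items())

# Placeholder numbers, for illustration only.
simulated = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(f"R2 = {r_squared(simulated, predicted):.3f}")
print(f"RMSE = {rmse_percent(simulated, predicted):.2f}% of mean simulated value")

bounds = {"depletion_rate": (0.1, 1.0), "critical_gas_saturation": (0.02, 0.15)}
query = {"depletion_rate": 1.4, "critical_gas_saturation": 0.05}
print("inside training range:", within_training_range(query, bounds))
```

A guard such as `within_training_range` is one simple way to flag surrogate queries that would amount to extrapolation before a forecast is reported.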
Engineering implications for heavy oil reservoir development
The results of this study have direct implications for the development and management of heavy oil reservoirs operating under solution gas drive conditions. The strong dependence of foamy-oil behavior on pressure depletion rate indicates that drawdown strategy represents a primary control on recovery efficiency rather than merely an operational constraint. Slower pressure depletion promotes prolonged stability of dispersed gas within the oil phase, delays gas breakthrough, and suppresses gas mobility, resulting in sustained oil production and improved ultimate recovery. Conversely, aggressive drawdown strategies may deliver higher early-time oil rates but accelerate gas liberation and coalescence, shortening the foamy-oil regime and reducing recovery efficiency.
From a field development perspective, these findings suggest that optimal depletion strategies should balance early production objectives with long-term recovery considerations. In reservoirs where foamy-oil behavior is expected, controlled pressure depletion may yield higher overall recovery by maintaining favorable dispersed-gas flow conditions and delaying premature gas production. Drawdown limits commonly applied to mitigate sand production or wellbore instability may therefore also provide a secondary benefit by enhancing foamy-oil performance. The sensitivity analysis further indicates that kinetic gas transition parameters and critical gas saturation exert strong control over production behavior, particularly under slow depletion regimes. Although these parameters are not directly measurable at the field scale, they represent physical processes such as gas nucleation, dispersion, and coalescence that can be influenced indirectly through operational decisions and reservoir management strategies. The surrogate modeling framework developed in this study enhances engineering applicability by enabling rapid evaluation of alternative depletion scenarios without sacrificing physical consistency. This capability allows engineers to efficiently explore a wide range of operating conditions, assess parameter uncertainty, and identify robust development strategies while avoiding the computational cost associated with repeated full-physics simulations. It should be noted that the present study is based on laboratory-scale depletion experiments conducted in a controlled sand-pack system and does not attempt to directly extrapolate production forecasts to heterogeneous field-scale reservoirs. Instead, the objective is to demonstrate how simulation-informed surrogate modeling can accelerate evaluation of foamy-oil production behavior under controlled depletion conditions. 
For field applications, the proposed workflow would be integrated with calibrated reservoir simulation models representing field-scale geology, fluid properties, and well configurations. This process would involve: (a) construction of a calibrated full-field simulation model, (b) generation of simulation datasets representing operational uncertainties and depletion strategies, and (c) training of surrogate models using field-scale simulation outputs. Once trained, the surrogate models could provide rapid forecasts and sensitivity analysis across a wide range of development scenarios, supporting screening-level decision-making prior to detailed simulation studies.
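Step (b) of this workflow, generating a set of simulation cases that covers the parameter space, can be sketched with a Latin hypercube design. The study does not state which design-of-experiments scheme was used, so Latin hypercube sampling is an assumption here, and the parameter names and normalized ranges below are hypothetical placeholders that would be replaced by field-specific bounds.

```python
import random

def latin_hypercube(ranges, n_runs, seed=0):
    """Build n_runs parameter sets with one sample per equal-width stratum per parameter.

    ranges: {parameter_name: (low, high)}
    Returns a list of {parameter_name: value} dictionaries, one per simulation case.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in ranges.items():
        # Draw one point inside each of the n_runs strata, then shuffle the order
        # so that strata are paired randomly across parameters.
        points = [low + (high - low) * (i + rng.random()) / n_runs
                  for i in range(n_runs)]
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][i] for name in ranges} for i in range(n_runs)]

# Hypothetical parameters on normalized [0, 1] ranges; rescale to physical bounds as needed.
ranges = {
    "depletion_rate": (0.0, 1.0),
    "reaction_factor_1": (0.0, 1.0),
    "reaction_factor_2": (0.0, 1.0),
    "critical_gas_saturation": (0.0, 1.0),
}

design = latin_hypercube(ranges, n_runs=8)
for case in design[:3]:
    print({name: round(value, 3) for name, value in case.items()})
```

Each dictionary in `design` would define one simulator run; the resulting inputs and outputs then form the training table for step (c).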
Conclusions
This study presented a simulation-informed surrogate modeling framework that integrates physics-based numerical simulation with explainable machine-learning analysis to investigate foamy-oil production behavior under controlled pressure depletion conditions. Laboratory-scale pressure-depletion experiments conducted in a long sand-pack system were reproduced using a calibrated CMG-STARS model, enabling systematic evaluation of the influence of depletion rate, relative permeability characteristics, and kinetic gas-transition parameters on oil and gas production performance. The results demonstrate that pressure depletion rate represents the dominant operational control governing foamy-oil production behavior. Slower drawdown promotes prolonged stability of dispersed gas within the oil phase, delays gas breakthrough, suppresses GOR, and improves cumulative oil recovery. In contrast, rapid pressure depletion accelerates gas coalescence and shortens the persistence of the foamy-oil regime. These findings confirm that foamy-oil flow is strongly nonequilibrium in nature and cannot be adequately described using conventional solution-gas-drive assumptions. The analysis further shows that kinetic gas-transition parameters play a critical role in controlling foamy-oil stability. Reduced rates of solution-to-dispersed and dispersed-to-free gas conversion under slow depletion conditions favor extended gas retention within the oil phase and delayed gas mobility. Sensitivity analysis indicates that the relative influence of these kinetic parameters increases as depletion rate decreases, whereas relative permeability parameters exert comparatively secondary influence under the investigated conditions. The surrogate modeling framework demonstrates high predictive accuracy with coefficients of determination (R2) exceeding 0.95 while achieving substantial reductions in computational cost relative to full-physics simulation. 
When combined with explainable machine-learning analysis, the framework enables transparent identification of dominant parameters, reduces parameter nonuniqueness, and improves interpretability of production predictions. From a reservoir-engineering perspective, the results indicate that drawdown management represents a practical and effective operational lever for optimizing primary recovery in heavy-oil reservoirs exhibiting foamy-oil behavior. The integrated workflow therefore provides a computationally efficient decision-support tool for evaluating depletion strategies and translating laboratory-scale insights into development-relevant guidance. Despite the insights provided, several limitations should be acknowledged. The numerical simulations and surrogate models are based on laboratory-scale sand-pack depletion experiments and therefore do not directly represent heterogeneous field reservoirs. Application of the proposed methodology to field-scale systems would require integration with calibrated reservoir simulation models that incorporate geological heterogeneity, well configurations, and operational constraints. Future work should extend the simulation-informed surrogate modeling framework to full-field reservoir models and incorporate field production data for validation. Such developments would enable more robust prediction of foamy-oil production behavior and support uncertainty quantification in heavy-oil reservoir development planning.
Footnotes
Notation
| Symbol | Description | Units |
|---|---|---|
| GOR | Gas–oil ratio | scf/STB |
| Kr | Relative permeability | – |
| Sgc | Critical gas saturation | – |
| RF | Recovery factor | % |
| R2 | Coefficient of determination | – |
| RMSE | Root-mean-square error | – |
| DOE | Design of experiments | – |
Funding
This research has not received any specific grant from any public, commercial, or nonprofit funding body.
Declaration of conflicting interests
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
The datasets generated and analyzed during the current study are available from the corresponding author upon reasonable request. The numerical simulation models were developed using commercial reservoir simulation software (CMG-STARS) and therefore cannot be publicly distributed. However, the input parameters, processed datasets, and surrogate modeling outputs used in this study can be provided by the corresponding author for research purposes.
