
Abstract

We present an approach for modelling heterogeneous gas flow patterns in gas transmission networks based on mixtures of generalized nonlinear models (GNMs). This enables the use of a gamma distribution for the dependent variable and the incorporation of standard loading profile specifications based on nonlinear functions whose specific form is explicitly prescribed by a regulatory framework agreement. We focus in particular on modelling the maximum daily gas flow in order to predict gas flow at low temperatures, taking into account whether the day is a working day or a non-working day. The application to a gas flow dataset from Western Austria exemplifies the benefits of using mixture models to obtain maximum gas flow predictions while taking day characteristics into account.

Keywords

EM algorithm, finite mixture model, generalized nonlinear model

1 Introduction

Maximum daily gas flow data typically exhibit a nonlinear consumption pattern that decreases with the outside temperature, and a regulatory framework agreement explicitly prescribes the use of specific nonlinear regression functions as standard loading profiles for modelling these data. In addition, heterogeneity in these standard loading profiles is usually observable, for example, in relation to certain day-specific characteristics. With the maximum daily gas flow data being assumed to follow a gamma distribution, a mixture of generalized nonlinear models (GNMs) emerges overall as a suitable model class. A key aspect of the determination and balancing of gas flow is to predict the maximum daily amount of gas flow for low temperatures. For these temperatures, the data are typically sparse. In addition, very low temperatures, which are even beyond the usual observation range for outside temperatures, are of particular interest and are referred to as design temperatures.

We propose mixtures of GNMs as a suitable and flexible model class in order to predict the required gas supply for design temperatures. Mixture models are a flexible model class where different component models can be easily included to adapt to specific data structures or modelling needs (e.g., Willemsen et al., 2017). Our proposed approach pre-specifies the nonlinear relationship in the components of the mixture in contrast to other mixture approaches where semi-parametric regression models are used for the components to capture nonlinearities (e.g., Meyer et al., 2024). Pursuing an approach where the nonlinear function is pre-specified is attractive due to the interpretability of the regression coefficients and hence can be found in different areas of applications.

We conjecture that one main driver for heterogeneity in gas consumption data is given by the information on working days and non-working days. Therefore, including this information in the model is crucial and we will outline and investigate different approaches for the incorporation of the indicator on working days and non-working days within the mixture model framework.

The natural nonlinear shape of gas flow motivates the use of a sigmoid regression function as the basic framework for standard loading profiles. In addition, the specific form of these standard loading profiles is prescribed as part of the framework agreement for German operators of gas supply networks provided by the BDEW (Bundesverband der Energie- und Wasserwirtschaft; see BDEW, 2021). Gas suppliers rely on suitable estimates of these synthetic standard loading profiles for the allocation of gas data as they enable the prediction of gas consumption in gas transmission networks. Detailed research on the functional structure is discussed by Hellwig (2003), BDEW (2021) and Koch et al. (2015). The application of the sigmoid function was also studied for the Austrian market by Almbauer (2008). In addition, Friedl et al. (2012) studied historical data of gas consumption for a gas supplier with the aim of improving gas supply and reducing operational costs, also using this functional structure and providing explanations for its use. A crucial criterion for a suitable model was the reliable gas supply prediction of the maximum gas transportation for design temperatures. Following Friedl et al. (2012), we will concentrate on the modelling of the maximum daily gas flow based on daily mean temperatures and the information on working days and non-working days as this is of particular interest when assessing gas transportation capacity. We also employ the sigmoid function prescribed within the framework of the respective regulations (Friedl et al., 2012; BDEW, 2021).

Our study contributes in the following two ways to the existing modelling framework for gas flow considered in Friedl et al. (2012): (a) We consider the gamma distribution in order to model the daily maximum gas flow and (b) we incorporate heterogeneous patterns using the flexible mixture modelling framework. We address the inclusion of working days and non-working days as predictor variable within the sigmoid function and as concomitant variable in the model where we realize the full potential of mixture model-based clustering. We apply an efficient fitting algorithm for mixtures of several sigmoid models provided by the add-on software package flexmixNL for the R environment for statistical computing and graphics (see Omerovic, 2019a; Omerovic et al., 2022). This R package represents an extension of the well-known package flexmix (see Leisch, 2004; Grün and Leisch, 2007, 2008). Various other statistical applications make use of sophisticated mixture models building on flexmix (e.g., Grün and Hornik, 2012) and concomitant variable models (e.g., Larson and Dinse, 1985). We focus specifically on the prediction for design, that is, low, temperature values. We derive the standard errors for these predictions based on the observed information matrix because the underlying EM algorithm does not automatically provide information on the variability of the parameter estimates. Further, we illustrate the use of localized versions of information criteria such as AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion) for model selection to emphasize the focus on obtaining a good fit for design temperatures.

The article is organized as follows: The class of mixtures of GNMs proposed for flexible modelling of maximum daily gas flow building on the prescribed standard gas loading profile model is specified in Section 2. The underlying fitting algorithm is the EM algorithm which we briefly describe in Section 3. There we further outline the computation of standard errors and the derivation of confidence intervals for predicted mean values at design temperatures which are of crucial interest in this application. Section 4 presents the computational environment provided by the R package flexmixNL. Section 5 provides a simulation study to assess the performance of the proposed methodology on synthetic data. Section 6 discusses the application of the proposed mixture model class to gas flow data from Western Austria. Section 7 concludes.

2 Model specification

We consider finite mixtures of regression models consisting of K components. The response y denotes the standardized, that is, scaled by the overall empirical mean, maximum daily gas flow and is assumed to be drawn conditional on the predictor variables x and the concomitant variables w from the K-component mixture distribution given by

f^{M} (y; μ (x, β), ϕ, π (w, α)) = \sum_{k = 1}^{K} π_{k} (w, α) f (y; μ (x, β_{k}), ϕ_{k}) .

(2.1)

In this marginal specification the number of components K is considered to be fixed and the component-specific probability density function (PDF) f(y; μ(x, β_k), ϕ_k) is supposed to be a member of the linear exponential family that differs across components solely in the parameters β_k and ϕ_k. The component sizes π_k are not only assumed to be positive and sum to one, but also to depend on the concomitant variables w through a concomitant variable model.

To model the standardized daily maximum gas flow, we consider gamma components, that is,

f (y; μ (x, β_{k}), ν_{k}) = \exp (- \frac{ν_{k}}{μ (x, β_{k})} y) {(\frac{ν_{k}}{μ (x, β_{k})})}^{ν_{k}} y^{ν_{k} - 1} \frac{1}{Γ (ν_{k})},

(2.2)

where the shape parameter corresponds to the dispersion parameter through ν_k = 1/ϕ_k and a nonlinear mean function μ(x, β_k) > 0 is assumed for the response y > 0. This framework for the components corresponds to the model class of generalized nonlinear models (GNM; Wei, 1998; Turner and Firth, 2018). Gamma components are a natural choice for modelling standardized daily maximum gas flow because they naturally take into account that the dependent variable is positive and has a higher variance for higher mean values resulting in a parsimonious specification capturing this characteristic. For the gamma distribution the variance in the kth component equals μ(x, β_k)²/ν_k.
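The mean and shape parameterization in (2.2) can be checked numerically. The following is a minimal sketch, not part of flexmixNL: the function names, the integration grid and the example values μ = 1.5 and ν = 50 are illustrative assumptions. It evaluates the density on the log scale and confirms by midpoint integration that the mean is μ and the variance is μ²/ν.

```python
import math


def gamma_pdf(y, mu, nu):
    """Gamma density (2.2) with mean mu > 0 and shape nu > 0, evaluated at y > 0."""
    rate = nu / mu
    # Work on the log scale to avoid overflow in rate**nu and Gamma(nu).
    log_pdf = (nu * math.log(rate) + (nu - 1.0) * math.log(y)
               - rate * y - math.lgamma(nu))
    return math.exp(log_pdf)


def numerical_moments(mu, nu, upper=10.0, steps=20000):
    """Approximate mean and variance by midpoint integration on (0, upper)."""
    h = upper / steps
    m1 = m2 = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        p = gamma_pdf(y, mu, nu) * h
        m1 += y * p
        m2 += y * y * p
    return m1, m2 - m1 ** 2


# Illustrative values (assumed): mean 1.5, shape 50.
mean, variance = numerical_moments(mu=1.5, nu=50.0)
print(mean, variance, 1.5 ** 2 / 50.0)  # numerical variance vs. mu^2 / nu
```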

The mean μ(x, β_k) depends on predictor variables x with component-specific parameters β_k and follows the standard gas loading profile (BDEW, 2021). This profile has an underlying sigmoid functional structure which induces a decreasing consumption behaviour for increasing outside temperature. Following BDEW (2021), we take into account that the buildings’ heat accumulation capacity contributes to the daily gas consumption and use a weighted 4-day mean temperature as predictor variable instead of the respective single daily mean temperature. This weighted 4-day mean is determined with weights $\tilde{ω}$ = (8, 4, 2, 1)/15, that is,

t = \sum_{j = 0}^{3} {\tilde{ω}}_{j} {\tilde{t}}_{- j} .

(2.3)
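As a small illustration of (2.3), the weighted 4-day mean can be computed from a chronological series of daily mean temperatures. This is a sketch under the assumption that the series has no gaps; the function name and the toy input are hypothetical.

```python
# Weights from (2.3): current day first, then the three previous days.
WEIGHTS = (8 / 15, 4 / 15, 2 / 15, 1 / 15)


def weighted_temperature(daily_means):
    """Return the weighted 4-day mean for every day that has three predecessors.

    daily_means is a chronological list of daily mean temperatures.
    """
    result = []
    for i in range(3, len(daily_means)):
        t = sum(w * daily_means[i - j] for j, w in enumerate(WEIGHTS))
        result.append(t)
    return result


# Toy series (assumed values): the most recent day dominates the weighted mean.
example = weighted_temperature([1.0, 2.0, 4.0, 8.0])
print(example)
```

The first three days of a series have no complete history, so this sketch simply drops them; how such days are treated in practice is not specified here.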

The variables ${({\tilde{t}}_{- j})}_{j = 0, \dots, 3}$ correspond to the daily mean temperature values on the current date and the three previous days. Note that using this weighting scheme to obtain the predictor variable is also prescribed by the regulatory framework agreement. The sigmoid function is specified by the vector of regression coefficients β = (β₁, β₂, β₃, β₄) and the relationship

μ (x, β) = β_{4} + \frac{β_{1} - β_{4}}{1 + {(\frac{β_{2}}{t - 40^{\circ}})}^{β_{3}}},

(2.4)

with predictor variable x = t.

The regression coefficients have the following physical meaning: Coefficients β₁ and β₄ describe upper and lower horizontal asymptotes of the sigmoid curve with β₁ > β₄, while β₂ and β₃ affect the shape and decrease of the curve with increasing temperature values. According to the energy industry, these parameters can be interpreted in the following way: The lower bound β₄ incorporates a constant share of energy (warm water supply or share energy). The difference β₁ − β₄ indicates the decrease in gas consumption from cold to warm days. The coefficient β₂ measures the change in gas consumption due to cold periods, while β₃ refers to the dependence on the heating period. The temperature t is shifted by 40° to avoid discontinuities within the temperature range (Hellwig, 2003, p. 39).
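The roles of the four coefficients in (2.4) can be made concrete with a short sketch. The coefficient values are those of the first component in Table 1 of the simulation study; the function name is hypothetical. With β₂ < 0 and t < 40 the ratio β₂/(t − 40) is positive, so the power is well defined, and at t = 40 + β₂ the ratio equals one, which places the curve exactly halfway between the two asymptotes.

```python
def sigmoid_profile(t, beta):
    """Sigmoid standard loading profile (2.4) for temperature t (degrees Celsius).

    Assumes t < 40 and beta[1] < 0 so that the base of the power is positive.
    """
    b1, b2, b3, b4 = beta
    return b4 + (b1 - b4) / (1.0 + (b2 / (t - 40.0)) ** b3)


# Coefficients of component k = 1 in Table 1.
beta = (1.76, -35.0, 10.0, 0.97)

for t in (-15.0, -5.0, 5.0, 15.0, 25.0):
    print(t, round(sigmoid_profile(t, beta), 4))

# At t = 40 + beta_2 = 5 the curve sits halfway between beta_4 and beta_1.
midpoint = sigmoid_profile(5.0, beta)
```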

As the prediction of the final gas flow is supposed to differentiate between working days and non-working days (see BDEW, 2021), we also include the indicator variable d in our model in order to denote whether the observation relates to a working day or a non-working day:

d = \begin{cases} 1, & \text{working day} \\ 0, & \text{non-working day} \end{cases}

Non-working days refer in general to Saturdays, Sundays and public holidays.

The information on working days may be included in the model in different ways. The nonlinear regression function can be extended to include this variable by adding an additional parameter for working day and inserting this parameter in a specific way to alter the nonlinear function. Alternatively, this variable could also be seen as the main driver of heterogeneity and hence included to model the prior probabilities, that is, have the component weights vary with this indicator.

The first approach was considered in Friedl et al. (2012) in combination with a nonlinear regression model with normally distributed errors. They suggested to include this daily characteristic into the scope of the sigmoid regression function (2.4) by extending the functional form. They considered in particular the following specification:

μ (x, β) = β_{4} + \frac{β_{1} - β_{4}}{1 + {(\frac{β_{2}}{t - 40^{\circ}} + β_{5} \cdot d)}^{β_{3}}},

(2.5)

where the vector of regression coefficients is now given by β := (β₁, β₂, β₃, β₄, β₅) and the predictor variables consist of the temperature t and the indicator d, that is, x = (t, d). Model (2.5) demonstrates that the additional information on the weekday influences the shape of the sigmoid curve rather than the constant share of energy β₄ or the upper bound for daily gas consumption β₁. This specification thus cannot capture a pattern where the gas consumption decreases on non-working days such that possibly also lower constant levels of energy shares are obtained. Capturing such a pattern would require varying regression coefficients β₁ and β₄.

As an alternative, we thus consider including the information on working days and non-working days as a concomitant variable within the mixture model framework allowing for different groups with varying regression parameters. The component weights may depend on α relating to the concomitant variables w where $\sum_{k} π_{k} (w, α) = 1$ and $0 < π_{k} (w, α), \forall k$ , holds. We parametrize the prior probabilities through a multinomial logit model (see Grün and Leisch, 2008; Young and Hunter, 2010), that is,

π_{k} (w, α) = \frac{e^{w^{⊤} α_{k}}}{\sum_{l = 1}^{K} e^{w^{⊤} α_{l}}} .

(2.6)
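The multinomial logit model (2.6) can be sketched for the case of a single binary concomitant variable. The sketch assumes w = (1, d), that is, an intercept plus the working-day indicator, and fixes α₁ = 0 for the first component as an identifiability constraint; both are assumptions made here for illustration. The α values are back-calculated so that the resulting weights match the conditional prior probabilities listed in Table 1 of the simulation study.

```python
import math


def component_weights(w, alpha):
    """Prior probabilities (2.6) for concomitant vector w and coefficients alpha.

    alpha is a list with one coefficient vector per component.
    """
    scores = [sum(wj * aj for wj, aj in zip(w, a)) for a in alpha]
    top = max(scores)  # subtract the maximum for numerical stability
    expo = [math.exp(s - top) for s in scores]
    total = sum(expo)
    return [e / total for e in expo]


# Assumed parameterization: alpha_1 = 0 (reference component), and alpha_2 chosen
# so that pi_2 = 0.83 for d = 0 and pi_2 = 0.07 for d = 1 (Table 1).
intercept = math.log(0.83 / 0.17)
slope = math.log(0.07 / 0.93) - intercept
alpha = [(0.0, 0.0), (intercept, slope)]

pi_nonworking = component_weights((1.0, 0.0), alpha)
pi_working = component_weights((1.0, 1.0), alpha)
print(pi_nonworking, pi_working)
```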

Model (2.1) is fully specified through the component weights, the component-specific mean functions and the dispersion parameters. In the following, we define the overall set of unknown parameters as Ψ = (π₁, …, π_K₋₁, β₁, …, β_K, ϕ₁, …, ϕ_K) for mixtures without concomitant variables. For mixture models with concomitant variables, we replace π₁, …, π_K₋₁ with the parameter vector α. The marginal mean for given x and w is derived from (2.1) as

μ^{M} (x, w, Ψ) = \sum_{k = 1}^{K} π_{k} (w, α) μ (x, β_{k}) .

(2.7)
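A brief sketch of the marginal mean (2.7) for a two-component mixture follows. It uses the component coefficients and conditional weights from Table 1 of the simulation study purely as example inputs; the function names and the evaluation temperature of −12 are illustrative choices.

```python
def sigmoid_profile(t, beta):
    """Sigmoid mean function (2.4); assumes t < 40 and beta[1] < 0."""
    b1, b2, b3, b4 = beta
    return b4 + (b1 - b4) / (1.0 + (b2 / (t - 40.0)) ** b3)


def marginal_mean(t, weights, betas):
    """Marginal mean (2.7): component means weighted by the prior probabilities."""
    return sum(p * sigmoid_profile(t, b) for p, b in zip(weights, betas))


# Example inputs taken from Table 1.
BETAS = ((1.76, -35.0, 10.0, 0.97), (1.03, -35.0, 20.0, 0.24))
WEIGHTS_WORKING = (0.93, 0.07)      # d = 1
WEIGHTS_NONWORKING = (0.17, 0.83)   # d = 0

t_low = -12.0
mean_working = marginal_mean(t_low, WEIGHTS_WORKING, BETAS)
mean_nonworking = marginal_mean(t_low, WEIGHTS_NONWORKING, BETAS)
print(mean_working, mean_nonworking)
```

Because the weights depend on d while the component curves do not, the predicted mean differs between working and non-working days even though temperature is the only predictor inside the sigmoid function.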

The main focus of the subsequent analysis lies in the estimation of the parameter vector Ψ and the prediction of the expected daily maximum gas flow for low temperature values while differentiating between working days and non-working days.

3 Parameter estimation

3.1 EM algorithm

For a sample of n observations the K-component mixture distribution in (2.1) yields the log likelihood function

l (Ψ; y, x, w) = \sum_{i = 1}^{n} \log \{\sum_{k = 1}^{K} π_{k} (w_{i}, α) f (y_{i}; μ (x_{i}, β_{k}), ϕ_{k})\}

(3.1)

with y = (y_i)_{i=1,…,n}, x = (x_i)_{i=1,…,n} and w = (w_i)_{i=1,…,n}. For the sake of convenience, we write f_ik = f(y_i; μ(x_i, β_k), ϕ_k) for the PDF of the ith observation from the kth component. For mixture models, we assume the existence of heterogeneous discrete structures indicated by hidden or latent variables. Considering the latent variable z = (z_ik)_{i=1,…,n, k=1,…,K}, we derive the complete-data log likelihood

l^{c} (Ψ; y, x, w, z) = \sum_{i = 1}^{n} \sum_{k = 1}^{K} z_{i k} \{\log π_{k} (w_{i}, α) + \log f_{i k}\},

(3.2)

where z consists of the component labels z_ik ∈ {0, 1} indicating the membership of the ith observation to the kth component. For mixture models the standard numerical approach to maximum likelihood estimation is based on maximizing the expected complete-data log likelihood function (3.2) through the EM algorithm. Given initial values, the EM algorithm constitutes an iterative two-step procedure executing an expectation step and a maximization step. As the complete-data log likelihood l^c(Ψ; y, x, w, z) depends on unknown information, we consider its conditional expectation given the observed data y, x and w and the current parameter estimate Ψ^(j) in the jth iteration. Thus the E-step results in the objective function given by

Q (Ψ; Ψ^{(j)}) = \sum_{i = 1}^{n} \sum_{k = 1}^{K} p_{i k}^{(j)} \{\log π_{k} (w_{i}, α) + \log f_{i k}\},

where the posterior weights $p_{i k}^{(j)}$ are fixed in the subsequent maximization, that is,

p_{i k}^{(j)} = E (z_{i k} | y_{i}, x_{i}, w_{i}, Ψ^{(j)}) = \frac{π_{k} (w_{i}, α^{(j)}) f_{i k}^{(j)}}{\sum_{l = 1}^{K} π_{l} (w_{i}, α^{(j)}) f_{i l}^{(j)}} .
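The E-step can be sketched for a single observation as follows. This is an illustrative stand-alone computation, not the flexmix implementation: the function names are hypothetical, the component means are passed in directly, and the log-sum-exp form is a numerical safeguard added here. The function returns the posterior weights p_ik together with the observation's contribution to the log likelihood (3.1).

```python
import math


def gamma_logpdf(y, mu, nu):
    """Log of the gamma density (2.2) with mean mu and shape nu."""
    rate = nu / mu
    return (nu * math.log(rate) + (nu - 1.0) * math.log(y)
            - rate * y - math.lgamma(nu))


def e_step(y, mus, nus, priors):
    """Posterior weights for one observation and its log-likelihood contribution."""
    log_terms = [math.log(p) + gamma_logpdf(y, m, v)
                 for p, m, v in zip(priors, mus, nus)]
    top = max(log_terms)
    # log of sum_k pi_k f_ik, computed stably
    log_mix = top + math.log(sum(math.exp(term - top) for term in log_terms))
    posteriors = [math.exp(term - log_mix) for term in log_terms]
    return posteriors, log_mix


# Assumed example: component means 1.75 and 1.03, common shape 50,
# prior weights of a working day (0.93, 0.07).
post_high, _ = e_step(1.70, (1.75, 1.03), (50.0, 50.0), (0.93, 0.07))
post_low, _ = e_step(1.00, (1.75, 1.03), (50.0, 50.0), (0.93, 0.07))
print(post_high, post_low)
```

In this example an observation near the lower curve is assigned mostly to the second component despite its small prior weight, because the likelihood term dominates.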

The updated parameter value Ψ^(j+1) maximizes Q(Ψ; Ψ^(j)) in Ψ for given Ψ^(j). In the case of mixtures of GNMs the mean function is given by a prespecified nonlinear regression function, for example the sigmoid mean function in model (2.4) or (2.5), and the M-step requires an additional iteration loop in order to update the nonlinear regression parameters. As with one-component GNMs, convergence of the methods depends on a proper parameterization of the nonlinear mean function and the adequate choice of starting values. See Wei (1998) and Omerovic (2019a) for further details. We suggest a suitable initialization scheme for modelling standardized maximum daily gas flow in Section 5 where we also assess performance in a simulation study and subsequently make use of this initialization scheme in Section 6.

The two steps of the EM algorithm are iteratively repeated until a suitable stopping criterion is fulfilled. At convergence to a global maximum, $\hat{Ψ} = Ψ^{(\infty)}$ is the maximum likelihood estimate of all parameters in the system. The final component allocation of the data is determined through the maximum posterior weight, that is, observation i is assigned to the component $\arg\max_{k} {\hat{p}}_{i k}$ for i = 1, …, n.

3.2 Inference

The EM algorithm does not automatically provide standard errors of the maximum likelihood estimator $\hat{Ψ}$ as a by-product. We derive standard errors based on the asymptotic variance-covariance matrix of $\hat{Ψ}$ , which is approximated by the inverse of the Fisher information. This implies that $Var [\hat{Ψ}] \approx I {(Ψ)}^{- 1}$ with $I (Ψ) = - E (\partial^{2} l (Ψ; y, x, w) / \partial Ψ \partial Ψ^{⊤})$ (McLachlan and Peel, 2000). The Fisher information can be consistently estimated by the observed information, that is,

I (\hat{Ψ}; y, x, w) = - {\frac{\partial^{2} l (Ψ; y, x, w)}{\partial Ψ \partial Ψ^{⊤}}|}_{Ψ = \hat{Ψ}},

where $I (\hat{Ψ}; y, x, w)$ is the negative Hessian of the original log likelihood (3.1) evaluated at $\hat{Ψ}$ . The standard error of the rth component in $\hat{Ψ}$ is determined by

S E ({\hat{Ψ}}_{r}) = \sqrt{{[I {(\hat{Ψ}; y, x, w)}^{- 1}]}_{r r}} .

(3.3)
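The route from the observed information to a standard error in (3.3) can be illustrated in the simplest possible setting. The sketch below is a deliberately reduced example and makes several assumptions: a single gamma component with known shape ν and one unknown mean μ, a toy sample, and a plain central difference for the second derivative. For this model the maximum likelihood estimate is the sample mean and the standard error has the closed form ȳ/√(nν), which allows a direct check of the numerical result.

```python
import math


def gamma_loglik(mu, ys, nu):
    """Log likelihood of an i.i.d. gamma sample with mean mu and known shape nu."""
    rate = nu / mu
    return sum(nu * math.log(rate) + (nu - 1.0) * math.log(y)
               - rate * y - math.lgamma(nu) for y in ys)


def observed_information(loglik, theta_hat, h=1e-3):
    """Negative second derivative of loglik at theta_hat (central difference)."""
    second = (loglik(theta_hat + h) - 2.0 * loglik(theta_hat)
              + loglik(theta_hat - h)) / h ** 2
    return -second


# Toy sample and shape value (assumed for illustration only).
ys = [0.9, 1.1, 1.3, 0.8, 1.0, 1.2]
nu = 50.0
mu_hat = sum(ys) / len(ys)  # MLE of the mean when the shape is known

info = observed_information(lambda m: gamma_loglik(m, ys, nu), mu_hat)
se_numeric = math.sqrt(1.0 / info)
se_closed_form = mu_hat / math.sqrt(len(ys) * nu)
print(se_numeric, se_closed_form)
```

With several parameters the same idea applies to the full Hessian matrix, which is then inverted before taking the square roots of the diagonal elements.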

O’Hagan et al. (2019) investigate the derivation of standard errors for Gaussian mixture models and point out that determining standard errors based on the Fisher information might give poor results for small sample sizes and unbalanced mixture component sizes. Similar findings were also reported recently by Griesbach and Hepp (2023). Both contributions suggest using suitable resampling methods to obtain standard error estimates. We investigate the suitability of the standard errors obtained using the Fisher information in the simulation study in Section 5 and find that their performance is satisfactory for the data scenarios considered, which are similar to the setting in the empirical application. We thus conclude that resampling methods are not required in the empirical application to obtain reliable standard errors.

3.3 Predicting for low temperatures

The use of mixtures of GNMs enables the prediction of gas flow while accounting for day characteristics in a flexible way. We use the mean of the maximum daily gas flow as determined conditional on x and w from the K-component mixture model (2.7). To also allow for the assessment of uncertainty associated with these point estimates, we construct confidence intervals for the predicted mean values.

The variance of the mean estimator μ^M(x, w, $\hat{Ψ}$ ) is approximated based on the delta method using a quadratic form of its gradient and the variance matrix of the MLE $\hat{Ψ}$ , that is,

Var [μ^{M} (x, w, \hat{Ψ})] \approx \nabla {(μ^{M} (x, w, \hat{Ψ}))}^{⊤} Var [\hat{Ψ}] \nabla (μ^{M} (x, w, \hat{Ψ})),

(3.4)

where the gradient ∇(μ^M(x, w, $\hat{Ψ}$ )) containing the derivatives is given by

\nabla (μ^{M} (x, w, \hat{Ψ})) = {(\frac{\partial μ^{M} (x, w, Ψ)}{\partial ψ})}_{ψ \in Ψ, Ψ = \hat{Ψ}} .

(3.5)

The gradient differs depending on the underlying mixture model (i.e., if constant component weights or a concomitant variable model are used) and on the number of unknown parameters in the nonlinear regression function. The standard error is determined by taking the square root, that is, $SE (μ^{M} (x, w, \hat{Ψ})) = \sqrt{Var [μ^{M} (x, w, \hat{Ψ})]}$ . The corresponding confidence interval for μ^M(x, w, Ψ) at the confidence level (1 − γ) is derived as

(μ^{M} (x, w, \hat{Ψ}) \pm z_{1 - γ / 2} \cdot S E (μ^{M} (x, w, \hat{Ψ}))),

(3.6)

where z_γ denotes the γ quantile of the standard normal distribution. The mean (2.7) and the interval (3.6) will be applied in the subsequent applications in order to predict the maximum daily gas flow for low temperatures and to assess the uncertainty associated with the prediction using confidence intervals.
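The delta method in (3.4) and the interval (3.6) can be sketched generically. Two points are assumptions of this sketch rather than the procedure described in the text: the gradient is obtained by central differences instead of symbolic differentiation, and the covariance matrix in the example is a made-up diagonal matrix, not an estimate from any fitted model. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def numerical_gradient(f, theta, h=1e-6):
    """Central-difference gradient of f at the parameter vector theta."""
    grad = []
    for r in range(len(theta)):
        up, down = list(theta), list(theta)
        up[r] += h
        down[r] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def delta_method_ci(f, theta_hat, cov, gamma=0.05):
    """Return (lower, upper, se) for f(theta) at confidence level 1 - gamma."""
    grad = numerical_gradient(f, theta_hat)
    size = len(grad)
    # Quadratic form grad' * cov * grad as in (3.4).
    variance = sum(grad[r] * cov[r][s] * grad[s]
                   for r in range(size) for s in range(size))
    se = math.sqrt(variance)
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    estimate = f(theta_hat)
    return estimate - z * se, estimate + z * se, se


def sigmoid_at_minus_12(beta):
    """Sigmoid mean (2.4) at t = -12, viewed as a function of the coefficients."""
    b1, b2, b3, b4 = beta
    return b4 + (b1 - b4) / (1.0 + (b2 / (-12.0 - 40.0)) ** b3)


beta_hat = [1.76, -35.0, 10.0, 0.97]          # Table 1, component 1
hypothetical_cov = [[1e-4, 0, 0, 0],          # made-up values for illustration
                    [0, 1e-2, 0, 0],
                    [0, 0, 1e-1, 0],
                    [0, 0, 0, 1e-4]]
lower, upper, se = delta_method_ci(sigmoid_at_minus_12, beta_hat, hypothetical_cov)
print(lower, upper, se)
```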

4 Computational environment: flexmixNL

The methods for estimation and inference are implemented in the R package flexmixNL (Omerovic, 2019b) which extends package flexmix (Leisch, 2004; Grün and Leisch, 2007, 2008). Package flexmix serves as the basis for the implementation by providing a generic infrastructure for the EM algorithm when fitting mixture models, exploiting that only slight adaptations of the EM algorithm need to be implemented depending on the component-specific model used. Package flexmixNL extends these functionalities to model mixtures of GNMs appropriately. It allows for the fitting of mixtures of GNMs with normal and gamma component distributions. Due to the nonlinearity of the mean functions, flexmixNL incorporates two crucial advancements: As the model formula may incorporate arbitrary nonlinear terms, flexmixNL takes this into account through the use of a symbolic language in the formula objects. Furthermore, the numerical procedures for the fitting of nonlinear functions require additional information in the form of starting values, which have to be provided for each component.

Technically, flexmixNL uses the functions nls() (for the normal distribution) and gnm() (for the gamma distribution) for fitting the component-specific parameters in the M-step. The latter belongs to package gnm introduced by Turner and Firth (2007). These fitting procedures have in common that they require appropriate starting values for all unknown regression coefficients in order to achieve convergent results. The starting values are provided through a list object when calling the fitting procedures. For a detailed explanation of the functionality and the technical architecture of flexmixNL see Omerovic (2019a) and Omerovic et al. (2022).

Predictions of mean values are derived through the evaluation of the marginal mean for the fitted mixture model at a specific predictor value, for example, a specific temperature value and day characteristic. Standard errors for $\hat{Ψ}$ are determined based on the observed information (negative Hessian matrix) of the original log likelihood function of the mixture model (3.1). We apply the function hessian() from package numDeriv (Gilbert and Varadhan, 2016) in order to numerically derive the Hessian matrix. The standard errors are determined according to (3.3) by inverting the negative Hessian matrix and taking the square roots of the diagonal elements. The standard error of the mean estimator is computed according to (3.4) using the estimated variance-covariance matrix of $\hat{Ψ}$ and the gradient (3.5). We derive the gradient (3.5) from (2.7) through symbolic differentiation with respect to Ψ.

5 Simulation study for two-component mixtures

We analyze the performance of the fitting algorithm with a simulation study regarding parameter estimates and standard errors. The considered models are mixtures of GNMs consisting of two components with a gamma distributed dependent variable, the sigmoid mean function (2.4) and including a concomitant variable model. A key aspect of the simulation study is to assess how well the original configuration is reproduced and to analyze the discrepancy of the estimated parameters to the true values of the data generating process. We vary in particular the sample sizes to obtain insights on minimum required sample sizes to derive acceptable parameter estimates. Emphasis is also placed on performance assessment of the mean predictions, in particular for design temperatures. Our goal is to confirm that estimates with acceptable precision are delivered in order to substantiate the further modelling of the data from Western Austria.

5.1 Setup

The simulation study is based on a two-component gamma mixture model, that is,

f^{M} (y; μ (x, β), ϕ, π (w, α)) = π_{1} (w, α) f (y; μ (x, β_{1}), ν_{1}) + π_{2} (w, α) f (y; μ (x, β_{2}), ν_{2}) .

(5.1)

The component PDFs follow a gamma distribution (2.2) with shape parameter ν_k. The mean function corresponds to the sigmoid function (2.4) with temperature t as the sole predictor variable x. The indicator d for working days and non-working days is considered as concomitant variable w in model (2.6) for the prior probabilities. The regression coefficient vectors are given by β_k = (β_k₁, β_k₂, β_k₃, β_k₄) for k = 1, 2. The specification of two components is inspired by the following scenario assumed for gas consumption: The first mixture component reflects the gas flow on days with consumption of industrial facilities in combination with households and the second component represents consumption at a lower level generated on days with essentially only private household consumption. The first component thus exhibits a higher consumption rate and has a greater component size. Due to the mean dependence of the variance for the gamma distribution, the variability increases with higher mean values yielding higher variability for the component including also the industrial consumption as the dispersion parameter is specified to be the same for both components.

Table 1 provides the complete parameter specification of the data generation process for model (5.1). These parameters imply that the weight of the dominant component is 0.93 for working days and 0.83 for non-working days. This is in line with the assumption that the prior probabilities $π_{k}^{0}$ and $π_{k}^{1}$ for w ∈ {0, 1} implied by the concomitant variable w induce a dominant component and that this component is different between working and non-working days.

The differences between the component-specific values of the parameters β_k₁ and β_k₄ induce a gap between the upper and lower asymptotes thus enabling the modelling of two different consumption levels by means of a mixture distribution. Parameter β_k₃ has a significant influence on the decrease of the sigmoid mean function (2.4) for k = 1, 2. A central assumption in the construction of the synthetic dataset is that the parameters specified induce a mixture where the average maximum daily gas flow for one component is higher than for the other component across all temperature values. Such a setting also implies that the nonlinear mean functions are not supposed to cross. Therefore, the parameter specification in Table 1 induces a sharper decrease in consumption for the lower component. The parameters in Table 1 have been selected such that they are comparable to real world data (see Section 6). The predictor variables are drawn within the interval $ℝ^{temp}$ = [−10°, 20°] from the empirical distribution of the real world temperatures given in Section 6. Further details on the simulation setup and the calculation of the performance measures are given in Sections A.1 and A.2 in the supplementary material. The ranges for the starting configurations used to randomly initialize the EM algorithm are provided in Table 2.

Table 1

Parameter specification of the two-component gamma mixture model with sigmoid mean function (2.4).

Component	$π_{k}$	$π_{k}^{1}$	$π_{k}^{0}$	$β_{k 1}$	$β_{k 2}$	$β_{k 3}$	$β_{k 4}$	$ν_{k}$
k = 1	0.60	0.93	0.17	1.76	−35.00	10.00	0.97	50.00
k = 2	0.40	0.07	0.83	1.03	−35.00	20.00	0.24	50.00
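The data generating process behind model (5.1) can be sketched with the parameters of Table 1. Two ingredients of this sketch are stand-ins and not the setup of the study: temperatures are drawn uniformly on [−10, 20] instead of from the empirical temperature distribution, and the share of working days (here 5/7) is a hypothetical value. Note that `random.gammavariate` takes shape and scale, so a gamma variate with mean μ and shape ν uses scale μ/ν.

```python
import random


def sigmoid_profile(t, beta):
    """Sigmoid mean function (2.4); assumes t < 40 and beta[1] < 0."""
    b1, b2, b3, b4 = beta
    return b4 + (b1 - b4) / (1.0 + (b2 / (t - 40.0)) ** b3)


BETAS = ((1.76, -35.0, 10.0, 0.97), (1.03, -35.0, 20.0, 0.24))  # Table 1
SHAPES = (50.0, 50.0)                                            # nu_k, Table 1
P_FIRST = {1: 0.93, 0: 0.17}  # P(component 1 | d), Table 1


def simulate(n, p_working=5 / 7, seed=1):
    """Draw n tuples (t, d, k, y) from the two-component gamma mixture."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = rng.uniform(-10.0, 20.0)              # stand-in for empirical draws
        d = 1 if rng.random() < p_working else 0  # hypothetical working-day share
        k = 0 if rng.random() < P_FIRST[d] else 1
        mu = sigmoid_profile(t, BETAS[k])
        y = rng.gammavariate(SHAPES[k], mu / SHAPES[k])  # shape, scale
        rows.append((t, d, k, y))
    return rows


rows = simulate(2000)
working = [r for r in rows if r[1] == 1]
share_first = sum(1 for r in working if r[2] == 0) / len(working)
print(len(rows), round(share_first, 3))
```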

Table 2

Ranges for the starting configuration of the gamma mixture models.

Coefficient	$β_{k 1}$	$β_{k 2}$	$β_{k 3}$	$β_{k 4}$	$β_{k 5}$
Range	[1.0, 2.2]	[−36, −30]	[5, 25]	[0.1, 1.0]	[−0.5, 0.5]

5.2 Mixture model parameters

We summarize the results to assess the performance of mixture model parameter estimation using MC means, the bias (BIAS), the standard deviation (SD), the asymptotic standard errors (ASE) and the coverage rates (CR) at the confidence levels 95% and 99% for the concomitant variable model and the components of the two-component gamma mixture model (see Tables A.1, A.2 and A.3 in the supplementary material).

Overall the MC means $\bar{\hat{ψ}}$ are almost unbiased and exhibit small SD. The estimated values are close to the true values (given in Table 1) for all three sample sizes. The comparison of the empirical SD( $\hat{ψ}$ ) with the mean ASE( $\hat{ψ}$ ) provides insights into the quality of the computed standard errors. Results indicate that the values agree in general for all parameter estimates $\hat{ψ} \in \hat{Ψ}$ . The CR for the levels 95% and 99% are also convincingly close to the nominal confidence levels.

5.3 Predicting expected maximum gas flow values

We compute the MC means μ^M(x_i, w_i, $\hat{ψ}$ ) based on the two-component gamma mixture model for selected temperature values t_i ∈ {−15, −10, −5, 0, 5, 10, 15, 20, 25}. In addition we compare the MC SD(μ^M(x_i, w_i, $\hat{ψ}$ )) with the mean ASE(μ^M(x_i, w_i, $\hat{ψ}$ )) over all simulations. Figure 1 visualizes the confidence intervals based on both values and shows a consistent pattern of the MC SD and the mean ASE for the selected temperature values and sample sizes n ∈ {400, 1000, 2000}.

The SD are very close to the ASE and the confidence intervals show high congruence in general for the selected temperature values. The slightly increasing ranges for 15° and 25° are due to fewer data points. In addition, the confidence intervals of the mean values exhibit even wider ranges for the low temperature value of −15°, which coincides with the stronger variation of gas flow for low temperatures induced by the higher mean values.

6 Application to gas flow data in Western Austria

6.1 Data

The dataset combines information from three different sources to obtain data on gas flow, the daily mean temperatures and the type of day, that is, the working day indicators. Gas flow information is obtained for the market area Western Austria (Tyrol and Vorarlberg in Austria). The data are provided on the website of the Austrian balance group coordinators AGCS Gas Clearing and Settlement AG and can be accessed through www.energymonitor.at (AGCS Gas Clearing and Settlement AG, 2021). AGCS provides gas flow data for Western and Eastern Austria on an hourly basis. We focus on the data covering the Western part collected between January 2013 and December 2020. This time window corresponds to 2922 observation days. For each observation day, the maximum daily gas flow is determined using the hourly observations. Similar to Friedl et al. (2012, p. 26) we study the standardized daily maximum gas flows (in the following: std. maximum daily gas flow) obtained by normalizing with the mean of all observations.
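The preprocessing described above can be sketched as follows. The function name and the toy hourly records are hypothetical, and the sketch assumes that standardization divides each daily maximum by the mean of all daily maxima. The last lines confirm the stated number of observation days for the window from January 2013 to December 2020.

```python
from datetime import date, datetime


def standardized_daily_maxima(hourly):
    """Map each day to its maximum hourly flow divided by the mean daily maximum.

    hourly is an iterable of (datetime, flow) pairs.
    """
    daily_max = {}
    for stamp, flow in hourly:
        day = stamp.date()
        if day not in daily_max or flow > daily_max[day]:
            daily_max[day] = flow
    overall_mean = sum(daily_max.values()) / len(daily_max)
    return {day: value / overall_mean for day, value in sorted(daily_max.items())}


# Toy hourly records (assumed values).
toy = [
    (datetime(2013, 1, 1, 6), 1.0),
    (datetime(2013, 1, 1, 7), 3.0),
    (datetime(2013, 1, 1, 8), 2.0),
    (datetime(2013, 1, 2, 6), 5.0),
    (datetime(2013, 1, 2, 7), 4.0),
]
std_max = standardized_daily_maxima(toy)
print(std_max)

# Number of days from 1 January 2013 up to and including 31 December 2020.
n_days = (date(2021, 1, 1) - date(2013, 1, 1)).days
print(n_days)
```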

Figure 1

MC mean values evaluated at temperatures x_i ∈ {−15, −10, −5, 0, 5, 10, 20, 25} with 95% confidence intervals based on standard deviations (SD) and asymptotic standard errors (ASE).

We obtain the daily mean temperatures in degrees Celsius (°C) from the Central Institution for Meteorology and Geodynamics (ZAMG, Zentralanstalt für Meteorologie und Geodynamik, recently renamed GeoSphere Austria) in Austria. We consider temperature values from a measuring station in Innsbruck (Tyrol, Austria), which is representative of the climate region of the gas distribution point for the gas flow data. The data can be accessed on the web page of ZAMG (2022). For the subsequent analysis we compute the weighted 4-day mean temperatures according to (2.3).

Figure 2

Maximum daily gas flow as scatter plot (left) and boxplot with binned temperature values (right) with distinction between non-working days (black) and working days (red).

We deduce the information on working days and non-working days based on the date stamp with the R package timeDate (Wuertz, 2018). We specify Saturday and Sunday to be non-working days and further add the information on national public holidays in Austria to also classify these days as non-working days. The remaining days are considered as working days.
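The classification rule can be illustrated with a short Python sketch. The paper uses the R package timeDate; the code below is only an analogue. The set of fixed-date Austrian public holidays is listed explicitly, while movable holidays (such as Easter Monday) must be passed in by the caller through the hypothetical argument `extra_holidays`.

```python
from datetime import date

# Fixed-date Austrian public holidays as (month, day) pairs.
FIXED_HOLIDAYS = {
    (1, 1), (1, 6), (5, 1), (8, 15), (10, 26),
    (11, 1), (12, 8), (12, 25), (12, 26),
}


def working_day_indicator(day, extra_holidays=frozenset()):
    """Return 1 for a working day and 0 for a non-working day."""
    if day.weekday() >= 5:  # Saturday (5) or Sunday (6)
        return 0
    if (day.month, day.day) in FIXED_HOLIDAYS or day in extra_holidays:
        return 0
    return 1


# Easter Monday 2020 is a movable holiday and has to be supplied explicitly.
movable_2020 = frozenset({date(2020, 4, 13)})
d_national_holiday = working_day_indicator(date(2020, 10, 26))
d_easter_monday = working_day_indicator(date(2020, 4, 13), movable_2020)
```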

The gas flow data are visualized in Figure 2. Working and non-working days are indicated by colour. A nonlinear relationship between temperature and gas flow is clearly visible, with higher values for low temperatures and a levelling off for high temperatures. The gas flow values are in general also higher for working than for non-working days regardless of the temperature. The observation period includes the COVID-19 pandemic, during which official (mobility) restrictions were imposed. An exploratory analysis did not indicate an impact of COVID-19 measures on the daily maximum gas flow, suggesting no need to account for COVID-19 lockdown periods. This analysis, however, provided strong evidence that warm temperatures during the winter months have an impact on the daily maximum gas flow, re-confirming the need to include the daily mean temperature as an explanatory variable in the model.

6.2 Mixture model specifications

In the subsequent analysis, we present the results for two possible model configurations that include the indicator d for working days and non-working days in the model. We either include this indicator with a regression coefficient in the nonlinear regression function or as a concomitant variable. Table 3 explains in detail how these specifications differ.

We first focus on comparing the one-component models resulting from these model configurations with the two-component mixture model extensions. The inclusion of two components is of particular interest because, for the concomitant variable model, it allows us to assess the congruence between the latent groups and the manifest working day indicator variable. The parameter estimates as well as their standard errors under all these models are reported and compared. In addition, for the concomitant variable model, the conditional prior probabilities for each component are determined as a function of the working day indicator. The model fit is assessed and compared using different information criteria (AIC, BIC). We then also investigate the results when fitting mixtures with additional components. Finally, predictions of the mean values for low temperatures are obtained and compared for the selected models.

Table 3

Model configurations for including the indicator d_i for working days and non-working days.

Model	Description
Model 1:	Sigmoid mean function (2.5) with temperature and working day indicator as predictor variable x_i = (t_i, d_i) and constant prior probabilities.
Model 2:	Sigmoid mean function (2.4) with temperature x_i = t_i as predictor variable and working day indicator w_i = d_i as concomitant variable in (2.6) for the prior probabilities.

6.3 Results

Models 1 and 2 with K = 1 and 2 were fitted with a set of 20 starting configurations, with the starting values drawn uniformly from the ranges in Table 2. The ranges for the possible starting values are the same as those used in the simulation study in Section 5, except that for this application an interval for β_k₅ is also included. The lengths of the intervals are chosen generously so that all potential solutions of the algorithm can be identified and the best fitting one selected. These intervals also led to reliable results in the simulation study. The results obtained are provided in Table 4.
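The multi-start strategy can be sketched as follows. The ranges below are hypothetical placeholders (the actual ranges are those of Table 2, which is not reproduced here), and `toy_fit` merely stands in for a full EM run; only the pattern of drawing uniform starting values and keeping the solution with the largest log-likelihood is illustrated.

```python
import random

# Hypothetical ranges for illustration only; the paper's ranges are given in its Table 2.
EXAMPLE_RANGES = {
    "beta1": (1.0, 3.0),
    "beta2": (-45.0, -25.0),
    "beta3": (3.0, 10.0),
    "beta4": (0.1, 1.0),
}


def draw_starts(ranges, n_starts=20, seed=1):
    """Draw n_starts starting configurations uniformly from the given ranges."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_starts)
    ]


def best_of_starts(fit, starts):
    """Run `fit` from every start and keep the result with the largest log-likelihood."""
    results = [fit(start) for start in starts]
    return max(results, key=lambda res: res["loglik"])


def toy_fit(start):
    # Stand-in for an EM run: pretends the log-likelihood peaks at beta1 = 1.85.
    return {"start": start, "loglik": -(start["beta1"] - 1.85) ** 2}


starts = draw_starts(EXAMPLE_RANGES)
best = best_of_starts(toy_fit, starts)
```

Fixing the seed makes the set of starting configurations reproducible, which helps when comparing runs across models.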

Figure 3 shows the fitted regression functions for the one-component models, that is, where K = 1. Model 1 delivers results differentiating between working days (blue dashed line) and non-working days (blue solid line), whereas Model 2 results in a single sigmoid regression function (coloured in red). Model 1 indicates that the shape of the gas flow curve depends on whether a day is a working or a non-working day. The gas flow declines faster on non-working days, while it remains at a higher level for longer on working days. Both functions converge to approximately the same constant gas flow level for temperatures beyond 16°. Including an indicator for working days and non-working days influences the shape of the sigmoid regression curve, while the asymptotes, reflected by the parameter estimates βˆ1 and βˆ4, remain almost unchanged for both models. Clearly both one-component models fail to provide different fitted values for working days compared to non-working days when considering high or low temperatures.
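To make the role of the asymptotes concrete, the following sketch evaluates a sigmoid mean function at design temperatures. The functional form of (2.4) is not reproduced in this section, so the form used here, μ(t) = β₄ + (β₁ − β₄) / (1 + (β₂ / (t − 40))^β₃), is an assumption, as is the reference temperature of 40. When evaluated with the rounded K = 1 estimates of Model 1 from Table 4 and d = 0, it agrees with the non-working-day predictions in Table 5 up to rounding, which supports this reading but does not confirm it.

```python
def sigmoid_mean(t, b1, b2, b3, b4, t_ref=40.0):
    """Assumed sigmoid mean function with upper asymptote b1 and lower asymptote b4.

    Only meaningful for t < t_ref and b2 < 0, so that the base of the power is positive.
    """
    return b4 + (b1 - b4) / (1.0 + (b2 / (t - t_ref)) ** b3)


# Rounded estimates for Model 1 with K = 1 (Table 4), evaluated for non-working days (d = 0).
MODEL1_K1 = (1.852, -35.802, 6.957, 0.523)

predicted = {t: sigmoid_mean(t, *MODEL1_K1) for t in (-12.0, -14.0, -16.0)}
```

As the temperature decreases, the term (β₂ / (t − 40))^β₃ tends to zero and the mean approaches β₁; as the temperature approaches the reference temperature from below, the mean approaches β₄.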

Figure 4 illustrates the fitted expected maximum daily gas flow functions of the two-component mixture models based on the marginal mean functions predicted for working days and non-working days. In Model 1 the working day indicator enters the regression function with its own coefficient. The marginal means are thus determined by summing over the two components using their prior weights π_k and the component-specific mean functions conditional on the day being a working day or a non-working day.

Table 4

Parameter estimates with standard errors below.

	Model 1			Model 2			Conc. var. (Model 2, K = 2)
	K = 1	K = 2, k = 1	K = 2, k = 2	K = 1	K = 2, k = 1	K = 2, k = 2
${\hat{π}}_{k}$	—	0.658	0.355	—	0.655	0.344	${\hat{α}}_{1}$	1.933
	—	0.022	—	—	—	—		0.244
${\hat{β}}_{k 1}$	1.852	1.909	2.278	1.813	1.939	1.646	${\hat{α}}_{2}$	−3.739
	0.039	0.049	0.289	0.042	0.056	0.057		0.329
${\hat{β}}_{k 2}$	−35.802	−35.664	−41.593	−32.607	−32.946	−32.674	${\hat{π}}_{1}^{1}$	0.859
	0.393	0.451	3.436	0.343	0.451	0.400		0.057
${\hat{β}}_{k 3}$	6.957	6.775	4.483	8.091	7.855	8.198	${\hat{π}}_{2}^{1}$	0.141
	0.280	0.286	0.630	0.352	0.423	0.425		0.057
${\hat{β}}_{k 4}$	0.523	0.416	0.674	0.540	0.628	0.405	${\hat{π}}_{1}^{0}$	0.126
	0.006	0.007	0.018	0.006	0.017	0.008		0.027
${\hat{β}}_{k 5}$	−0.134	−0.121	−0.159	—	—	—	${\hat{π}}_{2}^{0}$	0.874
	0.007	0.006	0.022	—	—	—		0.027
${\hat{v}}_{k}$	24.804	57.071	48.560	21.344	33.381	56.836
	0.645	3.148	4.179	0.555	2.557	3.854
AIC	−1671.181	−2454.120		−1231.681	−2254.816
BIC	−1635.309	−2376.398		−1201.788	−2183.072

Figure 3

Fitted one-component models for Model 1 (non-working days shown as solid blue line, working days shown as dashed blue line) and Model 2 (red solid line).

Including a concomitant variable model in Model 2 implies that the prior weights of the components vary between working days (w = 1) and non-working days (w = 0) based on (2.6). The coefficients of the concomitant variable model are reported in Table 4 and can be used to compute the expected gas flow based on the fitted component means, depending on whether a day is a working day or a non-working day. On working days, the first (i.e., the upper) component accounts for about 85.9% ( ${\hat{π}}_{1}^{1}$ ) of the days. On non-working days, the second component accounts for 87.4% of the days ( ${\hat{π}}_{2}^{0}$ ).
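The computation of the prior weights and the resulting marginal means can be sketched as follows. Both functional forms are assumptions, since (2.4) and (2.6) are not reproduced in this section: the priors are taken to follow a logit model with the first component as baseline, π₂(w) = exp(α₁ + α₂ w) / (1 + exp(α₁ + α₂ w)), and the component means use the sigmoid form μ(t) = β₄ + (β₁ − β₄) / (1 + (β₂ / (t − 40))^β₃). With the rounded estimates from Table 4, these assumptions give the prior probabilities reported there and marginal means that agree with Table 5 up to rounding.

```python
import math


def sigmoid_mean(t, b1, b2, b3, b4, t_ref=40.0):
    """Assumed sigmoid mean function (valid for t < t_ref and b2 < 0)."""
    return b4 + (b1 - b4) / (1.0 + (b2 / (t - t_ref)) ** b3)


def prior_probs(w, alpha1, alpha2):
    """Assumed logit model for the priors, with component 1 as baseline."""
    eta = alpha1 + alpha2 * w
    p2 = 1.0 / (1.0 + math.exp(-eta))
    return 1.0 - p2, p2


# Rounded estimates for Model 2 with K = 2 (Table 4).
COMPONENTS = [
    (1.939, -32.946, 7.855, 0.628),  # component k = 1 (upper curve)
    (1.646, -32.674, 8.198, 0.405),  # component k = 2 (lower curve)
]
ALPHA = (1.933, -3.739)


def marginal_mean(t, w):
    """Prior-weighted sum of the component means for temperature t and indicator w."""
    priors = prior_probs(w, *ALPHA)
    return sum(p * sigmoid_mean(t, *beta) for p, beta in zip(priors, COMPONENTS))


mean_working = marginal_mean(-12.0, 1)
mean_non_working = marginal_mean(-12.0, 0)
```

Because the two component curves are nearly parallel at low temperatures, the gap between the marginal means is driven mainly by how strongly the prior weights shift between working and non-working days.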

Figure 4

Fitted marginal means (left Model 1, right Model 2) and observations coloured according to working days (red) and non-working days (blue).

Both two-component models allow for a distinctly different fit between consumption on working days and non-working days. The two models differ, however, in particular in the fitted values for very low and very high temperatures. The marginal mean functions estimated under Model 1 suggest an intersection for temperature values below −15°, as evident in Figure 4. By contrast, Model 2 gives two shifted mean functions, allowing for a clear distinction in gas flow between working days and non-working days over the observed temperature values. This result supports the assumption of two distinct consumption levels over the whole temperature range, depending on whether the day is a working day.

Using mixture models as the model class allows fitting even more complex models by including additional components. We fit models with up to K = 5 components to the gas flow dataset and compare them using the model selection criteria AIC and BIC. Both AIC and BIC are global measures of goodness-of-fit. As the focus of our analysis lies on the balancing and prediction of gas flow for low temperatures (design temperatures), we use localized versions of the AIC and BIC in our model comparison, which focus on obtaining a good fit for low temperatures. We determine the localized version of the AIC based on an increasing window along the temperature axis in order to assess the effect of considering an expanding range of temperatures on the model choice, that is,

$$\mathrm{AIC}_{\mathrm{loc}}(t_{0}, t_{1}) = -2\, l(\hat{\Psi}; \tilde{y}, \tilde{x}, \tilde{w}) + 2p,$$

with $\tilde{I} = \{i = 1, \dots, n ∣ x_{i} \in [t_{0}, t_{1}]\}$ and $\tilde{y} = {(y_{i})}_{i \in \tilde{I}}, \tilde{x} = {(x_{i})}_{i \in \tilde{I}}$ and $\tilde{w} = {(w_{i})}_{i \in \tilde{I}}$ and p the number of estimated parameters. The resulting AIC values for t₀ = −∞ and varying values for t₁ are shown in Figure 5 for Model 1 and in Figure 6 for Model 2.
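A minimal sketch of the localized AIC is given below. It assumes that each gamma component is parametrized by its mean μ and shape ν, with density (ν/μ)^ν y^(ν−1) exp(−ν y/μ) / Γ(ν); this parametrization, as well as all function names, are assumptions for illustration. The argument `log_density` is a user-supplied function returning the log mixture density of one observation under the fitted model.

```python
import math


def gamma_logpdf(y, mu, nu):
    """Log density of a gamma distribution with mean mu and shape nu (assumed parametrization)."""
    return nu * math.log(nu / mu) + (nu - 1.0) * math.log(y) - nu * y / mu - math.lgamma(nu)


def mixture_logpdf(y, mus, nus, priors):
    """Log density of a finite gamma mixture, computed with the log-sum-exp trick."""
    terms = [math.log(p) + gamma_logpdf(y, m, v) for p, m, v in zip(priors, mus, nus)]
    top = max(terms)
    return top + math.log(sum(math.exp(term - top) for term in terms))


def localized_aic(y, x, w, t0, t1, log_density, n_params):
    """AIC_loc(t0, t1): only observations with t0 <= x_i <= t1 enter the log-likelihood."""
    loglik = sum(
        log_density(yi, xi, wi) for yi, xi, wi in zip(y, x, w) if t0 <= xi <= t1
    )
    return -2.0 * loglik + 2.0 * n_params


def toy_log_density(yi, xi, wi):
    # Toy one-component model with constant mean 1 and shape 1 (an exponential distribution).
    return mixture_logpdf(yi, [1.0], [1.0], [1.0])


toy_y, toy_x, toy_w = [1.0, 2.0], [-10.0, 5.0], [1, 0]
aic_cold = localized_aic(toy_y, toy_x, toy_w, -math.inf, -5.0, toy_log_density, n_params=2)
aic_all = localized_aic(toy_y, toy_x, toy_w, -math.inf, math.inf, toy_log_density, n_params=2)
```

Note that the penalty term 2p is the same for every window, because the parameters are estimated once on the full dataset; only the log-likelihood contribution changes with the window.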

Under the global AIC, the four- and five-component models attain the smallest values, which are similar to each other. The localized version, however, suggests that for temperatures below −5° the AIC selects the one-component configuration for Model 1. Regarding Model 2, the localized AIC prefers the two-component mixture model for temperatures below −5°, given that we discard the one-component model from the consideration set because it does not allow us to distinguish between working days and non-working days in prediction. The localized BIC results are comparable and would imply a similar model choice; they are hence not shown. Clearly, more parsimonious models are selected when the focus is only on the performance at low temperatures than when the global performance is taken into account.

Figure 5

Localized AIC values for Model 1 and number of components between 1 and 5 (left entire temperature scale, right temperatures below 5°).

Figure 6

Localized AIC values for Model 2 and number of components between 2 and 5 (left entire temperature scale, right temperatures below 5°).

Finally, we compute the predicted means of gas flow for temperature values ranging between −12° and −16°. We restrict the models of interest to those preferred by the localized versions of AIC and BIC for low temperature values and which allow for a differentiation between working and non-working days: Model 1 with one component and Model 2 with two components. Table 5 displays the predicted mean values and their standard errors for the two models. We observe a stronger contrast between the predicted mean values for working days and non-working days under Model 2 (differences of about 0.21) than under Model 1 (differences between 0.045 and 0.070). This implies that Model 2 induces a stronger differentiation between working and non-working days, potentially leading to an improved estimate of the mean maximum daily gas flow for design temperatures.

Table 5

Predicted means for Model 1 with one component and Model 2 with two components with standard errors below.

		−12°	−14°	−16°
K = 1	Model 1 (d = 1, working day)	1.830	1.836	1.840
		0.034	0.036	0.036
	Model 1 (d = 0, non-working day)	1.760	1.780	1.795
		0.025	0.027	0.029
K = 2	Model 2 (w = 1, working day)	1.863	1.872	1.878
		0.036	0.038	0.040
	Model 2 (w = 0, non-working day)	1.655	1.662	1.667
		0.037	0.039	0.040

7 Conclusions

In this article, we presented an approach for modelling the maximum daily gas flow depending on daily temperature values and an indicator for working days and non-working days. The approach is particularly suitable for heterogeneous data patterns, which might occur due to different consumption levels on, for example, working days and non-working days. The application of mixture models improves the prediction of mean gas flow for low temperatures because it allows differentiating between distinct components. We presented two approaches for the inclusion of such an indicator variable within the mixture model framework. Furthermore, we considered generalized nonlinear regression models for the mixture components using a gamma distribution for the dependent variable. The gamma distribution is particularly suited to modelling maximum daily gas flow because the variability depends on the mean values. We validated the underlying estimation algorithm by means of a simulation study, which yielded parameter estimates and standard errors of acceptable precision. The new approach was applied to a real-world dataset for the Austrian market, where we succeeded in identifying several components based on implicit information in the data.

In our application we focused on a setting where the gas flow was modelled for a single station only, in compliance with a regulatory framework agreement. Extensions could be considered for settings where several stations are included. In this case, a nonlinear mixed-effects approach (Pinheiro and Bates, 1995) could be employed to account for heterogeneity across stations, and the inclusion of spatial information could be considered. If modelling approaches not in line with the regulatory framework agreement are also to be considered, the use of generalized additive models such as GAMLSS (Rigby and Stasinopoulos, 2005) might be appealing in order to account for the nonlinear shape in a data-driven way.

Footnotes

Acknowledgements

The authors would like to thank the anonymous referee and the associate editor for their helpful comments that improved the quality of the manuscript.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding

The authors received no financial support for the research, authorship and/or publication of this article.

References

AGCS Gas Clearing and Settlement AG (2021) Loading profiles, data licensed under CC BY 3.0. https://www.energymonitor.at/en/gas

Almbauer (2008) Lastprofile nicht-leistungsgemessener Kunden. Graz University of Technology and Association of Gas- and District Heating Supply Companies (FGW).

BDEW (2021) Leitfaden Abwicklung von Standard-lastprofile Gas. Bundesverband der Energie und Wasserwirtschaft e. V. https://www.bdew.de/service/standardvertraege/kooperationsvereinbarung-gas/

Friedl, Mirkov, and Steinkamp (2012) Modelling and forecasting gas flow on exits of gas transmission networks. International Statistical Review, 80, 24–39. https://doi.org/10.1111/j.1751-5823.2011.00171.x

Gilbert and Varadhan (2016) numDeriv: Accurate Numerical Derivatives. R package version 2016.8-1.

Griesbach and Hepp (2023) Confidence intervals for finite mixture regression based on resampling techniques. In Proceedings of the 37th International Workshop on Statistical Modelling (IWSM), Dortmund, Germany, 16–21 July 2023, pages 293–97.

Grün and Leisch (2007) Fitting finite mixtures of generalized linear regressions in R. Computational Statistics & Data Analysis, 51, 5247–52. https://doi.org/10.1016/j.csda.2006.08.014

Grün and Leisch (2008) FlexMix version 2: Finite mixtures with concomitant variables and varying and constant parameters. Journal of Statistical Software, 28, 1–35. https://doi.org/10.18637/jss.v028.i04

Grün and Hornik (2012) Modelling human immunodeficiency virus ribonucleic acid levels with finite mixtures for censored longitudinal data. Journal of the Royal Statistical Society. Series C (Applied Statistics), 61, 201–18. https://doi.org/10.1111/j.1467-9876.2011.01007.x

Hellwig (2003) Entwicklung und Anwendung parametrisierter Standard-Lastprofile. PhD thesis, Technische Universität München, Munich, Germany.

Koch, Hiller, Pfetsch, and Schewe (2015) Evaluating Gas Network Capacities. Society for Industrial and Applied Mathematics, Philadelphia, PA. https://doi.org/10.1137/1.9781611973693

Larson and Dinse (1985) A mixture model for the regression analysis of competing risks data. Journal of the Royal Statistical Society. Series C (Applied Statistics), 34, 201–11.

Leisch (2004) FlexMix: A general framework for finite mixture models and latent class regression in R. Journal of Statistical Software, 11, 1–18. https://doi.org/10.18637/jss.v011.i08

McLachlan and Peel (2000) Finite Mixture Models. New York: John Wiley & Sons. https://doi.org/10.1002/0471721182

Meyer, Kauermann, and Smith (2024) Interpretable modelling of retail demand and price elasticity for passenger flights using booking data. Statistical Modelling, 24, 82–106. https://doi.org/10.1177/1471082X221083343

O’Hagan, Murphy, Scrucca, and Gormley (2019) Investigation of parameter uncertainty in clustering using a Gaussian mixture model via jackknife, bootstrap and weighted likelihood bootstrap. Computational Statistics, 34, 1779–1813. https://doi.org/10.1007/s00180-019-00897-9

Omerovic (2019a) Fitting Mixtures of Generalized Nonlinear Models. PhD thesis, Graz University of Technology, Graz, Austria.

Omerovic (2019b) flexmixNL: Finite Mixture Modeling of Generalized Nonlinear Models. R package version 0.0.1.

Omerovic, Friedl, and Grün (2022) Modelling multiple regimes in economic growth by mixtures of generalised nonlinear models. Econometrics and Statistics, 22, 124–135. https://doi.org/10.1016/j.ecosta.2021.02.008

Pinheiro and Bates (1995) Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics, 4, 12–35. https://doi.org/10.1080/10618600.1995.10474663

Rigby and Stasinopoulos (2005) Generalized additive models for location, scale and shape. Journal of the Royal Statistical Society. Series C (Applied Statistics), 54, 507–554. https://doi.org/10.1111/j.1467-9876.2005.00510.x

Turner and Firth (2007) gnm: A package for generalized nonlinear models. R News, 7, 8–12.

Turner and Firth (2018) gnm: Generalized Nonlinear Models. R package version 1.1-0.

Wei (1998) Exponential Family Nonlinear Models. Number 130 in Lecture Notes in Statistics. Singapore: Springer-Verlag.

Willemsen, Russo, Lesaffre, and Leão (2017) Flexible multivariate nonlinear models for bioequivalence problems. Statistical Modelling, 17, 449–467. https://doi.org/10.1177/1471082X17706018

Wuertz (2018) timeDate: Rmetrics – Chronological and Calendar Objects. R package version 3043.102.

Young and Hunter (2010) Mixtures of regressions with predictor-dependent mixing proportions. Computational Statistics and Data Analysis, 54, 2253–2266. https://doi.org/10.1016/j.csda.2010.04.002

ZAMG (2022) Zentralanstalt für Meteorologie und Geodynamik. https://www.zamg.ac.at/cms/de/klima/klimauebersichten/jahrbuch [accessed on 31 January 2022].

Modelling maximum daily gas flow on exits of gas transmission networks using mixtures
