
Abstract

The failure processes of heterogeneous repairable systems under the minimal repair assumption can be modelled by nonhomogeneous Poisson processes. One approach to describing unobserved heterogeneity between systems is to multiply the intensity function by a positive random variable (frailty term) with a gamma distribution. This approach assumes that the relative frailty distribution among survivors is independent of age. Where systems are being continuously repaired and modified, the frailty distribution may be dependent on the system’s age. This paper investigates the application of the inverse Gaussian (IG) frailty model for modelling the failure processes of heterogeneous repairable systems. The IG frailty model, which combines the power law model and the inverse Gaussian distribution, assumes that the relative frailty distribution among survivors becomes increasingly homogeneous over time. We develop the maximum likelihood estimator for the IG frailty model and a method for event prediction, and investigate the accuracy of the IG estimator and the effect of mis-specification of the frailty distribution through a simulation study. The mean estimates of the scale and shape parameters of the intensity function are examined for bias and efficiency loss. We find that the developed estimator is robust to changes in the input parameters for relatively large sample sizes. We investigate the robustness of selecting an IG compared with a gamma frailty model. The developed IG model is applied to real data for illustration, showing an improvement on existing models.

Keywords

Reliability analysis, model mis-specification, frailty model, repairable system, component heterogeneity

Introduction

In the reliability literature, industrial systems are generally classified as either non-repairable or repairable.^1,2 A non-repairable system is one where, after a failure has occurred, the system stops functioning and cannot be repaired.² A non-repairable system can fail only once, and a lifetime model such as the Kaplan-Meier, Exponential, or Weibull distribution provides the distribution of the time to failure of such systems.² A repairable system is a system that can be repaired and returned to operation without the need to replace the system.¹

For repairable systems, we distinguish between the different system states after being repaired.³ The system’s performance may return to the same state that it was at the start of operation, referred to as ‘as good as new’, or its performance may be returned to the same state as before the failure, that is, ‘as bad as old’ condition.³ The former relates to a renewal process (RP) which is characterised by a renewal distribution describing the time between failures, and which is also called a perfect repair model, and corresponds to an assumption of the replacement of a system. ‘As bad as old’ condition can be modelled by a nonhomogeneous Poisson process (NHPP) which is characterised by the rate of occurrence of failures, and corresponds to the minimal repair assumption.² Minimal repair means that a failed system is restored just back to a functioning state, and after repair the system continues as ‘if nothing had happened’.² This implies that the likelihood of system failure, right after a failure and subsequent repair, is the same as it was immediately before the failure. Repair times in this kind of modelling are assumed to be negligible.²

For many systems, the minimal repair assumption is appropriate to describe the maintenance regime adopted, that is, the purpose of repairing is to bring the system back to operation as soon as possible and the condition of the system is broadly the same as it was just before the failure occurred.^3,4 For such systems, NHPP is a more appropriate model to characterise the failure occurrences in the systems than a renewal process.³ NHPP models are flexible due to their assumption that events occur randomly in time, with rates which may vary with time.² NHPP allows the modelling of trends of the failure time occurrences such as whether a system is improving or deteriorating.³ Therefore, this paper will concentrate on the modelling of repairable systems using NHPP.

NHPP are widely used models in reliability, for example: for analysis of failures of electrical power transformers in Dias De Oliveira et al.⁵; for modelling of failures of ball bearings⁶; for failures of numerically controlled machine tools in Wang and Yu⁷; or for modelling software reliability.⁸ NHPP have also been applied to other real-world applications, for example, to study noise exposure in Guarnaccia et al.⁹

This paper is concerned with the problem of predicting the behaviour of a system based on failure data from several similar systems.² According to Asfaw and Lindqvist,² Lindqvist,¹⁰ and Deep et al.,¹¹ there may be unobserved heterogeneity among the systems which, if overlooked, may lead to inaccurate predictions. Heterogeneity among systems may be due to different material, different design, different location and so on,^12,13 while homogeneity among systems may be driven by similar repair practices, similar usage patterns, etc. An intuitive way of interpreting heterogeneity is to imagine an unknown covariate, with values that may vary among systems, which leads to an unexpected variation in the failure intensity of the different processes.³ The unobservable heterogeneity, which is often called frailty in the survival analysis literature, is typically modelled by multiplication of the intensity function by a positive random variable taking independent values across the systems with unit mean and therefore described by its variance.³

The study of heterogeneity is widely applied in various fields like politics¹⁴ and medical research.¹⁵ In survival analysis, the effects of unobserved heterogeneity have been studied in several papers including but not limited to Vaupel et al.,¹⁶ Kheiri et al.,¹⁷ and Hanagal et al.¹⁸ In reliability engineering, the effects of unobserved heterogeneity have been studied for various component problems like failures,¹⁹ maintainability,²⁰ remaining useful life,²¹ and spare-part management.²² For non-repairable systems, some examples include Lin and Asplund^23,24 and Lin et al.,²⁵ who used the frailty model to analyse the lifetime of locomotive wheels.

For repairable systems, unobserved heterogeneity has been studied with the minimal repair assumption. Early works include.^26,27 Lindqvist et al.²⁸ developed a heterogeneous trend renewal process model, which generalises the HPP and NHPP, to capture unobserved heterogeneity in multiple repairable components. They introduced a gamma distributed multiplicative factor on the failure intensity. D’Andrea²⁹ suspected heterogeneity in the failure time data for mining trucks in Brazil. They assumed that the mining trucks were subject to minimal repair and thus modelled the data using NHPP with a gamma distributed frailty term.

Asfaw and Lindqvist² investigated a heterogeneous population composed of independent NHPPs using gamma-distributed frailty. Slimacek and Lindqvist³⁰ extended the basic NHPP to include covariates and unobserved heterogeneity in analysing wind turbine failure data. Lindqvist and Slimacek³ developed a method for parameter estimation in a heterogeneous NHPP population when the distribution of frailty is unspecified. The NHPP model was extended to include covariates in Slimacek and Lindqvist.¹³ Most of these papers on minimal repair have parametrically modelled heterogeneity using the gamma frailty model, in which unobserved effects are assumed to be gamma distributed.

Hougaard³¹ introduced the inverse Gaussian (IG) distributions for modelling unobserved effects. The IG distribution has a unimodal density and is a member of the exponential family. Its shape resembles that of other skewed density functions, such as the lognormal and gamma distributions. Chhikara and Folks³² studied the IG distribution and found that there are many striking similarities between the statistics derived from this distribution and those of the normal distribution. These properties make the IG potentially attractive for modelling purposes for survival data.¹⁸

The IG distribution has some advantages as a frailty distribution. It provides flexibility in modelling, when early occurrences of failures are dominant in a life time distribution and its failure rate is expected to be non-monotonic.¹⁸ Also IG is almost an increasing failure rate distribution when it is slightly skewed and hence is also applicable to describe lifetime distribution which is not dominated by early failures.¹⁸ Hougaard³¹ noted that survival models with gamma and IG frailties behave very differently, specifically that the relative frailty distribution among survivors is independent of age for the gamma, but becomes more homogeneous with time for the IG. The conclusion was derived by observing the coefficient of variation of the two frailty distributions of survivors. For the gamma distribution, the coefficient of variation is a constant. However, for the IG distribution, the coefficient of variation is a decreasing function of time. In fact, a few studies in areas such as medicine and epidemiology have suggested the IG frailty model as an alternative to the gamma frailty model for modelling unobserved effects.^15,17,33,34

The likelihood function for the IG frailty model can be easily obtained and has a closed-form representation, which allows fast and simple estimation of parameters.^18,35 Despite these desirable properties, for failure data with the minimal repair assumption, the application of the NHPP model in which the unobserved effects are assumed to be IG distributed has not been fully investigated. The first objective of this paper is to evaluate the application of the NHPP model with IG distributed frailties for analysing failure data from repairable systems and to compare its results with those from gamma distributed frailties.

In a model with unobserved heterogeneity, it is necessary to define the distribution of the unobserved effects.^36,37 Since the modelled heterogeneity is unobservable, the appropriate choice of distribution of the unobserved effects is not easily discernible.^3,38 Furthermore, the choice of the distribution of unobserved effects can give interesting general results in terms of the variance of the unobserved effects.³ For instance, a large variance could indicate deficiencies in the choice of the distribution which may influence the model fit.^3,38 It is therefore useful to examine the extent to which mis-specification of the frailty distribution affects the validity of intensity function estimators.³⁸ Yet the impact of frailty distribution mis-specification with regard to the minimal repair assumption has not been investigated. Another objective of this paper is to examine the impact of wrongly specifying the frailty distribution in an NHPP model. To accomplish this objective, an NHPP model with IG distributed frailties will be developed and compared with a gamma distributed frailty model through a simulation study and analysis of a real dataset. Statistical fit and prediction performance of the two models will be compared over different heterogeneity levels, sample sizes and component failure behaviours.

Another issue of interest in this paper pertains to event prediction for repairable systems subject to minimal repair and unobserved heterogeneity. The ability to predict the occurrence of failure events at an individual unit level can aid optimal maintenance decision making for individual components.¹¹ However, the majority of research on unobserved heterogeneity with the minimal repair assumption has predominantly focussed on investigating the significance of covariates and the frailty term in the fitted model rather than prediction for the system and/or individual components. The few works that have considered event prediction for point processes with unobserved heterogeneity include: Deep et al.,¹¹ who used a semi-parametric Andersen and Gill model for failure prediction of a new component in a Teleservice system using collected data from old units; and Jahani et al.,³⁹ who developed a multivariate Gaussian convolution process (MGCP) for fleet-based event prediction in which failure prediction for an individual unit is conducted using data collected from other units. The third objective of this paper is to develop a method for predicting the occurrence of failure events at the component level based on the NHPP model with gamma and IG distributed unobserved heterogeneity effects. To accomplish this objective, an empirical Bayes framework will be adopted to update the frailty term.

The contribution of this study is threefold. First, we develop an IG frailty model and the parameter estimators for repairable systems whose components become homogeneous over time, and we examine the performance of the estimators using a simulation study. Second, we investigate the impact of mis-specifying the frailty distribution in an NHPP model. Finally, using an empirical Bayes framework, we develop a method for prediction of a component’s mean residual life and prediction of the expected number of failures at the component and system levels.

The rest of the paper is organised as follows. First, we describe a general system with heterogeneous components. We develop the IG frailty model and an estimator for the IG frailty model. We describe a simulation study to investigate the accuracy of the estimator. This study is extended to further examine the impact of mis-specification of the frailty distribution. We present an analysis of a real dataset. We conclude by reflecting on the implications of our findings, the limitations of our study and provide suggestions for further work.

System description

Consider a system subject to minimal repairs upon failure. When a minimal repair is performed upon failure, the times between subsequent failures may not be identically distributed, which constitutes an NHPP.^1,40

Consider that the system is a ‘happy system’ in its burn-in phase. A ‘happy system’ has increasing inter-failure times. Also, consider that within a mission time, components in the wear-out phase (components with decreasing inter-failure times) are removed without replacement whereas components in the burn-in phase (relatively new components) are subject to minimal repair upon failure. As the number of removed components increases, the relative frailty among the remaining working components in the system becomes homogeneous over time.

This paper shall concentrate on the most commonly used power law intensity function to characterise the ROCOF (Rate of occurrence of failures) of the NHPP (see e.g. Rausand and Hoyland⁴¹). One reason for its popularity is that the power law as a function of time $t$ is of the same form as the hazard rate of a Weibull distribution.² The power law model has a good fit to failure data from repairable systems and is quite effective in representing a system which is experiencing reliability improvement (i.e. inter-failure times are increasing).⁴² The power law model is given as:

ψ_{0} (t) = ω ρ t^{ρ - 1},

(1)

where $ψ_{0} (t)$ is the intensity function at time $t$ , $ω$ is the scale parameter and $ρ$ is the shape parameter that controls the shape of the curve. The parameter $ρ$ in the power law model gives the following information about the system: if $ρ > 1$ , then the system is deteriorating (sad); if $0 < ρ < 1$ , then the system is improving (happy) and if $ρ = 1$ the NHPP model reduces to an HPP.
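To make the role of $ρ$ concrete, the following minimal Python sketch evaluates the power law intensity of equation (1) and its integral over $(0, t]$, and classifies the trend as described above. The function names are illustrative only and do not come from the paper.

```python
def power_law_intensity(t, omega, rho):
    """ROCOF of equation (1): psi_0(t) = omega * rho * t**(rho - 1), for t > 0."""
    return omega * rho * t ** (rho - 1.0)


def power_law_mean(t, omega, rho):
    """Expected number of failures in (0, t]: the integral of psi_0, omega * t**rho."""
    return omega * t ** rho


def trend(rho):
    """Classify the system by its shape parameter, following the text."""
    if rho > 1.0:
        return "deteriorating (sad)"
    if 0.0 < rho < 1.0:
        return "improving (happy)"
    if rho == 1.0:
        return "HPP (constant rate)"
    raise ValueError("rho must be positive")


# With rho = 0.5 the intensity falls with age, so inter-failure times lengthen.
early = power_law_intensity(1.0, omega=1.0, rho=0.5)
late = power_law_intensity(4.0, omega=1.0, rho=0.5)
print(trend(0.5), early > late)
```

For $ρ < 1$ the intensity is unbounded as $t$ approaches zero, so the sketch should only be evaluated at strictly positive ages.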

Consider that a system has $m$ independent repairable components. Each component is observed from the start of operation to time $τ$. Let $n_{j}$ be the recorded number of failures for the $j^{th}$ component and let $t_{ij}$ be the age at the $i^{th}$ occurrence of failure for component $j$, where $i = 1, 2, \dots, n_{j}$, and $j = 1, 2, \dots, m$. We assume that the number of failures varies across the components and that there is no available covariate recorded for each component (however, this information could easily be incorporated into the modelling). A positive random variable $z_{j}$, drawn from a distribution $f (z_{j}, θ)$, is included in the NHPP model to account for component unobserved effects, where $θ$ is the variance parameter. $z_{j}$ acts multiplicatively on the intensity function,² making the model an NHPP with random effect (also referred to as a frailty model²⁹). The variance parameter $θ$ is the parameter of interest: small values of $θ$ reflect homogeneity in the failure pattern of a group of components and large values of $θ$ reflect high heterogeneity in the failure pattern of a group of components. To make the baseline intensity function identifiable, a restriction is placed on $f (z_{j}, θ)$ such that the frailties (random effects) are assumed to have an expected value $E (z_{j}) = 1$ and $Var (z_{j}) = θ$. Thus components with $z_{j} > 1$ will fail more often than components with $z_{j} < 1$.¹⁶

Conditional on the frailty term $z_{j}$ , the intensity function for component $j$ can be expressed as:

ψ (t | z_{j}) = ω_{j} ρ t_{ij}^{ρ - 1} = λ z_{j} ρ t_{ij}^{ρ - 1},

(2)

where $ω_{j} = λ z_{j}$ is the scale parameter and $ρ$ is the shape parameter. We assume that the components in the system have the same ageing behaviour (i.e. the same shape parameter $ρ$) but different magnitudes (i.e. different scale parameters). As a result, the scale parameter $ω_{j}$ is made component specific by fixing $λ$ and introducing a random effect $z_{j}$, which is modelled by a parametric distribution. We will assume that $λ$ has a value of one to reduce the effect $λ$ might have on each $z_{j}$.

IG frailty model

This section presents an IG frailty model based on the NHPP with IG distributed random effects. IG distribution has a simple Laplace transform which is useful for deriving the reliability function.^43,44 In addition, gamma and IG distributions both have uni-modal density functions.⁴⁴ IG distribution is described by two characteristics, namely, a mean parameter $μ > 0$ and precision parameter $δ > 0$ . The two-parameter IG distribution is given as:

f (z) = \sqrt{\frac{δ}{2 π z^{3}}} \exp \left( \frac{- δ {(z - μ)}^{2}}{2 z μ^{2}} \right) .

(3)

As mentioned above, the frailty model poses restrictions on the mean and variance on the distribution. Let $E (z) = 1 = μ$ and $Var (z) = θ = \frac{μ^{3}}{δ}$ . The one-parameter IG distribution is:

f (z) = \frac{1}{\sqrt{2 π θ}} z^{\frac{- 3}{2}} \exp (\frac{- {(z - 1)}^{2}}{2 z θ}) .

(4)
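As an informal check on the restriction $E (z) = 1$ and $Var (z) = θ$, the sketch below evaluates the density of equation (4) and draws IG frailties with $μ = 1$ and $δ = 1 / θ$. The sampler uses the transformation method of Michael, Schucany and Haas, which is a standard technique and is not described in the paper; the function names and the value $θ = 0.5$ are illustrative assumptions.

```python
import math
import random


def ig_frailty_pdf(z, theta):
    """One-parameter IG density of equation (4): mean 1, variance theta."""
    return (z ** -1.5 / math.sqrt(2.0 * math.pi * theta)
            * math.exp(-(z - 1.0) ** 2 / (2.0 * z * theta)))


def sample_ig_frailty(theta, rng):
    """Draw z ~ IG(mu=1, delta=1/theta) by the Michael-Schucany-Haas method."""
    mu, delta = 1.0, 1.0 / theta
    y = rng.gauss(0.0, 1.0) ** 2          # chi-square variate with 1 d.o.f.
    x = (mu + mu * mu * y / (2.0 * delta)
         - (mu / (2.0 * delta)) * math.sqrt(4.0 * mu * delta * y + mu * mu * y * y))
    # Keep the smaller root with probability mu / (mu + x), else use mu^2 / x.
    return x if rng.random() <= mu / (mu + x) else mu * mu / x


rng = random.Random(1)
draws = [sample_ig_frailty(0.5, rng) for _ in range(200_000)]
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, var)  # sample moments; expected to lie near 1 and theta = 0.5
```

The sample mean and variance should settle near 1 and $θ$ as the number of draws grows, which is the identifiability restriction placed on the frailty distribution.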

Maximum likelihood for IG frailty model

Assume that the random effect $z_{j}$ of the conditional intensity equation (2) is drawn from an IG distribution equation (4), then the conditional likelihood function for component $j$ with random effect $z_{j}$ is given below as:

L_{j} (λ_{0} (t_{ij}) | z_{j}) = \left( \prod_{i = 1}^{n_{j}} z_{j} λ_{0} (t_{ij}) \right) \exp (- z_{j} Λ_{0} (τ))

(5)

where $λ_{0} (t_{ij}) = λ ρ t_{ij}^{ρ - 1}$ and $Λ_{0} (τ) = λ τ^{ρ}$ . $τ$ is the observation length of component j. Here, $j$ is a fixed Identification number for a component. $t_{ij}$ represents each failure time $t_{i}$ observed over the entire life span of component $j$ . $z_{j}$ is a fixed value generated for component $j$ and it influences each failure time $t_{i}$ of component j for the rest of its life.

The total log likelihood function for all the components is given by:

\begin{aligned} l = {} & n \log λ + n \log ρ + (ρ - 1) \sum_{j = 1}^{m} \sum_{i = 1}^{n_{j}} \log t_{ij} + m \log (2) + \frac{m}{θ} - \frac{m}{2} \log (2 π θ) \\ & - \sum_{j = 1}^{m} \frac{(n_{j} - 1 / 2)}{2} \log (1 + 2 θ λ τ^{ρ}) + \sum_{j = 1}^{m} \log k_{\frac{n_{j} - 1 / 2}{2}} \left[ \frac{\sqrt{1 + 2 θ λ τ^{ρ}}}{θ} \right], \end{aligned}

(6)

where $n = \sum_{j = 1}^{m} n_{j}$ is the total number of failures and $k_{(\frac{n_{j} - 1 / 2}{2})} [.]$ is a modified Bessel function of the second kind. We present the derivation of the log likelihood function in Appendix A. Taking the derivative of the log likelihood function with respect to $λ$, $ρ$ and $θ$:

\frac{\partial l}{\partial λ} = \frac{n}{λ} - \sum_{j = 1}^{m} \frac{τ^{ρ} k_{\frac{n_{j} + 1 / 2}{2}} \left[ \frac{\sqrt{r_{o}}}{θ} \right]}{\sqrt{r_{o}} k_{\frac{n_{j} - 1 / 2}{2}} \left[ \frac{\sqrt{r_{o}}}{θ} \right]}

(7)

where $r_{o} = 2 θ λ τ^{ρ} + 1 .$

\frac{\partial l}{\partial ρ} = \frac{n}{ρ} + \sum_{j = 1}^{m} \sum_{i = 1}^{n_{j}} \log t_{ij} - m λ τ^{ρ} \log τ .

(8)

\begin{aligned} \frac{\partial l}{\partial θ} = {} & - \frac{n}{θ^{2}} - \frac{n}{2 θ} + \sum_{j = 1}^{m} \frac{n_{j} - 1 / 2}{θ} + \sum_{j = 1}^{m} \frac{\sqrt{r_{o}} k_{\frac{n_{j} + 1 / 2}{2}} \left[ \frac{\sqrt{r_{o}}}{θ} \right]}{θ^{2} k_{\frac{n_{j} - 1 / 2}{2}} \left[ \frac{\sqrt{r_{o}}}{θ} \right]} \\ & - \sum_{j = 1}^{m} \frac{λ τ^{ρ} k_{\frac{n_{j} + 1 / 2}{2}} \left[ \frac{\sqrt{r_{o}}}{θ} \right]}{θ \sqrt{r_{o}} k_{\frac{n_{j} - 1 / 2}{2}} \left[ \frac{\sqrt{r_{o}}}{θ} \right]} . \end{aligned}

(9)

Estimates of $λ$, $ρ$ and $θ$ can be obtained by setting the derivatives to zero. Closed-form expressions for the estimators of $λ$, $ρ$ and the heterogeneity parameter $θ$ cannot be derived analytically. One way to deal with this problem is to use a numerical method such as the Newton-Raphson method, a gradient-based iterative algorithm, to maximise the likelihood function.² Gradient-based approaches assume that the likelihood function to be maximised is smooth and concave. For the maximum likelihood estimation of the gamma frailty model, see, for example, Asfaw and Lindqvist.²

Simulation study

In this section, we conduct a simulation study. We describe the simulation design and data simulated. We assess the performance of the IG estimators with respect to bias in the estimates of the scale $λ$ , shape $ρ$ and heterogeneity parameter $θ$ given some known input parameter values. Finally, we assess the robustness of the IG and gamma frailty model in a mis-specification study.

Simulation design

Throughout the simulation study, the underlying failure process for the components in a system is considered to follow a nonhomogeneous Poisson process (NHPP) whose basic rate of occurrence of failures is that of a power law process, $λ ρ t_{ij}^{ρ - 1}$, conditional on a frailty term $z_{j}$ whose distribution $f (z_{j}, θ)$ is known. When the IG frailty model estimators are examined, the frailty term $z_{j}$ is IG distributed. However, when the effectiveness of the IG and gamma frailty models is examined, the frailty term $z_{j}$ is either gamma or IG distributed. For each process, the input value of $λ$ is fixed to 1 while $ρ$ and $θ$ are varied. Nineteen sample sizes, $n$ = 10–100 in increments of five, are used to assess the impact of increasing the sample size on parameter estimates. Sample size in this paper refers to the number of components in the system. The mean of 1000 parameter estimates is assessed for $ρ$ and $θ$. An equal observation window, from time 0 to $τ = 10$, is assumed for each system. Whilst it is, in principle, straightforward to generalise the likelihood function to the case of different observation lengths, doing so complicates the computation of estimates.² Thus, in order to simplify the estimation, the observation lengths are made equal.

To perform the simulation study, data are generated from each frailty model using the following algorithm. If the failure times of a component are assumed to be independent given the frailty term and the sequence of failure times forms a counting process, samples of an NHPP can be generated from a homogeneous Poisson process (HPP). Consider an NHPP with intensity function $λ (t)$, an interval $(t_{(i - 1)}, t_{i})$ for component $j$ and mean function:

Λ (t_{(i - 1)}, t_{i}) = Λ (t_{i}) - Λ (t_{(i - 1)}),

(10)

where the random variables $Λ (t_{(i - 1)}, t_{i})$ are independent and identically exponentially distributed with mean 1. Suppose that there is no failure at installation time $(t = 0)$; then, by the inverse probability method, the reliability function can be expressed as:

\exp (- Λ (t_{(i - 1)})) = U_{(i - 1)}

(11)

and

Λ (t_{(i - 1)}) = - \log (U_{(i - 1)}),

(12)

where $U_{(i - 1)}$ is drawn from a uniform distribution with an interval between 0 and 1. Suppose there are covariates then the mean function conditional on the frailty term becomes

Λ (t_{(i - 1)}) = Λ_{0} (t_{(i - 1)}) z_{j} \exp (β X),

(13)

where $β$ is a vector of regression coefficients. The covariates are depicted by $X$ such that when $X = x (t)$ the covariates are time varying and $X = x$ implies time invariant covariates. If the intensity function is based on the power law model and each $z_{j}$ is drawn from either a gamma or an IG distribution then

Λ (t_{(i - 1)}) = λ {t_{(i - 1)}}^{ρ} z_{j} \exp (β X) .

(14)

Solving for a failure time $t_{(i - 1)}$ from equation (14) leads to the expression given by:

t_{(i - 1)} = {[\frac{Λ (t_{(i - 1)})}{z_{j} λ \exp (β X)}]}^{1 / ρ}

(15)

and

t_{(i - 1)} = {[\frac{- \log (U_{(i - 1)})}{z_{j} λ \exp (β X)}]}^{1 / ρ} .

(16)

The NHPP process of component j observed up to the $i^{th}$ failure time $t_{i}$ can be generated by the following recurrence formula

t_{i} = {[\frac{- \log (U_{i})}{z_{j} λ \exp (β X)} + {t_{(i - 1)}}^{ρ}]}^{1 / ρ} .

(17)

If there are no available covariates that is, $X = 0$ , equation (16) becomes

t_{(i - 1)} = {[\frac{- \log (U_{(i - 1)})}{z_{j} λ}]}^{1 / ρ}

(18)

and the recurrence formula becomes

t_{i} = {[\frac{- \log (U_{i})}{z_{j} λ} + {(t_{(i - 1)})}^{ρ}]}^{1 / ρ} .

(19)

To illustrate how to generate data when covariates are unavailable, a single NHPP process for component j can be simulated by the recurrence formula in equation (19) using the algorithm below.

1. Set parameter values for $λ$, $ρ$ and $θ$

2. Set a value for $τ$

3. Draw a random value $z_{j}$ from the known frailty distribution $f (z_{j}, θ)$

4. Set $t_{0} = 0$

5. Set $i = 1$

6. Draw $U_{i}$ from a uniform distribution $~ U (0, 1)$

7. Generate the first failure time by:

t_{i} = {[\frac{- \log (U_{i})}{z_{j} λ} + {(t_{0})}^{ρ}]}^{1 / ρ} .

(20)

8. If $t_{i} > τ$, stop the simulation; otherwise set $i = i + 1$

9. Draw $U_{i}$ from a uniform distribution $~ U (0, 1)$

10. Generate the next failure time as:

t_{i} = {[\frac{- \log (U_{i})}{z_{j} λ} + {(t_{(i - 1)})}^{ρ}]}^{1 / ρ} .

(21)

11. Repeat steps 8–10 until $t_{i} > τ$ .
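The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the frailty in step 3 is drawn from a gamma distribution with mean 1 and variance $θ$ (shape $1 / θ$, scale $θ$) only because the standard library provides it, and a generated time that exceeds $τ$ is discarded rather than recorded.

```python
import math
import random


def simulate_component(lam, rho, z, tau, rng):
    """Failure times in (0, tau] for one component via the recurrence of equation (19)."""
    times = []
    t_prev = 0.0                                  # step 4: t_0 = 0
    while True:
        u = 1.0 - rng.random()                    # steps 6 and 9: U in (0, 1], avoids log(0)
        t_next = (-math.log(u) / (z * lam) + t_prev ** rho) ** (1.0 / rho)
        if t_next > tau:                          # step 8: stop once tau is exceeded
            return times
        times.append(t_next)                      # steps 7 and 10: record the failure time
        t_prev = t_next


def simulate_system(m, lam, rho, theta, tau, seed=0):
    """Simulate m independent components, each with its own frailty z_j."""
    rng = random.Random(seed)
    system = []
    for _ in range(m):
        # Step 3 (assumed gamma frailty): mean 1, variance theta.
        z = rng.gammavariate(1.0 / theta, theta)
        system.append(simulate_component(lam, rho, z, tau, rng))
    return system


data = simulate_system(m=50, lam=1.0, rho=0.5, theta=0.3, tau=10.0)
print(len(data), sum(len(times) for times in data))
```

To simulate from the IG frailty model instead, only the line that draws `z` needs to change to an IG draw with mean 1 and variance $θ$; the recurrence itself is unaffected.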

Evaluation of IG estimator

To assess the performance of the IG frailty model’s estimator developed earlier in this paper, the Bias (%) of the mean of the estimates of each parameter will be determined. Given the input value of an arbitrary parameter $p$ with its estimator $\hat{p}$, the Bias of the estimates after 1000 simulations will be calculated using:

Bias (p) = \frac{\sum_{n = 1}^{1000} {\hat{p}}_{n}}{1000} - p .

(22)
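Equation (22) amounts to the following small computation. The helper `bias_percent`, which expresses the bias relative to the input value, is an assumed reading of the percentage figures and is not defined in the paper; the three replicate estimates are made-up numbers for illustration.

```python
def bias(estimates, true_value):
    """Equation (22): mean of the replicate estimates minus the input value."""
    return sum(estimates) / len(estimates) - true_value


def bias_percent(estimates, true_value):
    """Bias as a percentage of the input value (assumed definition)."""
    return 100.0 * bias(estimates, true_value) / true_value


# Hypothetical replicate estimates of rho when the input value is 0.5.
rho_hats = [0.52, 0.48, 0.53]
print(bias(rho_hats, 0.5), bias_percent(rho_hats, 0.5))
```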

We assess the IG estimators by observing the $Bias (p)$ from the mean of 1000 estimates of $ρ$ and $θ$ . The results are presented in Figure 1. The input values for $λ$ , $ρ$ and $θ$ are 1, $(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)$ and $(0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7)$ , respectively.

Figure 1.

Plot of the Bias on estimates from IG frailty model estimators: (a) bias on ρ input values and (b) bias on θ input values.

From Figure 1(a) we observe that the bias in the estimated $ρ$ values was 2.5% for a sample size of 10; however, it reduced as the sample size increased. From Figure 1(b), the bias in the estimated $θ$ values ranged between $- 0.3$ and $0.1$ when the sample size was set to 10, but the estimated values rapidly improved and converged to the input value as the sample size increased. We see that the estimator is robust to different choices of $ρ$ and $θ$ and that, beyond a sample size of 10, the estimator performs well.

Mis-specification study of gamma and IG frailty models

In this section, we assess the robustness of the gamma and IG estimators to mis-specification of the frailty distribution. One of the models is taken to be the correct frailty model from which the data are generated; the other frailty model is in consequence wrongly specified. The performance of the wrongly specified model will be assessed in terms of the model’s goodness of fit to the generated data and its accuracy when predicting the expected number of component failures. Since model selection using only maximum likelihood could be misleading due to variation in data, the sample sizes will be chosen to be sufficiently large in order to reduce the probability of selecting the wrong model.⁴⁵

Analysis of the robustness of each wrongly specified frailty model (gamma or IG) is conducted by assessing the proportion of selections of each wrongly specified model using the Akaike information criterion (AIC). AIC is used for the goodness-of-fit (GOF) test because it is one of the commonly used information-based criteria for model selection⁴⁶ and has been widely used either on its own or together with other tests (e.g. Barabadi et al.,²² Izumi and Ohtaki⁴⁷). AIC is similar to the Bayesian information criterion (BIC) in the sense that they both compare maximum likelihood values to select the appropriate model. Compared to AIC, BIC penalises model complexity more heavily, meaning that more complex models will perform worse and will, in turn, be less likely to be selected.⁴⁸ In this paper, the gamma and IG frailty models considered have the same number of parameters and a similar level of complexity. Thus, model selections from BIC will not differ from those from AIC because the penalty term in the BIC will have the same effect on the two models’ values. AIC has been widely used to compare frailty models in the literature.^49,50 The expression of AIC is given by⁴⁶:

AIC = - 2 \log (L) + 2 p,

(23)

where $L$ is the maximised likelihood value and $p$ is the number of parameters in the model. When two models are compared, the model with the smallest AIC value is considered to have the better fit to the data. Further information on the AIC can be found elsewhere.⁴⁶ We will generate 1000 simulated datasets for fixed input values of $λ$, $ρ$ and $θ$ and then count the number of times each fitted model is chosen by the AIC. Out of the 1000 simulated datasets, we will note the number of times the wrongly specified model is selected as the better model.
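The counting procedure can be sketched as below. The sketch assumes the maximised log-likelihoods of the correct and wrongly specified models are already available for each simulated dataset; the function names, the default of three parameters ($λ$, $ρ$, $θ$) and the example values are illustrative assumptions, not results from the paper.

```python
def aic(log_likelihood, n_params):
    """Equation (23): AIC = -2 log(L) + 2p."""
    return -2.0 * log_likelihood + 2.0 * n_params


def wrong_model_selection_rate(loglik_true, loglik_wrong, n_params=3):
    """Share of datasets in which the mis-specified model has the smaller AIC."""
    wins = sum(
        aic(lw, n_params) < aic(lt, n_params)
        for lt, lw in zip(loglik_true, loglik_wrong)
    )
    return wins / len(loglik_true)


# Hypothetical maximised log-likelihoods for three simulated datasets.
rate = wrong_model_selection_rate([-10.0, -12.0, -9.0], [-11.0, -11.5, -9.5])
print(rate)
```

Because both frailty models carry the same number of parameters, the penalty term cancels and the comparison reduces to comparing the maximised log-likelihoods.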

Furthermore, we examine the robustness of each wrongly specified frailty model in terms of their prediction of the expected number of component failures in the simulated data. The expected number of component failures in the system in a specific interval is given as:

E [N_{S} (t_{1}, t_{2})] = \sum_{j = 1}^{m} E [N_{j} (t_{1}, t_{2})],

(24)

where $E [N_{j} (t_{1}, t_{2})]$ is the expected number of failures of the $j^{th}$ component between times $t_{1}$ and $t_{2}$. For the gamma frailty model, $E [N_{j} (t_{1}, t_{2})] = \frac{1}{θ} \log (θ Λ_{0} (t_{1}, t_{2}) + 1)$, and for the IG frailty model, $E [N_{j} (t_{1}, t_{2})] = \frac{1}{θ} (\sqrt{1 + 2 θ Λ_{0} (t_{1}, t_{2})} - 1)$. $E [N_{j} (t_{1}, t_{2})]$ is derived from the marginal reliability function for each component. We present the derivation of $E [N_{j} (t_{1}, t_{2})]$ and the marginal reliability function in Appendices B and C.
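As a rough numerical illustration of the two per-component expressions, the sketch below evaluates them for the power law baseline, taking $Λ_{0} (t_{1}, t_{2}) = λ (t_{2}^{ρ} - t_{1}^{ρ})$. It assumes the gamma expression is read as $\frac{1}{θ} \log (1 + θ Λ_{0})$, the negative log of the gamma marginal reliability $(1 + θ Λ_{0})^{- 1 / θ}$; the function names and parameter values are illustrative and not taken from the paper.

```python
import math


def cumulative_baseline(t1, t2, lam, rho):
    """Lambda_0(t1, t2) = lam * (t2**rho - t1**rho) for the power law model."""
    return lam * (t2 ** rho - t1 ** rho)


def expected_failures_gamma(t1, t2, lam, rho, theta):
    """Gamma frailty: (1/theta) * log(1 + theta * Lambda_0) (assumed reading)."""
    return math.log1p(theta * cumulative_baseline(t1, t2, lam, rho)) / theta


def expected_failures_ig(t1, t2, lam, rho, theta):
    """IG frailty: (1/theta) * (sqrt(1 + 2 * theta * Lambda_0) - 1)."""
    big_lambda = cumulative_baseline(t1, t2, lam, rho)
    return (math.sqrt(1.0 + 2.0 * theta * big_lambda) - 1.0) / theta


def expected_system_failures(per_component, m, *args):
    """Equation (24) for m components sharing the same parameters."""
    return m * per_component(*args)


g = expected_failures_gamma(0.0, 10.0, 1.0, 0.5, 0.3)
ig = expected_failures_ig(0.0, 10.0, 1.0, 0.5, 0.3)
print(g, ig, expected_system_failures(expected_failures_ig, 50, 0.0, 10.0, 1.0, 0.5, 0.3))
```

Both expressions approach $Λ_{0} (t_{1}, t_{2})$ as $θ$ tends to zero, so the two models give nearly the same predictions when heterogeneity is low.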

We compare the predicted results with those from the correct model using root mean squared error (RMSE). We then assess the proportion of the wrong model selected as the better model by the RMSE. The expression of RMSE is given by:

RMSE = \sqrt{\frac{\sum_{i = 1}^{n_{I}} {(f_{i} - o_{i})}^{2}}{n_{I}}}

(25)

where $f_{i} = E [N_{S} (t_{i}, t_{i + 1})]$ is the forecast (the expected number of component failures in the interval $(t_{i}, t_{i + 1})$) and $o_{i}$ is the observed number of component failures in $(t_{i}, t_{i + 1})$ from the simulated data. $n_{I}$ is the number of time intervals of equal length into which the observed data are grouped, and $t_{n_{I}} = τ$. To compute the RMSE, the differences between the forecasts and the observed numbers of component failures are squared and averaged over the number of time intervals to get the mean squared error (MSE); the RMSE is the square root of the MSE.⁵¹ The model with the smallest RMSE value has the best prediction accuracy.
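A minimal sketch of this computation is given below. The helper `interval_counts`, which pools failure times over components into equal-length intervals, is a hypothetical name introduced for illustration, and the failure times in the example are made up.

```python
import math


def interval_counts(system, tau, n_intervals):
    """Observed failures per equal-length interval, pooled over all components."""
    width = tau / n_intervals
    counts = [0] * n_intervals
    for times in system:
        for t in times:
            # A failure exactly at tau is placed in the last interval.
            k = min(int(t / width), n_intervals - 1)
            counts[k] += 1
    return counts


def rmse(forecasts, observed):
    """Equation (25): square root of the mean squared forecast error."""
    if not forecasts or len(forecasts) != len(observed):
        raise ValueError("forecasts and observed must be non-empty and equal in length")
    mse = sum((f - o) ** 2 for f, o in zip(forecasts, observed)) / len(forecasts)
    return math.sqrt(mse)


# Hypothetical failure times for two components observed up to tau = 10.
observed = interval_counts([[0.5, 1.5, 9.9], [10.0]], tau=10.0, n_intervals=5)
print(observed, rmse([1.0, 1.0, 1.0, 1.0, 1.0], observed))
```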

The performance of each specified model is assessed in four cases, each of which fixes either $ρ$ or $θ$ .

Case one involves setting $θ = 0.3$ to reflect low heterogeneity between components and modifying $ρ$ .

Case two involves setting $θ = 3$ to reflect high heterogeneity between components and modifying $ρ$ .

Case three involves setting $ρ = 0.3$ to reflect components’ early life behaviour with frequent failures and modifying $θ$ .

Case four involves setting $ρ = 0.9$ to reflect components with similar behaviour as those in the mid-life phase and modifying $θ$ .

In each case, when one of the two parameters is fixed, the other is varied. For fixed settings of $ρ$ , the $θ$ values analysed are $(0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4, 2.7, 3)$ . For fixed settings of $θ$ , the $ρ$ values analysed are $(0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)$ . The result for each case is presented in Figures 2 to 5.
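One way to generate data for a single parameter setting is sketched below in Python. This is a sketch under stated assumptions rather than the simulation procedure used in the study: it assumes a power law baseline with $Λ_{0} (t) = λ t^{ρ}$ , a frailty with mean 1 and variance $θ$ , the Michael-Schucany-Haas method for drawing IG variates, and time transformation of a unit-rate Poisson process for the failure times. All function names are hypothetical.

```python
import math
import random


def draw_frailty(dist, theta, rng):
    """Draw a frailty value with mean 1 and variance theta."""
    if dist == "gamma":
        # Shape 1/theta and scale theta give mean 1 and variance theta.
        return rng.gammavariate(1.0 / theta, theta)
    if dist == "ig":
        # Michael-Schucany-Haas sampler for IG(mean=1, shape=1/theta).
        shape = 1.0 / theta
        y = rng.gauss(0.0, 1.0) ** 2
        x = 1.0 + y / (2.0 * shape) - math.sqrt(4.0 * shape * y + y * y) / (2.0 * shape)
        return x if rng.random() <= 1.0 / (1.0 + x) else 1.0 / x
    raise ValueError("dist must be 'gamma' or 'ig'")


def simulate_component(lam, rho, z, tau, rng):
    """Failure times on (0, tau] for intensity z * lam * rho * t**(rho - 1)."""
    times, cum = [], 0.0
    while True:
        cum += rng.expovariate(1.0)           # next arrival of a unit-rate process
        t = (cum / (z * lam)) ** (1.0 / rho)  # invert z * lam * t**rho = cum
        if t > tau:
            return times
        times.append(t)


def simulate_system(m, lam, rho, theta, tau, dist, seed=0):
    """Failure times for m components, each with its own frailty draw."""
    rng = random.Random(seed)
    return [
        simulate_component(lam, rho, draw_frailty(dist, theta, rng), tau, rng)
        for _ in range(m)
    ]
```

Calling `simulate_system` once per replicate, with `dist` set to the true frailty distribution, yields the datasets to which both models are then fitted.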

Figure 2.

Plot of probability of selecting the wrong model when heterogeneity is low $θ = 0.3$ : (a) IG model selections by AIC, (b) gamma model selections by AIC, (c) IG model selections by RMSE and (d) gamma model selections by RMSE.

Figure 3.

Plot of probability of selecting the wrong model when heterogeneity is high $θ = 3$ : (a) IG model selections by AIC, (b) gamma model selections by AIC, (c) IG model selections by RMSE and (d) gamma model selections by RMSE.

Figure 4.

Plot of probability of selecting the wrong model when early life failures are considered $ρ = 0.3$ : (a) IG model selections by AIC, (b) gamma model selections by AIC, (c) IG model selections by RMSE and (d) gamma model selections by RMSE.

Figure 5.

Plot of probability of selecting the wrong model when component failures are similar to the mid-life phase $ρ = 0.9$ : (a) IG model selections by AIC, (b) gamma model selections by AIC, (c) IG model selections by RMSE and (d) gamma model selections by RMSE.

Case One: Low heterogeneity

In case one, we set $θ = 0.3$ to reflect low heterogeneity between components. The results are presented in Figure 2. For low heterogeneity, the proportion of wrong model selections, whether IG or gamma, is zero for $ρ$ values from 0.4 to 0.9. In contrast, when $ρ = 0.3$ the proportion of wrong model selections was 1.5% for IG and 1% for gamma for sample sizes less than 25, and zero for sample sizes greater than 25. Thus, for data with low heterogeneity, when either the IG or the gamma frailty model is wrongly specified, the chance of the wrong model being selected is very small, whether in terms of model fit or prediction.

Case Two: High heterogeneity

For case two, we set $θ = 3$ to reflect high heterogeneity between components. The results are presented in Figure 3. From Figure 3(a) and (b), we see that in terms of model fit, the proportion of cases where the wrong model is selected when the sample size is less than 50 ranged between 3% and 11% for the IG model and between 1% and 10% for the gamma model. However, as the sample size is increased, the proportion reduces to zero for $ρ$ values from 0.3 to 0.8. For $ρ = 0.9$ the proportion of wrong models selected declines more slowly. Because $ρ = 0.9$ mimics the behaviour of components in the mid-life stages, failure observations in the data are few. The few failure occurrences, together with high heterogeneity between components, increase the possibility of selecting the wrong model. Even so, the proportion of wrong models selected reduces as the sample size increases.

The results for determining the appropriate model when using prediction as the measurement are presented in Figure 3(c) and (d). We observe a similar reduction in incorrectly selected models when gamma is wrongly specified. When IG is wrongly specified, the proportion of IG model selection is zero for $ρ$ values from 0.4 to 0.9. In contrast, when $ρ = 0.3$ the proportion of IG model selection was up to 6% for sample size less than 50 and zero otherwise. The 6% selection of the IG model may be due to the sensitivity of the RMSE to outliers in the predicted number of component failures. For data with high heterogeneity, the probability of selecting the wrong model whether for model fit or for prediction is low and only happens when the sample size is small.

Case Three: Component early life behaviour

In case three, we set $ρ$ to 0.3 to reflect components’ early life behaviour in which failures occur due to component defects or installation issues. The results for case three are presented in Figure 4. From Figure 4(a) to (d) we see that in terms of prediction or model fit, the proportion of wrong model selections (whether IG or gamma) could be as high as 10% when the sample size is less than 50 and heterogeneity is high, that is, $θ$ close to 3. However, as the sample size is increased from 50, the likelihood of selecting the wrong model reduces to zero. More broadly, we see that the proportion of wrong models selected increases as $θ$ increases from 0.3 to 3. This supports our findings when we compare cases one and two.

Case Four: Component mid-life behaviour

In case four, we set $ρ = 0.9$ to reflect components nearing the mid-life stages, where the times between component failures are almost constant. The results are presented in Figure 5. From Figure 5(a) and (b) we see that in terms of model fit, the proportion of selections of the wrong model increased as heterogeneity levels increased, reaching 12% for the IG model and 10% for the gamma model. In contrast, as the sample size is increased from 10 to 100 the proportion of wrong model selections reduced for all $θ$ values from 0.3 to 3.

Summary of analysis

Several conclusions can be drawn from our analysis. First, when the sample size increases above 50, the likelihood of selecting the wrong model reduces significantly for almost all choices of parameters. This provides confidence that when we have many units/components, we are likely to select the correct model. Second, the effect of heterogeneity for low sample sizes is noticeable. As we increase the heterogeneity from $θ = 0.3$ to $θ = 3$ , the probability of selecting the incorrect model increases. Third, we see little difference between the ability of the AIC and the RMSE to select the correct model. As a selection tool, the RMSE method is a slight improvement on the AIC, particularly when the underlying distribution is an IG. It also seems to be better than the AIC for identifying the correct distribution when heterogeneity is high. When the distribution is known to be gamma, there is little difference between the two methods for selecting the correct distribution.

Application to classic dataset

We compare the two frailty models using a dataset from the literature, namely the failure times of air conditioner components from a set of airplanes studied by Lindqvist et al.²⁸ Lindqvist et al.²⁸ used these data to compare a heterogeneous trend-renewal process, an NHPP model with gamma distributed random effect, and a homogeneous Poisson process with gamma distributed random effect. They found that an NHPP model with gamma distributed random effect provides a better fit than an ordinary NHPP model, identifying heterogeneity among components in the data. Here we compare the IG and gamma frailty models using the successive failure times before truncation, as given in Table 1, to see if the IG provides a better fit. Note that it is not possible to say whether the true data generating process is IG or gamma; we merely wish to explore whether our model produces any improvement on existing methods.

Table 1.

Failure times for Air conditioners in 13 Airplanes.

Airplane number
7907	7908	7909	7910	7911	7912	7913	7914	7915	7916	7917	8044	8045
194	413	90	74	55	23	97	50	359	50	130	487	102
209	427	100	131	375	284	148	94	368	304	623	505	311
250	485	160	179	431	371	159	196	380	309		605	325
279	522	346	208	535	378	163	268	650	592		612	382
312	622	407	710	755	498	304	290		627		710	436
493	687	456	722	994	512	322	329		639		715	468
	696	470	792		574	464	332				800	535
	865	494	813		621	532	347				891	594
		550	842		846	609	544				934	728
		570			917	689	732					880
		649				690	811					907
		733				706	899					921
		777				812	945
		836					950
		865					955
		983					991

The results of fitting the IG frailty model and the gamma frailty model are summarised in Table 2. Parameter estimates for the gamma frailty model were obtained as $λ = 3.353 \times 10^{- 3}$ , $ρ = 1.1424$ and $θ = 0.1334$ while the parameter estimates of the IG frailty model were obtained as $λ = 5.861 \times 10^{- 3}$ , $ρ = 1.1186$ and $θ = 0.9805$ . The values of $θ$ from the two frailty models show a low level of heterogeneity between the components of the system. Whilst Lindqvist et al.²⁸ found the gamma frailty model to be the better model for the data when compared to the ordinary NHPP model, the AIC and RMSE values in Table 2 show that the IG frailty model has a marginally better fit than the gamma frailty model.

Table 2.

Parameter estimates of gamma and IG frailty models when fitted to Air conditioner failure time data.

	$λ$	$ρ$	Var $θ$	AIC	RMSE
gamma	0.003353	1.142425	0.133469	1335.281	2.893
IG	0.005861	1.118666	0.980574	1331.92	2.856

Furthermore, we compared predictions of the expected number of failures from the two models with observed data using RMSE. We found that there is little difference in the predictions of the observed number of system failures from the gamma and IG frailty models when time is less than 500 days (see Figure 6 for the plot of the predictions). However, in line with Hougaard’s³¹ finding that the relative frailty distribution among survivors from an IG model becomes more homogeneous with time, we found that once we extend beyond 500 days the survivors become more homogeneous and the IG model substantially outperforms the gamma. Based on the AIC and RMSE values, one can infer that for the air conditioner data the IG frailty model is marginally better in terms of model fit, and better in terms of predictions of the expected number of component failures when the number of days is greater than 500.

Figure 6.

Plot of the expected number of failures predicted by the IG and gamma frailty models compared to the observed cumulative number of failures within the interval 0–1000 h (system level prediction).

Equation (24) is useful for predicting the expected number of failures at the system level. For component level prediction, however, it gives the same predicted number of failures for every component, even if there are clear differences in the pattern of failure occurrences. The reason is that the value of the variance parameter $θ$ in equation (24) is the same for each component. Thus, predictions of the expected number of failures for individual components are computed using an empirical Bayes approach, which accounts for variability through the mean frailty estimates (see Carlin and Louis⁵² for more on empirical Bayes).

Using an empirical Bayes approach, we developed a method for individual component prediction of the mean residual life (MRL) and expected number of failures conditional on the expected frailty value. The developed method is then applied to make predictions for components in the Air conditioner data. The developed method uses Bayes theorem to update the frailty distribution for each $j$ component based on the observed data from the $j^{th}$ component. With the updated frailty distribution, we get a point estimate (in this case the mean of the distribution) of the value of the frailty term for the $j^{th}$ component. Then, we use the new frailty term to compute the intensity, survival function and make individual event prediction of the MRL and expected number of failures for each $j^{th}$ component.

Using Bayes theorem, we can have the posterior distribution of the frailty term $z_{j}^{*}$ of the $j^{th}$ component as:

\begin{matrix} h_{j} (z_{j}^{*} | θ) = \frac{L_{j} (λ_{0} (t_{ij}) | z_{j}) f (z_{j}; θ)}{\int_{0}^{\infty} L_{j} (λ_{0} (t_{ij}) | z_{j}) f (z_{j}; θ) d z_{j}}, \end{matrix}

(26)

where $h_{j} (z_{j}^{*} | θ)$ is the posterior distribution of $Z_{j}^{*}$ , $L_{j} (λ_{0} (t_{ij}) | z_{j})$ is the data likelihood for component $j$ , and $f (z_{j}; θ)$ is the prior distribution (i.e. the gamma or IG distribution). $L_{j (marg)} = \int_{0}^{\infty} L_{j} (λ_{0} (t_{ij}) | z_{j}) f (z_{j}; θ) d z_{j}$ is the marginal likelihood for component $j$ . We present the derivations of $h_{j} (z_{j}^{*} | θ)$ and the point estimate of the value of the frailty term for IG and gamma in Appendices B and C respectively. We used the maximum likelihood methods presented in the third section to estimate $θ$ and the parameters of $λ_{0} (t_{ij})$ , then we used the Bayesian method to estimate the updated frailty term. Using frequentist methods to estimate the parameters and a Bayesian method to estimate the value of the random effects is common in mixed-effects models (see e.g. Deep et al.,¹¹ Gebraeel et al.⁵³).

The expected number of failures of the $j^{th}$ component conditional on the expected frailty value between time $t_{1}$ and $t_{2}$ is given as:

E [N_{j} (t_{1}, t_{2} | \bar{z_{j}^{*}})] = \int_{t_{1}}^{t_{2}} \bar{z_{j}^{*}} λ_{0} (t) dt .

(27)

System level prediction of expected number of component failures is $E [N_{S} (t_{1}, t_{2})] = \sum_{j = 1}^{m} E [N_{j} (t_{1}, t_{2})] .$ The expression of the MRL is given as:

mrl_{j} (τ) = \int_{τ}^{\infty} R_{j} (x | τ) dx = \frac{\int_{τ}^{\infty} R_{j} (x) dx}{R_{j} (τ)},

(28)

where $τ$ is the end of the observation period and $R_{j} (\cdot) = \exp (- \bar{z_{j}^{*}} Λ_{0} (\cdot)) .$

For each of the components (Airplanes) in the Air conditioner data, we predicted the mean residual life and compared predictions of the expected number of failures from the IG and gamma models with observed data using RMSE. The mean frailty estimate $\bar{z_{j}^{*}}$ , the mean residual life prediction, and the RMSE value for the expected number of failures for each Airplane are presented in Table 3. The RMSE values in Table 3 show that the IG frailty model is better than the gamma frailty model at predicting the expected number of failures for 11 of the 13 components. For illustration we present the plots of the predictions of the expected number of failures for four Airplanes (Airplane 7909, Airplane 7914, Airplane 7915 and Airplane 7917) in Figure 7(a) to (d). From the figures, we see a difference between the gamma and IG frailty models in the predictions of the observed number of failures for each of the Airplanes when time is more than 250 days.

Table 3.

Mean frailty $\bar{z_{j}^{*}}$ , Mean residual life and RMSE value for the expected number of failures of each Airplane.

	$\bar{z_{j}^{*}}$		Mean residual life (Days)		RMSE value for the expected no. of failures
Airplane ID	gamma	IG	gamma	IG	gamma	IG
Plane.7907	0.8197	0.4849	117.3	136.5987	1.0137	1.0012
Plane.7908	0.9412	0.6105	102.3372	108.7911	0.8338	0.8375
Plane.7909	1.4272	1.1561	67.8388	56.3213	0.8103	0.7582
Plane.7910	1.0019	0.6758	96.2018	98.3719	0.8536	0.8498
Plane.7911	0.8197	0.4849	117.3	136.5987	0.6726	0.6603
Plane.7912	1.0627	0.7424	90.7624	89.6296	0.7873	0.7863
Plane.7913	1.2449	0.9469	77.5854	70.3957	1.3625	1.3511
Plane.7914	1.4272	1.1561	67.8388	56.3213	1.3958	1.3664
Plane.7915	0.6982	0.3696	137.3962	178.5138	0.6737	0.6463
Plane.7916	0.8197	0.4849	117.3	136.5987	0.807	0.7939
Plane.7917	0.5767	0.2703	165.8234	242.7943	0.4922	0.4183
Plane.8044	1.0019	0.6758	96.2018	98.3719	0.774	0.7828
Plane.8045	1.1842	0.8781	81.5381	75.8732	0.7651	0.7569

Figure 7.

Plot of the expected number of failures predicted by the IG and gamma frailty models compared to the observed cumulative number of failures within the interval 0–1000 h (component level prediction): (a) Airplane 7909, (b) Airplane 7914, (c) Airplane 7915 and (d) Airplane 7917.

In terms of mean residual life predictions for each Airplane, we applied equation (28) for the IG and gamma frailty models and found that there are differences in the MRL predictions of the two models. For some Airplanes, for example Airplanes 7910, 7912 and 8044, the difference was as little as one or two days. For other Airplanes, such as 7909, 7911, 7913, 7914, 7915, 7916 and 7917, the differences ranged from about 7 to 77 days. However, given that the RMSE values favour the IG frailty model for most of the Airplanes, the MRL values predicted by the IG frailty model may be chosen for optimal maintenance decision making for most of the Airplanes.

Conclusion

In this paper, we applied the IG frailty model for analysing failure data from heterogeneous repairable systems. The IG frailty model, which combines the power law model and IG distribution, assumes that the relative frailty distribution among survivors becomes more homogeneous over time. This is in contrast to the commonly used gamma frailty models which assume that the relative frailty distribution among survivors is independent of age. The objectives of this paper were to evaluate the application of the IG frailty model for analysing failure data from heterogeneous repairable systems, compare its results with the gamma frailty model, and develop a method for event prediction based on the IG and gamma frailty models. To accomplish the objectives, we developed the IG frailty model and a method for parameter estimation of the IG frailty model using maximum likelihood estimation and numerical methods. A comparison of the gamma and IG frailty models was conducted to examine whether the two models are good alternatives to each other. The statistical fit of the gamma and IG frailty models as well as their prediction performance was studied and compared. We found that regardless of the degree of heterogeneity or frequency of failures when early component behaviour is concerned, the probability of selecting a wrong model is low whether for model fit or for prediction purposes. A wrong model is only selected when the sample size is small. We applied the two frailty models to a classic dataset where the gamma frailty model had been studied. We found that the IG frailty model was better in terms of model fit, and better in terms of prediction when the number of days was greater than 500.

This research identified a number of areas for future research. First, the IG frailty model could be integrated into other degradation processes to explore its advantages more broadly in reliability modelling. Instead of using the inverse Gaussian or gamma distributions, further research could investigate other positively skewed distributions to describe unobserved heterogeneity. In addition, it would be valuable to further investigate the application of the IG frailty model. For example, when a system is operating in a varying environment, its degradation parameters are usually random and dependent on the operating environment, which can potentially be characterised by the IG frailty model. Another interesting research direction lies in the way uncertainty is described in reliability modelling. Instead of using a probability distribution, fuzzy sets or interval values could be used to model the uncertain parameters or coefficients.

Footnotes

Appendix A

The derivation of the IG frailty model’s unconditional log likelihood function is presented below.

Since $z_{j}$ is unobservable from the data, the contribution of the $j^{th}$ component to the full likelihood is obtained when the marginal likelihood of the $j^{th}$ component is derived. The marginal likelihood of the $j^{th}$ component is derived after integrating out the random effects from the conditional likelihood function. The conditional likelihood function of the $j^{th}$ component is given as:

(29)

L_{j} (θ | λ_{0} (t_{ij})) = \left( \prod_{i = 1}^{n_{j}} λ_{0} (t_{ij}) \right) \int_{0}^{\infty} z_{j}^{n_{j}} \exp (- z_{j} Λ_{0} (τ)) f (z_{j}; θ) d z_{j},

where $f (z_{j}; θ)$ is the IG density.

(30)

L_{j} (θ | λ_{0} (t_{ij})) = \left( \prod_{i = 1}^{n_{j}} λ_{0} (t_{ij}) \right) \int_{0}^{\infty} \frac{1}{\sqrt{2 π θ}} z_{j}^{n_{j} - \frac{3}{2}} \exp \left( - z_{j} Λ_{0} (τ) - \frac{(z_{j} - 1)^{2}}{2 z_{j} θ} \right) d z_{j} .

The marginal likelihood of the $j^{th}$ component is given as:

(31)

L_{j (marg)} = \frac{\left( \prod_{i = 1}^{n_{j}} λ_{0} (t_{ij}) \right) 2 \exp (\frac{1}{θ})}{\sqrt{2 π θ}} (2 θ Λ_{0} (τ) + 1)^{- \frac{n_{j} - 1 / 2}{2}} k_{(n_{j} - 1 / 2)} \left[ \frac{\sqrt{1 + 2 θ Λ_{0} (τ)}}{θ} \right]

where $k_{(n_{j} - 1 / 2)} [\cdot]$ is a modified Bessel function of the second kind and $\int_{0}^{\infty} x^{s - 1} \exp (- a x^{h} - b x^{- h}) dx = \frac{2}{h} (\frac{b}{a})^{\frac{s}{2 h}} k_{(\frac{s}{h})} [2 \sqrt{ab}]$ (see Shoukri et al.⁵⁴). Here $s = n_{j} - 1 / 2$ , $h = 1$ , $a = \frac{2 θ Λ_{0} (τ) + 1}{2 θ}$ and $b = \frac{1}{2 θ}$ .

The total likelihood function for all the components is given by:

(32)

L = \prod_{j = 1}^{m} L_{j (marg)} = \prod_{j = 1}^{m} \frac{\left( \prod_{i = 1}^{n_{j}} λ_{0} (t_{ij}) \right) 2 \exp (\frac{1}{θ})}{\sqrt{2 π θ}} (2 θ Λ_{0} (τ) + 1)^{- \frac{n_{j} - 1 / 2}{2}} k_{(n_{j} - 1 / 2)} \left[ \frac{\sqrt{1 + 2 θ Λ_{0} (τ)}}{θ} \right]

Taking the logarithm of the likelihood function of equation (32) results in:

(33)

l = \sum_{j = 1}^{m} \sum_{i = 1}^{n_{j}} \log (λ_{0} (t_{ij})) + \sum_{j = 1}^{m} \log \left( \frac{2 \exp (\frac{1}{θ})}{\sqrt{2 π θ}} (1 + 2 θ Λ_{0} (τ))^{- \frac{n_{j} - 1 / 2}{2}} k_{(n_{j} - 1 / 2)} \left[ \frac{\sqrt{1 + 2 θ Λ_{0} (τ)}}{θ} \right] \right)
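For illustration, the log-likelihood in equation (33) can be coded as in the Python sketch below. It assumes a power law baseline with $λ_{0} (t) = λ ρ t^{ρ - 1}$ and $Λ_{0} (τ) = λ τ^{ρ}$ , and it uses the finite-sum expression for modified Bessel functions of the second kind of half-integer order, which is sufficient here because the order is always $n_{j} - 1 / 2$ . The function names are hypothetical, and the sketch is not the numerical routine used to obtain the estimates in this paper.

```python
import math


def log_bessel_k_half(order_twice, x):
    """log K_nu(x) for half-integer nu = order_twice / 2 (order_twice odd).

    Uses K_{m+1/2}(x) = sqrt(pi / (2x)) * exp(-x)
        * sum_{k=0}^{m} (m+k)! / (k! (m-k)!) * (2x)**(-k), and K_{-nu} = K_nu.
    """
    m = (abs(order_twice) - 1) // 2
    series = sum(
        math.factorial(m + k) / (math.factorial(k) * math.factorial(m - k))
        * (2.0 * x) ** (-k)
        for k in range(m + 1)
    )
    return 0.5 * math.log(math.pi / (2.0 * x)) - x + math.log(series)


def ig_frailty_loglik(failure_times, tau, lam, rho, theta):
    """Equation (33); failure_times is a list of per-component time lists."""
    big_lambda = lam * tau ** rho                  # Lambda_0(tau), assumed power law
    root = math.sqrt(1.0 + 2.0 * theta * big_lambda)
    total = 0.0
    for times in failure_times:
        n_j = len(times)
        total += sum(math.log(lam * rho * t ** (rho - 1.0)) for t in times)
        total += (
            math.log(2.0) + 1.0 / theta - 0.5 * math.log(2.0 * math.pi * theta)
            - (n_j - 0.5) * math.log(root)         # (1 + 2*theta*Lambda_0)^(-(n_j - 1/2)/2)
            + log_bessel_k_half(2 * n_j - 1, root / theta)
        )
    return total
```

A simple check on the sketch is that a component with no failures contributes $\log R_{j} (τ) = \frac{1}{θ} (1 - \sqrt{1 + 2 θ Λ_{0} (τ)})$ , the log of the marginal reliability in equation (36). In practice the function would be passed, with its sign reversed, to a numerical optimiser over $λ$ , $ρ$ and $θ$ .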

Appendix B

The expressions for the posterior distribution, expected frailty value, expected number of failures and marginal reliability function for the IG frailty model are derived as follows.

The marginal reliability function of the $j^{th}$ component is given as:

(34)

R_{j} (t) = \int_{0}^{\infty} R_{j} (t | z_{j}) f (z_{j}; θ) d z_{j},

where $f (z_{j}; θ)$ is the IG distribution and $R_{j} (t | z_{j}) = \exp (- z_{j} Λ_{0} (t)) .$ Thus

(35)

R_{j} (t) = \int_{0}^{\infty} \frac{1}{\sqrt{2 π θ}} z_{j}^{- \frac{3}{2}} \exp \left( - z_{j} Λ_{0} (t) - \frac{(z_{j} - 1)^{2}}{2 z_{j} θ} \right) d z_{j},

and

(36)

R_{j} (t) = \exp \left( \frac{1}{θ} (1 - \sqrt{1 + 2 θ Λ_{0} (t)}) \right),

where $R_{j} (t)$ is derived by replacing $s$ with $Λ_{0} (t)$ in the Laplace transform of the IG distribution $L_{j} [s] = \exp (\frac{1}{θ} (1 - \sqrt{1 + 2 θ s}))$ (see Mundo et al.⁵⁵ for the derivation of the Laplace transforms of the IG and gamma frailty models).

The marginal expected number of failures of the $j^{th}$ component between time $t_{1}$ and $t_{2}$ is given as:

(37)

\begin{matrix} E [N_{j} (t_{1}, t_{2})] = Λ_{j} (t_{1}, t_{2}) = - \log (R_{j} (t_{1}, t_{2})) \\ = \frac{1}{θ} (\sqrt{1 + 2 θ Λ_{0} (t_{1}, t_{2})} - 1) . \end{matrix}

The marginal expected number of component failures in the system between time $t_{1}$ and $t_{2}$ is given as:

(38)

\begin{matrix} E [N_{S} (t_{1}, t_{2})] \\ = \sum_{j = 1}^{m} E [N_{j} (t_{1}, t_{2})] = \frac{m}{θ} (\sqrt{1 + 2 θ Λ_{0} (t_{1}, t_{2})} - 1) . \end{matrix}

where m is the number of components in the system.

Next we derive the posterior distribution for $z_{j}^{*}$ given an IG prior distribution. Replacing the expressions of $L_{j (marg)}$ , $L_{j} (λ_{0} (t_{ij}) | z_{j})$ and $f (z_{j}; θ)$ in the Bayes formula, equation (26), the posterior distribution of $z_{j}^{*}$ for the $j^{th}$ component is derived as:

(39)

h_{j} (z_{j}^{*} | θ) = \frac{z_{j}^{n_{j} - \frac{3}{2}} \exp \left( - z_{j} \frac{(2 θ Λ_{0} (t) + 1)}{2 θ} - \frac{1}{2 θ z_{j}} \right)}{2 (2 θ Λ_{0} (t) + 1)^{- \frac{n_{j} - 1 / 2}{2}} k_{(n_{j} - 1 / 2)} \left[ \frac{\sqrt{1 + 2 θ Λ_{0} (τ)}}{θ} \right]} .

The value of the frailty term for component $j$ is then derived as a point estimate by finding the mean of the updated frailty distribution $h_{j} (z_{j}^{*} | θ)$ . The mean of the updated frailty distribution $\bar{z_{j}^{*}} = E [z_{j}^{*}]$ is derived as:

(40)

\begin{matrix} E [z_{j}^{*}] = \int_{0}^{\infty} z_{j}^{*} h_{j} (z_{j}^{*} | θ) {dz}_{j}^{*} . \end{matrix}

Then

(41)

E [z_{j}^{*}] = \frac{\int_{0}^{\infty} z_{j}^{*} \, {z_{j}^{*}}^{n_{j} - \frac{3}{2}} \exp \left( - z_{j}^{*} \frac{(2 θ Λ_{0} (t) + 1)}{2 θ} - \frac{1}{2 θ z_{j}^{*}} \right) d z_{j}^{*}}{2 (2 θ Λ_{0} (t) + 1)^{- \frac{n_{j} - 1 / 2}{2}} k_{(n_{j} - 1 / 2)} \left[ \frac{\sqrt{1 + 2 θ Λ_{0} (τ)}}{θ} \right]},

and

(42)

E [z_{j}^{*}] = \frac{(2 θ Λ_{0} (t) + 1)^{- \frac{n_{j} + 1 / 2}{2}} k_{(n_{j} + 1 / 2)} \left[ \frac{\sqrt{1 + 2 θ Λ_{0} (τ)}}{θ} \right]}{(2 θ Λ_{0} (t) + 1)^{- \frac{n_{j} - 1 / 2}{2}} k_{(n_{j} - 1 / 2)} \left[ \frac{\sqrt{1 + 2 θ Λ_{0} (τ)}}{θ} \right]},

where $n_{j}$ is the number of events that component $j$ has experienced, $θ$ is the estimated variance parameter from the IG frailty likelihood estimator, and $Λ_{0} (t)$ is the cumulative intensity of the component until time $t$ . $k_{(n_{j} - 1 / 2)} [\cdot]$ is a modified Bessel function of the second kind.
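Equation (42) simplifies to $(1 + 2 θ Λ_{0})^{- 1 / 2}$ times the ratio of two Bessel functions whose orders differ by one, and this ratio can be obtained from a standard recurrence. The Python sketch below illustrates the calculation; the function names are hypothetical, and $τ = 1000$ with $Λ_{0} (τ) = λ τ^{ρ}$ is an assumption made for the example.

```python
import math


def bessel_k_ratio(n, x):
    """K_{n+1/2}(x) / K_{n-1/2}(x) for integer n >= 0.

    Uses the recurrence K_{v+1}(x) = K_{v-1}(x) + (2v / x) K_v(x),
    started from K_{1/2}(x) / K_{-1/2}(x) = 1.
    """
    ratio = 1.0
    for k in range(1, n + 1):
        nu = k - 0.5                     # step from K_nu to K_{nu+1}
        ratio = 1.0 / ratio + 2.0 * nu / x
    return ratio


def ig_posterior_mean(n_j, big_lambda, theta):
    """Equation (42): posterior mean frailty under the IG model."""
    root = math.sqrt(1.0 + 2.0 * theta * big_lambda)
    return bessel_k_ratio(n_j, root / theta) / root


# IG estimates from Table 2, with tau = 1000 assumed as the end of observation.
big_lambda_ig = 0.005861 * 1000.0 ** 1.118666
z_bar_7907 = ig_posterior_mean(6, big_lambda_ig, 0.980574)  # Plane 7907 has 6 failures
```

Under these assumptions the sketch gives a mean frailty of about 0.485 for Plane 7907, in line with the IG entry in Table 3.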

Appendix C

The expressions for the posterior distribution, expected frailty value, expected number of failures and marginal reliability function for the gamma frailty model are derived as follows.

The marginal reliability function of the $j^{th}$ component is given as:

(43)

R_{j} (t) = \int_{0}^{\infty} R_{j} (t | z_{j}) f (z_{j}; θ) d z_{j},

where $f (z_{j}; θ) = \frac{z_{j}^{\frac{1}{θ} - 1} \exp (- \frac{z_{j}}{θ})}{θ^{\frac{1}{θ}} Γ (\frac{1}{θ})}$ is a gamma distribution and $R_{j} (t | z_{j}) = \exp (- z_{j} Λ_{0} (t)) .$

Thus,

(44)

R_{j} (t) = \int_{0}^{\infty} \exp (- z_{j} Λ_{0} (t)) \frac{z_{j}^{\frac{1}{θ} - 1} \exp (- \frac{z_{j}}{θ})}{θ^{\frac{1}{θ}} Γ (\frac{1}{θ})} d z_{j},

and

(45)

R_{j} (t) = (θ Λ_{0} (t) + 1)^{- \frac{1}{θ}},

where $R_{j} (t)$ is derived by replacing $s$ with $Λ_{0} (t)$ in the Laplace transform for gamma distribution $L_{j} [s] = (θ s + 1)^{- \frac{1}{θ}}$ .

The expected number of failures of the $j^{th}$ component between time $t_{1}$ and $t_{2}$ can be derived from the marginal reliability as:

(46)

\begin{matrix} E [N_{j} (t_{1}, t_{2})] = Λ_{j} (t_{1}, t_{2}) = - \log (R_{j} (t_{1}, t_{2})) \\ = \frac{1}{θ} \log (θ Λ_{0} (t_{1}, t_{2}) + 1) . \end{matrix}

The expected number of component failures in the system between time $t_{1}$ and $t_{2}$ is given as:

(47)

E [N_{S} (t_{1}, t_{2})] = \sum_{j = 1}^{m} E [N_{j} (t_{1}, t_{2})] = \frac{m}{θ} \log (θ Λ_{0} (t_{1}, t_{2}) + 1),

where m is the number of components in the system.

Next we derive the posterior distribution for $z_{j}^{*}$ given a gamma prior distribution. For gamma frailty model, the expression of $L_{j (marg)}$ is given by Asfaw and Lindqvist²:

(48)

L_{j (marg)} = [\frac{Π_{i = 1}^{n_{j}} λ_{0} (t_{ij}) Γ (n_{j} + \frac{1}{θ})}{θ^{\frac{1}{θ}} Γ (\frac{1}{θ}) {(Λ_{0} (t) + \frac{1}{θ})}^{(n_{j} + \frac{1}{θ})}}]

Replacing the expressions of $L_{j (marg)}$ , $L_{j} (λ_{0} (t_{ij}) | z_{j})$ , and $f (z_{j}; θ)$ for the gamma frailty model in the Bayes formula equation (26), then the posterior distribution of $z_{j}^{*}$ for the $j^{th}$ component will be derived as:

(49)

h_{j} (z_{j}^{*} | θ) = \frac{z_{j}^{n_{j} + \frac{1}{θ} - 1} \exp \left( - z_{j} (Λ_{0} (t) + \frac{1}{θ}) \right) (Λ_{0} (t) + \frac{1}{θ})^{n_{j} + \frac{1}{θ}}}{Γ (n_{j} + \frac{1}{θ})} .

The expression for the mean of the updated frailty distribution $\bar{z_{j}^{*}} = E [z_{j}^{*}]$ is given as:

(50)

E [z_{j}^{*}] = \frac{\int_{0}^{\infty} z_{j}^{*} \, {z_{j}^{*}}^{n_{j} + \frac{1}{θ} - 1} \exp \left( - z_{j}^{*} (Λ_{0} (t) + \frac{1}{θ}) \right) (Λ_{0} (t) + \frac{1}{θ})^{n_{j} + \frac{1}{θ}} d z_{j}^{*}}{Γ (n_{j} + \frac{1}{θ})} .

However, $h_{j} (z_{j}^{*} | θ)$ is a gamma density function, thus the associated posterior mean is:

(51)

E [z_{j}^{*}] = \frac{n_{j} + \frac{1}{θ}}{Λ_{0} (t) + \frac{1}{θ}} .
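Equation (51) is a one-line calculation, sketched below in Python. The function name is hypothetical, and $τ = 1000$ with $Λ_{0} (τ) = λ τ^{ρ}$ is an assumption made for the example.

```python
def gamma_posterior_mean(n_j, big_lambda, theta):
    """Equation (51): mean of the gamma posterior with shape n_j + 1/theta
    and rate Lambda_0 + 1/theta."""
    return (n_j + 1.0 / theta) / (big_lambda + 1.0 / theta)


# Gamma estimates from Table 2, with tau = 1000 assumed as the end of observation.
big_lambda_gamma = 0.003353 * 1000.0 ** 1.142425
z_bar_few = gamma_posterior_mean(6, big_lambda_gamma, 0.133469)    # 6 failures
z_bar_many = gamma_posterior_mean(16, big_lambda_gamma, 0.133469)  # 16 failures
```

Under these assumptions the two values are close to the gamma entries 0.8197 and 1.4272 in Table 3 for Airplanes with 6 and 16 failures. The estimate exceeds 1 when a component has more failures than $Λ_{0}$ and falls below 1 when it has fewer.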

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Matthew Revie

References

1. Ascher, Feingold. Repairable systems reliability: modeling, inference, misconceptions and their causes. New York: M. Dekker, 1984.

2. Asfaw, Lindqvist. Unobserved heterogeneity in the power law nonhomogeneous Poisson process. Reliab Eng Syst Saf 2015; 134: 59–65.

3. Slimacek, Lindqvist. Nonhomogeneous Poisson process with nonparametric frailty. Reliab Eng Syst Saf 2016; 149: 14–23.

4. Chahkandi, Ahmadi, Baratpour. Some results for repairable systems with minimal repairs. Appl Stoch Models Bus Ind 2014; 30(2): 218–226.

5. Dias, Oliveira, Colosimo, Gilardoni. Power law selection model for repairable systems. Commun Stat Theory Methods 2013; 42(4): 570–578.

6. Arasan, Ehsani. Modeling repairable system failures with interval failure data and time dependent covariate. J Mod Appl Stat Methods 2011; 10(2): 607–617.

7. Wang. Log-linear process modeling for repairable systems with time trends and its applications in reliability assessment of numerically controlled machine tools. Proc IMechE, Part O: J Risk and Reliability 2013; 227(1): 55–65.

8. Karaömer, Chouseinoglou. Comparison of non-homogeneous Poisson process software reliability models in web applications. AJIT-e: Bilişim Teknolojileri Online Dergisi 2016; 7(24): 7–28.

9. Guarnaccia, Quartieri, Barrios, et al. Modeling environmental noise exceedances using non-homogeneous Poisson processes. J Acoust Soc Am 2014; 136(4): 1631–1639.

10. Lindqvist. Statistical modeling and analysis of repairable systems. In: Ionescu, Limnios (eds) Statistical and probabilistic models in reliability. New York: Springer, 1999, pp.3–25.

11. Deep, Veeramani, Zhou. Event prediction for individual unit based on recurrent event data collected in teleservice systems. IEEE Trans Reliab 2020; 69(1): 216–227.

12. Hussain, Naikan. Point process based maintenance modeling for repairable systems: a review. In Proceedings of the 2010 international conference on industrial engineering and operations management, Dhaka, Bangladesh, 2010.

13. Slimacek, Lindqvist. Nonhomogeneous Poisson process with nonparametric frailty and covariates. Reliab Eng Syst Saf 2017; 167: 75–83.

14. Darmofal. Bayesian spatial survival models for political event processes. Am J Pol Sci 2009; 53(1): 241–257.

15. Ghadimi, Mahmoodi, Mohammad, et al. Factors affecting survival of patients with oesophageal cancer: a study using inverse Gaussian frailty models. Singapore Med J 2012; 53(5): 336–343.

16. Vaupel, Manton, Stallard. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography 1979; 16(3): 439–454.

17. Kheiri, Kimber, Reza Meshkani. Bayesian analysis of an inverse Gaussian correlated frailty model. Comput Stat Data Anal 2007; 51(11): 5317–5326.

18. Hanagal. Modeling survival data using frailty models. New York: Springer, 2011.

19. Zaki, Barabadi, Barabady, et al. Observed and unobserved heterogeneity in failure data analysis. Proc IMechE, Part O: J Risk and Reliability 2022; 236: 194–207.

20.

Zaki

Barabadi

Qarahasanlou

, et al. A mixture frailty model for maintainability analysis of mechanical components: a case study. Int J Syst Assur Eng Manag 2019; 10(6): 1646–1653.

21.

Ghomghale

Ataei

Khalokakaie

, et al. The application of frailty model in remaining useful life estimation (case study: Sungun copper mine’s loading system). JModel Eng 2020; 18(62): 129–142.

22.

Barabadi

Ataei

Khalokakaie

, et al. Spare-part management in a heterogeneous environment. PLoS One 2021; 16(3): e0247650.

23.

Lin

AspLund

. Comparison study of heavy haul locomotive wheels running surfaces wearing. Eksploatacja i Niezawodność 2014; 16(2): 276–287.

24. Lin, Asplund. Bayesian semi-parametric analysis for locomotive wheel degradation using gamma frailties. Proc IMechE, Part F: J Rail and Rapid Transit 2015; 229(3): 237–247.

25. Lin, Pulido, Asplund. Reliability analysis for preventive maintenance based on classical and Bayesian semi-parametric degradation approaches using locomotive wheel-sets as a case study. Reliab Eng Syst Saf 2015; 134(3): 143–156.

26. Bain, Wright. The negative binomial process with applications to reliability. J Qual Technol 1982; 14(2): 60–66.

27. Engelhardt. Statistical analysis of a compound power-law model for repairable systems. IEEE Trans Reliab 1987; R-36(4): 392–396.

28. Lindqvist, Elvebakk, Heggland. The trend-renewal process for statistical analysis of repairable systems. Technometrics 2003; 45(1): 31–44.

29. D’Andrea, Feitosa, Tomazella, et al. Frailty modeling for repairable systems with Minimum Repair: An application to dump truck data of a Brazilian Mining Company. J Math Stat Sci 2017; 2017(6): 179–198.

30. Slimacek, Lindqvist. Reliability of wind turbines modeled by a Poisson process with covariates, unobserved heterogeneity and seasonality. Wind Energy 2016; 19(11): 1991–2002.

31. Hougaard. Life table methods for heterogeneous populations: distributions describing the heterogeneity. Biometrika 1984; 71(1): 75–83.

32. Chhikara, Folks. The inverse Gaussian distribution: theory, methodology, and applications. New York: M. Dekker, 1989.

33. Piancastelli LSC, Barreto-Souza, Mayrink. Generalized inverse-Gaussian frailty models with application to target neuroblastoma data. Ann Inst Stat Math 2021; 73(5): 979–1010.

34. Hanagal, Sharma. Analysis of bivariate survival data using shared inverse Gaussian frailty model. Commun Stat Theory Methods 2015; 44(7): 1351–1380.

35. Onchere. Frailty models applications in pension schemes. PhD Thesis, University of Nairobi, 2013.

36. Aalen, Borgan, Gjessing. Survival and event history analysis: a process point of view. New York: Springer Science & Business Media, 2008.

37. Lawless. Regression methods for Poisson process data. J Am Stat Assoc 1987; 82(399): 808–815.

38. Hsu, Gorfine, Malone. On robustness of marginal regression coefficient estimates and hazard functions in multivariate survival analysis of family data when the frailty distribution is mis-specified. Stat Med 2007; 26(25): 4657–4678.

39. Jahani, Zhou, Veeramani, et al. Multioutput Gaussian process modulated Poisson processes for event prediction. IEEE Trans Reliab 2021; 70(4): 1569–1580.

40. Nafisah, Shrahili, Alotaibi, et al. Virtual series-system models of imperfect repair. Reliab Eng Syst Saf 2019; 188: 604–613.

41. Rausand, Hoyland. System reliability theory: models, statistical methods, and applications. Vol. 396. Hoboken, NJ: John Wiley & Sons, 2003.

42. Percy, Alkali. Generalized proportional intensities models for repairable systems. IMA J Manag Math 2006; 17(2): 171–185.

43. Tessema, Ayalew, Mohammed. Modeling the determinants of time-to-age at first marriage in Ethiopian women: a comparison of various parametric shared frailty models. Sci J Public Health 2015; 3(5): 707–718.

44. Ata Tutkun, Marthin. A comparative study with bootstrap resampling technique to uncover behavior of unconditional hazards and survival functions for gamma and inverse Gaussian frailty models. Math Sci 2021; 15(1): 99–109.

45. Zhang, Revie. Model selection with application to gamma process and inverse Gaussian process. London: CRC/Taylor & Francis Group, 2016.

46. Brewer, Butler, Cooksley. The relative performance of AIC, AICc and BIC in the presence of unobserved heterogeneity. Methods Ecol Evol 2016; 7(6): 679–692.

47. Izumi, Ohtaki. Aspects of the Armitage–Doll gamma frailty model for cancer incidence data. Environmetrics 2004; 15(3): 209–218.

48. Bishop, Nasrabadi. Pattern recognition and machine learning. Vol. 4. New York: Springer, 2006.

49. Banbeta, Seyoum, Belachew, et al. Modeling time-to-cure from severe acute malnutrition: application of various parametric frailty models. Arch Public Health 2015; 73(1): 6–8.

50. Adham, AlAhmadi. Gamma and inverse Gaussian frailty models: a comparative study. J Math Stat Invent 2016; 4(4): 4101–4105.

51. Loungani. How accurate are private sector forecasts? Cross-country evidence from consensus forecasts of output growth. Int J Forecast 2001; 17(3): 419–432.

52. Carlin, Louis. Empirical Bayes: past, present and future. J Am Stat Assoc 2000; 95(452): 1286–1289.

53. Gebraeel, Elwany, Pan. Residual life predictions in the absence of prior degradation knowledge. IEEE Trans Reliab 2009; 58(1): 106–117.

54. Shoukri, Asyali, VanDorp, et al. The Poisson inverse Gaussian regression model in the analysis of clustered counts data. J Data Sci 2004; 2(1): 17–32.

55. Munda, Rotolo, Legrand. parfm: parametric frailty models in R. J Stat Softw 2012; 51: 1–20.

Reliability evaluation of repairable systems considering component heterogeneity using frailty model
