Sage Journals: Discover world-class research

Abstract

The successful implementation of Structural Health Monitoring (SHM) systems is limited by the ability to evaluate their performance, reliability, and durability. Although there are many SHM techniques capable of detecting, locating and quantifying damage in several types of structures, their certification process is still limited. Despite the effort of academia and industry in defining methodologies for the performance assessment of such systems in recent years, many challenges remain to be solved. Methodologies used in Non-Destructive Evaluation (NDE) have been taken as a starting point to develop the required metrics for SHM, such as Probability of Detection (POD) curves. However, the transposition of such methodologies to SHM is anything but straightforward because additional factors should be considered. The time dependency of the data, the larger number of variability sources and the complexity of the structures to be monitored exacerbate the existing challenges, suggesting that much work remains to be done in SHM. The article focuses on the current challenges and barriers preventing the development of proper reliability metrics for SHM, analyzing the main differences with respect to POD methodologies for NDE. It was found that the development of POD curves for SHM systems requires a higher level of statistical expertise and that their use in the literature is still limited to a few studies. Finally, the discussion extends beyond POD curves towards new metrics such as Probability of Localization (POL) and Probability of Sizing (POS) curves, reflecting the diagnosis paradigm of SHM.

Keywords

structural health monitoring, reliability, probability of detection, probability of localization, probability of sizing, MAPOD, metamodels, sequential data analysis

Introduction

According to Derriso et al., the concept of inspection differs fundamentally from the concept of monitoring in three main aspects: the evaluation frequency, the use of previous system outcomes, and the range of decisions that the evaluation results make possible.¹ Therefore, while inspections are conceived to provide a go/no-go evaluation of the health of a structural component, monitoring makes it possible to take multiple actions thanks to the larger amount of available information.

Farrar and Worden defined SHM as the process of implementing a damage identification strategy for aerospace, civil and mechanical engineering infrastructure.² Other definitions are available in the literature,^3–6 all of which share the common goal of switching from the current scheduled maintenance philosophy to a condition-based maintenance approach.

Condition-based maintenance empowered by SHM can reduce maintenance costs, inspection time⁷ and downtime.⁴ The reduced labor requirement of SHM can lead to an increase in safety^4,8 compared to manual inspections, not only for the personnel, but also for the structure itself, which may be accidentally damaged during inspections.⁹ For difficult-to-reach areas, SHM makes it possible to overcome accessibility limitations by means of permanently installed sensors.¹⁰ The military industry sees SHM as an opportunity to increase combat asset readiness.¹¹ Other benefits of SHM include the early detection of damage during normal operational conditions and a drastic reduction of the human factor.⁷

In recent years, the usage of composites has been increasing. However, the complexity of such materials and the multitude of possible damage mechanisms still force engineers to use a conservative design approach.⁶ The availability of online monitoring data provided by SHM may enhance the understanding of these materials and thus leave room for more innovative and closer-to-the-limit design. The reduction of structural design margins can lead to lighter structures. If the structural weight reduction exceeds the additional weight of the monitoring system (sensors, cables, and electronics), lower fuel consumption, and thus lower CO₂ emissions, as well as a wider design range are expected.⁴

A Sandia National Labs report written by Roach in 2011 stated that the Technology Readiness Level (TRL) of SHM systems did not go beyond TRL 8 and that the majority were concentrated at TRL 4.¹² In 2013 Seaver et al.¹³ presented a classification of different sensing technologies based on their TRL. At that time, the TRL ranged from 3 to 9 depending on the specific application. In recent years, several technologies based on permanently installed ultrasonic sensors, such as guided wave monitoring and point thickness measurements, have become commercially successful.¹⁴ However, there are still barriers preventing a complete transition toward SHM. In 2018 Cawley addressed the main reasons for this unsatisfactory rate of transition from NDE towards practical applications of SHM.¹⁴ The lack of specific techniques for performance validation, regarding both damage detection and its corresponding false call rate, was identified as a critical point preventing the widespread adoption of SHM. The need for performance validation was also outlined in a recent publication by the same author.¹⁵ The MIL-HDBK-1823A¹⁶ allows the assessment of NDE methods by exploiting the concept of Probability of Detection (POD) curves. However, there is a lack of specific guidelines and procedures to evaluate the system monitoring capabilities in the field of SHM.⁹

Awareness of POD curves in the SHM community is still limited. Figure 1 shows the number of publications with the keywords “SHM” and “SHM+POD” since 1995. Despite increasing attention to SHM, only a few studies were related to POD curves.

Figure 1.

Trend of the keywords “SHM” and “SHM+POD” analyzing the publications selected from Web of Science Core Collection (May 2021).

The establishment of common certification criteria is fundamental to the application of SHM technologies,^17–21 and has the potential to improve the design of the system itself.²² According to Aldrin et al., the qualification of SHM technologies should be based on existing guidelines,²³ such as cost–benefit analysis (CBA),²⁴ materials and structure certification, NDE metrics (i.e., POD curves),¹⁶ and procedures for performing a Failure Mode, Effects, and Criticality Analysis (FMECA).²⁵ In 2011 Aldrin et al. formulated a protocol,²⁶ mainly based on the existing MIL-HDBK-1823A.¹⁶ One year later, this protocol was applied to a real case study, with promising results.¹⁸ Kessler examined three validation standards,^27–29 already utilized in the aeronautical sector, in an attempt to identify potential relationships with SHM applications.³⁰

The scientific question arising from these preliminary considerations is when standard POD curves can be applied to SHM systems. SHM systems can be classified into four categories.³¹ First, it is possible to distinguish scheduled SHM (S-SHM) systems from automatic SHM (A-SHM) systems.³ Here the classification is based on the way sensor data are collected: at scheduled time intervals in the former and continuously in the latter. Second, the damage location can be known (KDL) or unknown (UDL), providing another criterion to further classify an SHM system. According to Janapati et al., only KDL S-SHM systems can be evaluated using the standard tools of NDE methods such as POD curves.³¹ However, the use of A-SHM is increasing in the SHM community, and it is important to be able to derive POD curves for such cases as well.

Another fundamental aspect is the need for additional metrics to evaluate the reliability of the system in terms of damage localization and characterization as well.^18,19,23,26 The Model-Assisted Probabilistic Reliability Assessment (MAPRA) methodology follows this line of reasoning.¹⁹ Kabban and Derriso state that, with a view to developing a statistical framework for the certification of SHM systems, the system accuracy and reliability should be assessed with respect to three main points: (i) the capability to determine the presence of the damage (the detection problem already common in NDE), (ii) the ability to assess the extent of the damage, and (iii) the ability to assess its location.²² These additional metrics would find their natural place within the paradigm of the SHM phases (detection, localization, assessment, prognosis), initially proposed by Rytter in 1993,³² and subsequently adopted as a reference in the field.^33,11,31

It is interesting to conclude this introduction to SHM reliability evaluation with a philosophical question. Could it be necessary to rethink the current regulations and develop new reliability metrics better suited for the advancement of SHM technology? Derriso et al. conceptualized the Cognitive Architecture for State Exploitation (CASE), which resembles human cognitive behavior and aims to exploit the full potential of an SHM technology, making use of its higher levels (i.e., health management of the full system).^1,34 Although the CASE approach was demonstrated to be more effective in terms of downtime costs than the Aircraft Structural Integrity Program (ASIP) philosophy,³⁵ its full potential cannot be exploited because many of its functionalities would have to be removed to fulfill the guidelines given in the MIL-HDBK-1823A.

The purpose of this paper is to provide a systematic review of the existing reliability methods in SHM, highlighting the current challenges and areas where further investigation is required. Most of the attempts to quantify the reliability of SHM systems stem from regulations already present in the NDE field. Therefore, it is important to understand the existing guidelines and the basis to transfer the same concept toward SHM.

This paper is organized as follows. Reliability assessment in non-destructive evaluation section reviews the statistics behind the POD development in NDE. Variability sources in structural health monitoring section examines the variability sources in SHM and their spatial and temporal implications. Probability of detections for structural health monitoring section reviews different statistical models to produce POD curves in SHM. Multivariate-probability of detection section introduces the concept of Multivariate POD using model assisted methods and metamodels. Localization and sizing metrics section discusses a series of localization and sizing metrics used in SHM. Discussion and perspectives section summarizes the main findings of the literature review, examining current challenges and areas where further investigation is required. Finally, in Table A1 (see the Appendix at the end of this article) the reader can find the most relevant case studies analyzed in this article.

Reliability assessment in non-destructive evaluation

The detection problem

Table 1 shows the four possible system outcomes for a detection problem^22,36.

Table 1.

Possible system outcomes combination in a detection problem.

	Presence of damage	Absence of damage
Detection	True positive (TP); probability of detection (POD)	False positive (FP); probability of false alarm (PFA)
No detection	False negative (FN); probability of false negative (PFN)	True negative (TN); probability of true negative (PTN)

The POD is also often referred to as Positive Predicted Probability (PPP),³⁶ whereas the PFA is sometimes simply called Probability of False Positive (PFP).²² In the same manner, the PTN can be named Negative Predicted Probability (NPP).³⁶ Summing the probability values of each column in Table 1 always returns one, as a direct consequence of set theory.^22,36,37 Exploiting Bayesian conditional probability, it is possible to introduce the concepts of sensitivity and specificity. Calling P(AD) the probability that the structure is healthy (absence of damage), P(PD) the probability that the structure is not healthy (presence of damage), P(Det) the probability that the system reports a detection, and P(NoDet) the probability that the system does not report a detection, one has

POD = P (Det|PD) = \text{sensitivity}

(1)

PFA = P (Det|AD) = 1 - P (NoDet|AD) = 1 - \text{specificity}

(2)

The POD and PFA are useful in the design phase and to assess the reliability of the measuring system. On the other hand, under operational conditions it can be useful to refer to two other probabilities: the Positive Predictive Value (PPV) and the Negative Predictive Value (NPV).²² The engineer can use the PPV to determine the conditional probability that damage is present given that it was detected, and the NPV to determine the conditional probability that damage is absent given that nothing was detected, which is crucial for making the right maintenance decision

PPV = P (PD|Det) = \frac{P (Det|PD) \cdot P (PD)}{P (Det)} = \frac{POD \cdot P (PD)}{POD \cdot P (PD) + PFA \cdot P (AD)}

(3)

NPV = P (AD|NoDet) = \frac{P (NoDet|AD) \cdot P (AD)}{P (NoDet)} = \frac{(1 - PFA) \cdot P (AD)}{(1 - PFA) \cdot P (AD) + (1 - POD) \cdot P (PD)}

(4)

The comparison of equations (1) and (2) with equations (3) and (4) shows that while the POD and the PFA depend only on the inspection methodology, the PPV and the NPV also depend on the prevalence. The prevalence is the likelihood of structural damage being present, that is, P(PD). In a low-prevalence scenario, the PPV can be relatively low even if the POD is high.^22,38
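The effect of prevalence can be made concrete with a short numerical sketch of equations (3) and (4). The function name `predictive_values` and the values POD = 0.90 and PFA = 0.05 are hypothetical and chosen only for illustration; they are not taken from the cited studies.

```python
def predictive_values(pod, pfa, prevalence):
    """Return (PPV, NPV) following equations (3) and (4)."""
    p_pd = prevalence          # P(PD): prior probability that damage is present
    p_ad = 1.0 - prevalence    # P(AD): prior probability that damage is absent
    ppv = pod * p_pd / (pod * p_pd + pfa * p_ad)
    npv = (1.0 - pfa) * p_ad / ((1.0 - pfa) * p_ad + (1.0 - pod) * p_pd)
    return ppv, npv


# Hypothetical system with POD = 0.90 and PFA = 0.05, evaluated
# for a high-prevalence and a low-prevalence scenario.
for prevalence in (0.5, 0.01):
    ppv, npv = predictive_values(0.90, 0.05, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With these assumed numbers, the PPV is about 0.95 when half of the inspected structures are damaged, but falls to about 0.15 when only 1% are damaged, although the POD is unchanged.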

General considerations on probability of detection curves

In 1989, the studies of Berens³⁹ were included in the American Society for Metals (ASM) Handbook. Nowadays, the methodology to derive a POD curve is thoroughly described in Appendix G of the MIL-HDBK-1823A.¹⁶ The POD is a powerful tool that allows researchers to compare the performance of different monitoring techniques,⁴⁰ estimating the sensitivity and reliability of the inspection process.⁴¹ Other definitions are available in the literature.^42–44

The United States Air Force (USAF), within the ASIP, employs POD curves to assess the reliability of various NDE methods.⁴⁵ In the aerospace industry, the POD curve can be exploited to perform risk analyses, to schedule inspections, to estimate the remaining useful life of a certain component, and to develop accept/reject criteria.⁴⁶ POD curves are also becoming attractive in fields where they were traditionally less popular, such as the nuclear industry.⁴⁷

In a POD study it is crucial to have enough data available.⁴⁸ A good practice is to have at least 40 data points.^16,49 Annis et al. investigated the optimum sample size in a POD study and found that beyond 60 samples the improvement in the confidence bounds was less significant.⁵⁰ Further details can be found in the paper of Gandossi and Annis.⁵¹ The minimum number of samples also depends on the POD model type. When logistic models are used, at least 60 samples are needed to avoid instabilities.⁵² Koh and Meeker⁵³ present a statistical procedure to plan a POD study by introducing a dimensionless standardized flaw-size variable. However, since it is often impractical to produce a statistically significant number of specimens, much effort has been devoted to developing strategies capable of reducing the required data.

â versus a method

The “â vs a” analysis is the most widely accepted method for POD assessment in the NDE field. The symbol “â” identifies the measurement output, whereas the “a” parameter denotes the damage size (e.g., crack length) responsible for generating that measurement signal.¹⁶

From regression to probability of detection curve

Engineers are usually familiar with Ordinary Least-Squares (OLS) linear regression. However, when dealing with censored data, OLS would provide non-conservative results. In these situations other techniques, such as the Maximum-Likelihood Estimation (MLE) method, must be considered. In the absence of censored data, the MLE method coincides exactly with the OLS regression analysis.⁵⁴ In both cases, the consistency of the models holds if six conditions are verified:^52,55

1. The model must reflect the data.

2. It is required to have a continuous and observable response.

3. The linearity of the parameters must be satisfied.

4. The variance must be homoscedastic (uniform variance) about the regression line.

5. The observations must be uncorrelated (with respect to time and/or space).

6. The errors must follow a normal distribution.

In the regression analysis, it is crucial to identify the right â vs a plot. Four possible combinations can be used: (i) â vs a, (ii) â vs log(a), (iii) log(â) vs a, and (iv) log(â) vs log(a).⁵⁶

A good practice would be to plot all the four possible graphs and choose the one with the best fit.⁵² Considering the â vs a case (the same procedure can be extended to the remaining three cases), the two variables are related to each other with the following regression equation

â = β_{0} + β_{1} a + ε

(5)

where β₀ and β₁ represent the regression coefficients of the model and $ε \sim N (0, τ)$ is the corresponding error term, which follows a normal distribution with zero mean and standard deviation equal to $τ$.^16,40,49,57 Fitting the model represented in equation (5), one gets

E (â) = \hat{y} = {\hat{β}}_{0} + {\hat{β}}_{1} a

(6)

where E is the expectation operator and ${\hat{β}}_{0}, {\hat{β}}_{1}$ are the parameter estimates with their uncertainty (the true parameters $β_{0}, β_{1}$ of equation (5) are unknown). Once the regression model parameters are computed, the POD curve, whose value corresponds to the shaded area in Figure 2, is derived as follows

POD (a) = \Pr (\hat{a} > \hat{a}_{th}) = 1 - \Pr (\hat{a} < \hat{a}_{th}) = 1 - Φ_{Norm} (z)

(7)

Here, $\hat{a}_{th}$ is an arbitrary threshold value selected by the engineer and z is simply given by

z = \frac{\hat{a}_{th} - \hat{y}}{\hat{τ}} = \frac{\hat{a}_{th} - ({\hat{β}}_{0} + {\hat{β}}_{1} a)}{\hat{τ}}

(8)

Another way to express the POD is given by equation (9)

POD (a) = Φ_{Norm} \left( \frac{a - \hat{μ}}{\hat{σ}} \right)

(9)

where $\hat{μ}$ and $\hat{σ}$ are referred to as the location and shape parameters, respectively:

\hat{μ} = \frac{\hat{a}_{th} - {\hat{β}}_{0}}{{\hat{β}}_{1}}; \quad \hat{σ} = \frac{\hat{τ}}{{\hat{β}}_{1}}

(10)

It is possible to identify two points of interest: $a_{90}$ and $a_{90/95}$. They are the estimated crack lengths at which the POD and its corresponding 95% lower confidence bound, respectively, equal 90% (see Figure 3).
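The chain from equation (5) to equation (10) can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the MIL-HDBK-1823A implementation: the data are synthetic and uncensored (so OLS applies), the function names `fit_ahat_vs_a`, `pod` and `a90` are hypothetical, and the threshold value is arbitrary. The point $a_{90/95}$ is not computed because it additionally requires the confidence bounds of the fitted parameters.

```python
import random
from statistics import NormalDist, mean


def fit_ahat_vs_a(a, ahat):
    """OLS fit of equation (5); returns the estimates (b0, b1, tau)."""
    n = len(a)
    a_bar, y_bar = mean(a), mean(ahat)
    sxx = sum((x - a_bar) ** 2 for x in a)
    sxy = sum((x - a_bar) * (y - y_bar) for x, y in zip(a, ahat))
    b1 = sxy / sxx
    b0 = y_bar - b1 * a_bar
    rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(a, ahat))
    tau = (rss / (n - 2)) ** 0.5  # residual standard deviation
    return b0, b1, tau


def pod(a, b0, b1, tau, a_th):
    """Equations (9) and (10): POD(a) = Phi((a - mu) / sigma)."""
    mu = (a_th - b0) / b1
    sigma = tau / b1
    return NormalDist().cdf((a - mu) / sigma)


def a90(b0, b1, tau, a_th):
    """Flaw size at which the fitted POD curve reaches 90%."""
    mu = (a_th - b0) / b1
    sigma = tau / b1
    return mu + sigma * NormalDist().inv_cdf(0.90)


# Synthetic data (assumed): 40 flaw sizes, linear response plus Gaussian noise.
rng = random.Random(0)
sizes = [0.5 + 0.1 * i for i in range(40)]
signals = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in sizes]

b0, b1, tau = fit_ahat_vs_a(sizes, signals)
a_th = 5.0  # arbitrary decision threshold on the signal
print(f"b0={b0:.3f}  b1={b1:.3f}  tau={tau:.3f}  a90={a90(b0, b1, tau, a_th):.3f}")
```

For a positive slope, evaluating `pod` at the value returned by `a90` gives 0.90 by construction, and the same POD is obtained from equations (7) and (8).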

Choosing the right threshold

Every POD depends on the detection threshold. Arbitrarily lowering $\hat{a}_{th}$ would improve the POD curve but inevitably at the cost of increasing the PFA.⁴⁶ Therefore, it is possible to draw a comparison between POD curves of different inspection methodologies if and only if their PFA is the same.⁴⁹ The PFA can be computed as

PFA (a) = \Pr (\hat{a}_{noise} > \hat{a}_{th})

(11)

One could use the so-called Receiver Operating Characteristics (ROC) graph^58,59 as a tool to determine the best threshold $\hat{a}_{th}$.^60,61 The ROC curve is typically used to assess the performance of a classifier, and a comprehensive explanation can be found in the papers of Fawcett.^59,62 In the NDE context, the POD depends on both the PFA and the flaw dimensions. Therefore, it is possible to obtain a family of ROC curves by selecting a set of different flaw sizes.⁶³ Nevertheless, there are alternative approaches to select the right decision threshold. As suggested by the MIL-HDBK-1823A,¹⁶ the best trade-off between POD and PFA can be achieved by plotting the critical crack target sizes (such as $a_{90}$, $a_{90/95}$) and the PFA against the decision threshold $\hat{a}_{th}$.^16,52
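The trade-off between POD and PFA can be illustrated by sweeping the threshold for a few flaw sizes, which yields one ROC curve per size. The sketch below assumes that both the noise-only signal and the regression residuals are Gaussian; the function names and all numerical parameters are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pfa(a_th, noise_mean, noise_std):
    """Equation (11) under an assumed Gaussian noise-only signal."""
    return 1.0 - STD_NORMAL.cdf((a_th - noise_mean) / noise_std)


def pod(a, a_th, b0, b1, tau):
    """Equations (7) and (8) for a flaw of size a."""
    return 1.0 - STD_NORMAL.cdf((a_th - (b0 + b1 * a)) / tau)


def roc_curve(a, thresholds, b0, b1, tau, noise_mean, noise_std):
    """List of (PFA, POD) pairs for one flaw size, one pair per threshold."""
    return [(pfa(t, noise_mean, noise_std), pod(a, t, b0, b1, tau))
            for t in thresholds]


# Hypothetical parameters: signal = 1.0 + 2.0 * a with tau = 0.5; noise ~ N(0, 1).
thresholds = [3.0, 2.5, 2.0, 1.5, 1.0]
for size in (0.25, 0.5, 1.0):
    print(f"a = {size}")
    for p_fa, p_d in roc_curve(size, thresholds, 1.0, 2.0, 0.5, 0.0, 1.0):
        print(f"  PFA={p_fa:.4f}  POD={p_d:.4f}")
```

Lowering the threshold moves each curve towards higher POD and higher PFA simultaneously, and larger flaws give curves closer to the ideal top-left corner of the ROC graph.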

Probability of detection curve bounds

Confidence intervals express the statistical uncertainty due to the fact that only a limited amount of data is available.⁴⁶ The computation of the POD lower bound can be divided into two steps. First, the confidence and the prediction intervals (the latter differ from the former because they also consider the variability of the observations about the predicted mean) are computed using the Wald method. Second, the so-called Delta method is applied to transfer these confidence intervals to the POD curve. The Delta method can be regarded as a technique for estimating the moments of functions of random variables,⁶⁴ and is applied to compute confidence bounds of non-linear functions.⁶⁵ Details about the math behind the Wald and Delta methods can be found in the MIL-HDBK-1823A.¹⁶
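As a rough sketch of the second step, the first-order Delta method can be written for the argument of equation (9). The symbol $\hat{\Sigma}$, denoting the estimated covariance matrix of $(\hat{μ}, \hat{σ})$, and the shorthand $\hat{h}(a)$ are notation assumed here for illustration; the handbook should be consulted for the complete procedure, including how this covariance matrix is obtained from the fitted regression parameters.

```latex
% Assumed notation: \hat{\Sigma} is the estimated covariance matrix of (\hat{\mu}, \hat{\sigma}).
\hat{h}(a) = \frac{a - \hat{\mu}}{\hat{\sigma}}, \qquad
\nabla h = \left( -\frac{1}{\hat{\sigma}},\; -\frac{\hat{h}(a)}{\hat{\sigma}} \right)^{T}

% First-order (Delta method) approximation of the variance of \hat{h}(a).
\operatorname{Var}\left[ \hat{h}(a) \right] \approx \nabla h^{T} \, \hat{\Sigma} \, \nabla h
= \frac{1}{\hat{\sigma}^{2}} \left[ \operatorname{Var}(\hat{\mu})
+ 2 \, \hat{h}(a) \operatorname{Cov}(\hat{\mu}, \hat{\sigma})
+ \hat{h}(a)^{2} \operatorname{Var}(\hat{\sigma}) \right]

% One-sided 95% lower Wald bound on the POD (1.645 is the 95% standard normal quantile).
POD_{L}(a) = \Phi_{Norm} \left( \hat{h}(a) - 1.645 \sqrt{ \operatorname{Var}\left[ \hat{h}(a) \right] } \right)
```

The flaw size at which $POD_{L}(a)$ reaches 0.90 is then the $a_{90/95}$ point.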

With the aid of synthetic data, Figures 2 and 3 show the result of the â vs a method, and the POD curve with its corresponding lower bound using the Delta Method.

Figure 2.

$\hat{a}$ vs a linear regression (solid line), its 95% Wald confidence (dashed lines) and prediction intervals (dotted lines). Gaussian noise is represented by red dots. The grey shaded areas of the Gaussian curves represent the POD.

Figure 3.

POD curve and its lower 95% Wald confidence bound using the Delta method.

Binary (Hit/Miss) data

Historically, the first method to derive a POD was based on the ratio between the number of defects detected, n, usually cracks, and the total number of defects inspected in the structure, N.¹⁶ Such an approach implies an intrinsic tradeoff between the crack length and the POD resolution. Therefore, other methodologies were developed to overcome these statistical deficiencies. When dealing with hit/miss data, the system provides only qualitative information specifically related to the presence or absence of damage in the structure.³⁶ The underlying statistical models are based on Generalized Linear Models (GLMs). The idea is to leverage link functions, such as the logit, probit, cloglog, and loglog functions,¹⁶ whose inverses are continuous and bounded in the interval [0,1], and to use the maximum likelihood criterion to compute the model parameters.⁶⁶ The interested reader may refer to the following references to delve into POD for hit/miss data and the specific statistical methods to compute the corresponding lower confidence bounds.^16,67–74
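A minimal sketch of a hit/miss POD fit is given below for the logit link, with the two parameters estimated by maximum likelihood through Newton-Raphson iterations. It uses only the standard library; the function names, the synthetic data and the choice of regressing on a rather than log(a) are assumptions made for illustration, and no confidence bounds are computed.

```python
import math
import random


def pod_logit(a, b0, b1):
    """Logit-link POD model: POD(a) = 1 / (1 + exp(-(b0 + b1 * a)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * a)))


def fit_logit_pod(sizes, hits, iters=25):
    """Maximum-likelihood estimate of (b0, b1) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for a, y in zip(sizes, hits):
            p = pod_logit(a, b0, b1)
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * a
            h00 += w               # 2x2 information matrix
            h01 += w * a
            h11 += w * a * a
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic hit/miss data (assumed): 80 flaws, true model b0 = -3.0, b1 = 1.5.
rng = random.Random(1)
sizes = [0.05 * (i + 1) for i in range(80)]
hits = [1 if rng.random() < pod_logit(a, -3.0, 1.5) else 0 for a in sizes]

b0, b1 = fit_logit_pod(sizes, hits)
a50 = -b0 / b1                      # flaw size with 50% POD
a90 = (math.log(9.0) - b0) / b1     # flaw size with 90% POD, since logit(0.9) = ln 9
print(f"b0={b0:.3f}  b1={b1:.3f}  a50={a50:.3f}  a90={a90:.3f}")
```

The plain Newton iteration is adequate for well-overlapped data such as these; data sets in which hits and misses are perfectly separated by flaw size have no finite maximum-likelihood estimate and would need a different treatment.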

Variability sources in structural health monitoring

In a POD study, it is fundamental to capture all the possible variability sources. Incomplete variability considerations can lead to biased POD curve estimates.⁴⁶ Variability is linked to the intrinsic stochastic nature of the phenomenon under examination.⁴⁹ Therefore, increasing the amount of data would only shrink the confidence bounds but not the variability. Li et al. proposed the interesting idea of taking into account the inherent population variability using the 0.05 POD quantile estimate and then computing its lower confidence bound to also consider the uncertainty related to the model parameters.⁷⁵ Referring to a small quantile of the POD curve can be a more appropriate solution when the engineer is interested in the worst possible scenario.⁷⁶

In NDE, one possible source of variability is the morphology of the cracks themselves.^77,78 A portion of variability could be associated with the sensing device due to the manufacturing process of the instrument. Environmental conditions such as temperature and humidity may change the signal output of a certain measuring technique.⁴⁹ Finally, the human factor contribution in NDE systems is often considered the largest among all the variability sources.^79,80

Structural health monitoring systems inherit all the variability sources of NDE methods. The manufacturing process (sensors, interrogator, specimens, or test structures)⁴⁹ and damage morphology are examples of variability sources affecting both NDE and SHM systems.⁷⁸ The only exception is the variability associated with the human operator. The SHM system is usually capable of acquiring data automatically, even in areas where typical NDE inspections are impractical due to complex geometries and accessibility limitations.¹⁰ Nevertheless, stating that SHM systems are not exposed to human variability sources is not completely true. Indeed, the installation of the sensor network on the structure can be regarded as a human-related variability source.⁷⁹

Moreover, additional factors should be considered when performing a POD study on SHM systems. Sensor degradation is one of the main issues. Since the sensors are permanently installed in the structure, they are subjected to degradation over time due to aging and fatigue. Degradation may affect the sensor itself or its coupling with the structure, such as the adhesive, welds, and dry couplings, depending on the technique. The system performance can also be affected by changes in the structure itself due to maintenance operations. It has been reported that sensor deterioration over a certain period of time has an impact on the POD curves.^54,81 Environmental and Operational Conditions (EOCs) like temperature, moisture, pressure, and chemical loading can greatly affect the SHM system response,^82–84 while the same factors are expected to produce minor effects in an NDE system. Other sources of variability can be associated with the loading condition of the structure, which can change over time (take-off, cruising, maneuvers, landing, etc.). The relative position of the sensor and the damage is another specific aspect to consider in SHM. When considering POD curves of SHM systems, their relationship with the defect/damage location cannot be neglected because sensor location is a significant source of variability.^31,85 One could produce different PODs depending on the flaw location.⁵⁴ Moreover, there could be changes in the recorded signal response due to the on-board SHM device.⁴⁹ Mandache et al. suggest that in the case of a self-powered sensor, where the recorded data are transmitted over a wireless connection to an on-board memory storage device, electromagnetic interference as well as other possible interference with the avionics are potential sources of variability.⁸⁶

Summarizing, while in NDE variability is mainly attributed to the human factor, in SHM it is equally important to consider the spatial (location uncertainty)^31,54,85,87 and temporal (environmental effects)^79,80,88 aspects of POD.

Table 2 highlights the main differences between NDE and SHM in terms of variability sources and thus the uncertainties to be considered in the corresponding mathematical models.

Table 2.

Variability Sources: NDE versus SHM.

Category	Variability source	NDE	SHM
Aging	Sensor degradation	N	Y
Aging	Coupling degradation	N	Y
Damage-related	Damage morphology	YY	YY
Damage-related	Reciprocal sensor/flaw location	N	YY
EOCs	Loading conditions (stress/strain)	N	YY
EOCs	Chemical loading	Y	YY
EOCs	Humidity	Y	YY
EOCs	Pressure	Y	YY
EOCs	Temperature	Y	YY
Human factor	Data interpretation	Y	Y
Human factor	Measurement procedure	YY	N
Human factor	Installation process	N	Y
Manufacturing process	Sensors	Y	Y
Manufacturing process	Interrogator	Y	Y
Manufacturing process	Specimen/test structure	Y	Y
Data communication and storage	Electromagnetic and radio frequency interferences	N	Y

N: Variability source not present; Y: Variability source present; YY: Dominant source of variability.

The consideration of variability sources in SHM regarding POD curves can be divided into the spatial aspects and the temporal aspects.

Spatial aspects of probability of detection

Optimal sensor placement using probability of detection

The spatially related variability of SHM systems has been studied by several authors.^87,89,90 In many cases, the influence of the damage location on the detection performance translates into an Optimal Sensor Placement (OSP) problem.⁸⁷ Indeed, the presence of permanently installed sensors suggests that in SHM, POD curves may serve not only as a tool to quantify the system performance but also as a tool for the design of the SHM system itself. OSP has been studied for several years and a wealth of literature has been produced. A recent review article written by Tan and Zhang summarizes the main advances in OSP.⁹¹ An important reference is the study of Flynn and Todd, who were among the first to use the concepts of POD and PFA to develop a framework for OSP.⁹² Similarly, Azarbayejani et al. demonstrated that the OSP can be found by maximizing the POD.⁹³ Markmiller and Chang used the POD as a design constraint for the OSP of an SHM system aiming to monitor the dynamic response of the structure caused by an impact event.⁹⁴ Mallardo et al. used POD curves to validate the performance of an artificial neural network whose aim was to detect the location of an impact in a composite plate and in a composite stiffened panel. The OSP was chosen such that the POD curves related to different sensor combinations were maximized.^95,96 Yan et al. used a model-assisted POD approach to validate the performance of different sensor configurations.⁹⁷ Chen et al. leveraged POD curves to determine the optimum Lamb wave driving frequency to detect fatigue crack growth in a metallic specimen.⁹⁸ Grooteman used the POD as an objective function to obtain the OSP for optical fibers applied to a stiffened composite panel.⁹⁹ Tabjula et al. used outlier analysis to minimize the number of sensing points in a Guided Lamb Wave (GLW) study, and POD curves were employed to quantify the system performance.¹⁰⁰

Specimens versus test structures

In NDE, a POD study is carried out by testing several specimens. Similarly, in SHM one should employ a certain number of identical structures, which are the equivalent of the specimens in the NDE study. However, this makes an already expensive procedure even more difficult to apply. Identical structures in SHM require identical sensing systems. Even though this is possible theoretically, Müller et al. showed that it is not feasible in practice.⁷⁹ This is the consequence of the amount of variability involved in the manufacturing process and sensor installation. Therefore, the POD curve will apply only to the specific structure and the specific sensing network configuration used to perform the POD study itself. For these reasons, the procedure becomes tremendously costly and time-consuming. Liu and Chang, in a US patent assigned to Acellent Technologies, propose to mimic the damage by bonding stiff metal or damping patches to the structure to create a POD database for a large structure.¹⁰¹ However, the authors also stated that using real damage to produce these POD curves may lead to more accurate results. The introduction of real damage implies that the damaged structure might not be reusable, which increases the time and cost.

Decision threshold for structural health monitoring systems

Sometimes data from the damaged condition may not be available or, if available, may not be statistically relevant. In such cases, the threshold can be chosen by exploiting algorithms developed for unsupervised learning problems.¹¹ A multitude of methods are available in the literature within the field of novelty detection. Some methodologies, such as outlier analysis, require that the feature vector is normally distributed.¹⁰² On the other hand, other approaches such as extreme value statistics¹⁰³ can be used without the normality assumption to determine the best threshold value. A prominent reference is the review of Markou and Singh, which summarized the main statistical¹⁰⁴ and neural network based¹⁰⁵ approaches. Due to these additional challenges in determining the proper threshold, Cobb et al. suggest using a hit/miss approach when dealing with SHM systems.¹⁰⁶ Monaco et al. propose a methodology to evaluate the threshold level in an SHM GLW study based on the statistical analysis of noise.³⁶ In their study, the Kolmogorov–Smirnov test is used to reject the null hypothesis, namely that the experimental data are not Gaussian distributed. The same approach for the selection of the damage threshold has been used by Memmolo et al. in a study concerning damage detection in a composite plate using a tomography technique based on GLW.¹⁰⁷ In Yue et al.,¹⁰⁸ the detection of multiple barely visible impact damage (BVID) in large composite aircraft panels is achieved by outlier analysis using a reference pristine database gathered from simple coupons and mono-stringer panels under a wide range of temperature variation.
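When only baseline (undamaged) data are available, a threshold can be set from the baseline distribution for a target PFA. The sketch below contrasts a threshold based on the normality assumption with a distribution-free empirical quantile. The empirical quantile is used here only as a simple stand-in for approaches that drop the normality assumption; it is not the extreme value statistics method cited above, and the function names and data are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def gaussian_threshold(baseline, target_pfa):
    """Threshold assuming a normally distributed baseline feature."""
    z = NormalDist().inv_cdf(1.0 - target_pfa)
    return mean(baseline) + z * stdev(baseline)


def empirical_threshold(baseline, target_pfa):
    """Distribution-free alternative: empirical (1 - PFA) quantile."""
    ordered = sorted(baseline)
    k = min(len(ordered) - 1, int((1.0 - target_pfa) * len(ordered)))
    return ordered[k]


def exceedance_rate(values, threshold):
    """Fraction of values strictly above the threshold."""
    return sum(v > threshold for v in values) / len(values)


# Hypothetical baseline damage index recorded on the pristine structure.
rng = random.Random(42)
baseline = [rng.gauss(0.0, 1.0) for _ in range(2000)]

thr_gauss = gaussian_threshold(baseline, 0.01)
thr_emp = empirical_threshold(baseline, 0.01)
print(f"Gaussian threshold:  {thr_gauss:.3f}")
print(f"Empirical threshold: {thr_emp:.3f}")
print(f"Baseline exceedance of empirical threshold: {exceedance_rate(baseline, thr_emp):.4f}")
```

For a baseline that really is Gaussian the two thresholds are close; for heavy-tailed baselines the Gaussian threshold would tend to underestimate the false call rate, which is the motivation for methods that model the tail directly.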

Temporal aspects of probability of detection

It is often stated that the major difference between NDE and SHM regarding POD development is that in the first case subsequent inspections are independent whereas in the second case they are correlated. Such a statement is not entirely true. It would be more accurate to state that the degree of statistical independence between subsequent measurements is greater in NDE than in SHM. Statistical independence is a property that holds only for random events.

For example, Forsyth showed that the hypothesis of statistical independence is not completely true for repeated inspections related to simulated penetrant and eddy current testing.⁴¹ This does not mean that one should jettison all the theory developed in MIL-HDBK-1823A, which assumes independent inspections. Even though different degrees of statistical independence can be present, the hypothesis of independence might be true enough to lead to consistent results.

However, this hypothesis is not valid for SHM systems, where a continuous stream of data is expected to be recorded from the structure. The degree of correlation between measurements separated by a small time interval cannot be ignored and must be properly handled. This intrinsic dependency of SHM measurements hinders the application of traditional statistical methods to produce POD curves, which is considered the most significant barrier preventing the widespread adoption of SHM technology.³⁴ In 2008 Shook et al. recognized this problem and developed a mathematical model to derive POD curves in the presence of repeated dependent data.¹⁰⁹ Discarding specific chunks of information might restore data independence but at the expense of compromising the effectiveness of the SHM methodology itself.⁵⁷ For this reason, several studies have been conducted to determine whether it is feasible to generalize the assessment methodology of the POD metric to SHM systems.⁴⁹

In the following section, Sequential Data Analysis is introduced, showing how it can deal with slowly evolving spurious signal changes due to EOCs, defect morphology, sensor drift and other kinds of variability sources.

Sequential data analysis

As reported in Table 2, EOCs are predominant sources of variability in SHM. They can affect the detection performance of the system and their effect must be considered. In NDE it is possible to make measurements for a certain damage at varying EOCs. However, it is not possible to do the same in SHM. The output of the SHM detection system depends on the whole history of the EOCs. Moreover, this is coupled with the damage evolution, which leads to the need of studying a tremendous number of structures.

For instance, in ultrasonic SHM studies temperature has been reported to be the predominant effect in EOCs.^84,110–112 There are two main methodologies for temperature compensation: the Optimal Baseline Selection (OBS) and the Baseline Signal Stretch (BSS). A good description of the OBS method can be found in the paper of Lu and Michaels,¹¹³ whereas the BSS methodology is applied in several references such as Croxford et al.,¹¹⁴ Michaels,¹¹⁵ Clarke et al.,¹¹² and Harley and Moura.¹¹⁶ Recently, data-driven methods have been developed for effective temperature compensation of large temperature variations up to 70°C,¹¹⁷ and for anisotropic materials.¹¹⁸
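The two compensation ideas can be illustrated with a deliberately simplified time-domain sketch. OBS is rendered here as picking, from a set of stored baselines, the one with the smallest residual energy with respect to the current signal. BSS is rendered as a grid search for the time-stretch factor that best maps one baseline onto the current signal. The function names, the linear interpolation, the candidate grid and the synthetic tone burst are all assumptions made for illustration; the cited implementations may differ substantially.

```python
import math


def stretch(signal, factor):
    """Resample so that output sample k equals the input at time k / factor."""
    n = len(signal)
    out = []
    for k in range(n):
        t = k / factor
        i = int(t)
        if i >= n - 1:
            out.append(signal[-1])  # hold the last sample beyond the record
        else:
            frac = t - i
            out.append((1.0 - frac) * signal[i] + frac * signal[i + 1])
    return out


def residual_energy(x, y):
    """Sum of squared differences between two equally sampled signals."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def optimal_baseline(baselines, current):
    """OBS sketch: pick the stored baseline closest to the current signal."""
    return min(baselines, key=lambda b: residual_energy(b, current))


def baseline_signal_stretch(baseline, current, factors):
    """BSS sketch: grid search for the stretch factor minimising the residual."""
    best = min(factors, key=lambda f: residual_energy(stretch(baseline, f), current))
    compensated = stretch(baseline, best)
    return best, [c - s for c, s in zip(current, compensated)]


# Synthetic tone burst and a slightly stretched copy (toy temperature effect).
baseline = [math.exp(-((k - 100) / 20.0) ** 2) * math.sin(0.3 * k) for k in range(200)]
factors = [0.990 + 0.001 * k for k in range(21)]
current = stretch(baseline, factors[14])
best_factor, residual = baseline_signal_stretch(baseline, current, factors)
```

In this toy case the current signal is an exact stretched copy of the baseline, so the residual after compensation vanishes. Any damage-related change would remain in `residual` and could then be passed to a feature extraction step.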

Liu et al. proposed a hybrid approach to tackle the problem of slowly evolving spurious signal changes due to EOCs.⁸⁸ The authors evaluated the signal response of a pipe monitoring system under varying EOCs for the undamaged structure. Then, the damage effect was synthetically superimposed on the undamaged signal. A BSS algorithm was used for temperature compensation, and the baseline subtraction, Singular Value Decomposition (SVD) and Independent Component Analysis (ICA) damage feature extraction methods were compared. The ICA approach proved to be, in general, the most efficient at producing reliable ROC curves.

In a recent article, Mariani and Cawley summarize other temperature compensation techniques developed in the last decade.¹¹⁹ Among them, the location-specific temperature compensation (LSTC) showed promising results for torsional guided wave signals in pipe monitoring,^120,121 resulting in a patent.¹²² The same authors proposed a change detection algorithm based on the Generalized Likelihood Ratio (GLR)¹²³ applied to data obtained through the LSTC or the OBS methods. Their method proved to be sensitive to departures from the pristine state of the structure. However, the methodology only applies if no sensor drift is present, which is one of the underlying assumptions of the change detection scheme. As a matter of fact, sensor aging and degradation remain an open challenge in the SHM field. Mariani et al. proposed a new methodology to address sensor drift on a thick copper block specimen.¹²⁴ They exploited the back wall echo ratio to reduce the influence of PZT sensor drift. Another recent article by Mariani et al. leveraged causal dilated convolutional neural networks to compensate for both EOCs and sensor drift.¹²⁵ Their algorithm, which is an adaptation of WaveNet (a deep neural network for audio waveforms),¹²⁶ outperformed the OBS and BSS approaches.

Probability of detection for structural health monitoring

In this section, three POD methods developed for SHM are presented: the Length at Detection (LaD) method, the Linear Mixed-effect Model (LMM) and the Random Effects Model (REM). These methods do not aim to address time-dependent data in the way that a sequential analysis does. However, they provide different frameworks to handle the statistical dependence of the measurements being collected.

The length at detection method

The LaD model solves the dependency of sensor data, which is characteristic of SHM systems, by taking the measurement when the crack/damage is detected for the first time. Therefore, there are no repeated measures because only the first crack detection is considered.⁴⁹ Roach et al. in 2007,¹²⁷ and Roach in 2009,¹²⁸ studied the reliability of Comparative Vacuum Monitoring (CVM) in aluminum and steel specimens using the LaD method. Sbarufatti et al. in 2016¹²⁹ and Sbarufatti and Giglio in 2017¹³⁰ applied the LaD methodology for fatigue crack monitoring on the tail boom stringers and fuselage panels of a helicopter using Fiber Bragg Grating (FBG) sensors within a model-assisted framework. The crack size recorded in each test corresponds to the one for which a clear and stable detection signal is produced. Therefore, it all boils down to the task of characterizing the probability distribution of the crack lengths at detection, and its cumulative distribution function represents the corresponding POD curve.¹³¹ Assuming that the crack population shows a Gaussian distribution

\mathrm{POD}(a) = \Pr(X < a) = Φ_{\mathrm{Norm}}\left(\frac{a - \bar{x}}{s}\right)

(12)

Similarly, if the crack population has a lognormal distribution, the crack length a in equation (12) is replaced by $\ln a$ .

The variables $\bar{x}$ and $s$ represent the sample mean and standard deviation, respectively. The assumption of a normal or lognormal distribution of the cracks at detection is not always easy to verify and therefore can be considered a limiting factor for this approach. One possibility is to use the so-called Anderson–Darling test.¹³² The assumption of normally or lognormally distributed crack lengths at detection can be rejected if the p-value provided by the test is lower than 0.05, which represents the chosen significance level.¹³¹ Probability plots are another useful tool to test the abovementioned assumptions. In this case, data (crack lengths at detection) are plotted against the theoretical normal (or lognormal) distribution. If the data lie approximately on a straight line, then it is possible to state that the population follows that probability distribution. The LaD methodology holds even for other kinds of statistical distributions such as the smallest extreme value and Weibull, and the largest extreme value and Fréchet distributions.⁴⁹ Further information about these distributions is available in Appendix C of reference 133.¹³³ Figure 4 uses synthetic data to simulate and visualize the working principle of the LaD method.

Figure 4.

LaD method using 10 specimens (shown by different markers). Each specimen has 40 measurements.
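A minimal sketch of equation (12) is given below, assuming one crack length at first detection per specimen. The function name `lad_pod`, the `lognormal` switch and the ten length values are hypothetical. The quantity `a90` is only the point estimate of the length detected with 90% probability, without any confidence bound.

```python
import math
from statistics import NormalDist, mean, stdev


def lad_pod(lengths_at_detection, lognormal=False):
    """Return POD(a) from equation (12), using the sample mean and standard deviation."""
    if lognormal:
        data = [math.log(x) for x in lengths_at_detection]
    else:
        data = list(lengths_at_detection)
    dist = NormalDist(mean(data), stdev(data))

    def pod(a):
        # For the lognormal case the crack length is replaced by ln(a).
        return dist.cdf(math.log(a) if lognormal else a)

    return pod


# Hypothetical crack lengths at first detection (mm), one per specimen.
lengths = [2.1, 2.6, 1.9, 2.4, 2.8, 2.2, 2.5, 2.0, 2.7, 2.3]
pod = lad_pod(lengths)
a90 = NormalDist(mean(lengths), stdev(lengths)).inv_cdf(0.90)
```

Evaluating `pod` at the sample mean returns 0.5 by construction, and `pod(a90)` returns 0.9.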

The confidence bound can be computed by exploiting statistical methods relying on the non-central t distribution,^49,133 or by applying the One-Sided Tolerance Interval (OSTI) approach. This methodology was first proposed by Roach for the detection of fatigue cracks using CVM. It provides an estimate of the upper bound containing a certain fraction of all measurements in the population with a given confidence level.⁹ The percentage of all measurements and the confidence level are the main factors affecting the result. The former is usually taken equal to 90% whereas the standard for the degree of confidence is 95%. The OSTI approach can provide a reliable analysis with only eight flaws, compared with the 51 required in a classic binary data POD.¹³⁴ Using the same symbols found in Roach,¹²⁸ the upper bound for the tolerance interval is given by

T = \bar{x} + K_{n, γ, α} \cdot s

(13)

where $T$ represents the upper bound of the tolerance interval, $\bar{x}$ denotes the mean of the detection lengths, $s$ is the standard deviation of the detection lengths, and $K$ is the probability factor, which takes into account three parameters.¹³⁵ The first parameter is the sample size $n$, the second is the confidence level $γ$, and the third is the detection level $α$. The $K$ value can be found in specific tables available in several statistics books; see, for example, the work of Krishnamoorthy and Mathew¹³⁶ or Meeker, Hahn, and Escobar.¹³³ The probability factor decreases as the sample size increases, which is consistent with the fact that a limited number of measurements is associated with a higher level of uncertainty in the sample mean and variance.⁹ Increasing the desired level of confidence leads to higher $K$ values, which is reasonable. Finally, the higher the detection level, the higher $K$ is, because an increase in the detection level must correspond to higher crack lengths.
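The trends described for the probability factor can be checked numerically with the sketch below. It does not use the exact non-central t computation. Instead it relies on a widely used closed-form approximation for one-sided normal tolerance factors (given, for example, in the NIST/SEMATECH e-Handbook of Statistical Methods), so the values are slightly lower than tabulated ones and tables should be preferred in an actual assessment. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def k_factor_approx(n, gamma=0.95, alpha=0.90):
    """Approximate one-sided tolerance factor K_{n, gamma, alpha} (normal population)."""
    z_alpha = NormalDist().inv_cdf(alpha)  # quantile for the detection level
    z_gamma = NormalDist().inv_cdf(gamma)  # quantile for the confidence level
    a = 1.0 - z_gamma ** 2 / (2.0 * (n - 1))
    if a <= 0.0:
        raise ValueError("sample size too small for this approximation")
    b = z_alpha ** 2 - z_gamma ** 2 / n
    return (z_alpha + sqrt(z_alpha ** 2 - a * b)) / a


def upper_tolerance_bound(lengths_at_detection, gamma=0.95, alpha=0.90):
    """Equation (13): T = mean + K * s, with the approximate K above."""
    k = k_factor_approx(len(lengths_at_detection), gamma, alpha)
    return mean(lengths_at_detection) + k * stdev(lengths_at_detection)
```

For instance, `k_factor_approx(10)` gives roughly 2.32, and the value decreases for larger `n` while increasing with either `gamma` or `alpha`, in line with the behavior discussed above.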

Linear mixed-effect model

Kabban et al. developed a statistical model in 2015 to produce POD curves for time-dependent data, extending the classical â vs a methodology.⁵⁷ Since the observations are no longer uncorrelated, it is not possible to use OLS or MLE.⁵⁵ One possibility is to rely on generalized least squares models (originally developed by Aitken in 1936¹³⁷), which are capable of handling such time dependency.⁵⁷ The second possibility is to use a Linear Mixed-effect Model (LMM). This kind of approach extends classical linear models and is particularly suited to datasets where data are not truly independent. The LMM acronym comes from the fact that the “model” to be fitted is “linear,” and that there is a “mixed effect”: a random effect (this could be the intercept or the slope) and a fixed one, which generally describes the expected trend of the data. The synthetic data considered in the study of Kabban et al. suggested applying an LMM with a random intercept.⁵⁷ Therefore, using the same terminology employed by the authors, there is a random intercept term for each experimental unit (EU), which is regarded as the basic primary experimental item used to collect data. Equation (14) explains how this approach translates into mathematical terms

â_{ij} = β_{0} + β_{0 i} + β_{1} a_{i j} + ε_{i j}; i = 1, \dots, n; j = 1, \dots, m

(14)

The term $â_{ij}$ denotes the jth measurement taken from the ith EU. The two fixed regression coefficients are represented by $β_{0}$ and $β_{1}$, while $a_{ij}$ is the actual crack dimension. The random intercept term is expressed by $β_{0i} \sim N(0, ω^{2})$, which follows a normal distribution with zero mean and variance $ω^{2}$. Finally, similarly to the conventional â vs a methodology, $ε_{ij} \sim N(0, τ^{2})$ is the error term. From a statistical point of view this makes a relevant difference with respect to the classical model presented in equation (5). Indeed, in the case of equation (14), the variance of the response depends on both the error and the random effect variances. Hence, this model assumes that measurements made within the same EU are correlated and that measurements taken from different EUs are independent. The parameter estimates can be derived from the marginal model, which averages over all the random effects to return an average expected response. Such a methodology allows data correlation to be incorporated into the variance of the marginal model error terms.
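The correlation structure implied by equation (14) can be made explicit with a short sketch. It assumes, as is standard for this kind of model, that the error terms are independent of each other and of the random intercept, so that two measurements of the same EU share only the intercept variance. The function names are hypothetical.

```python
def marginal_covariance(m, omega2, tau2):
    """Covariance matrix of the m responses of one EU under equation (14).

    Diagonal entries are omega2 + tau2 (random intercept plus error variance);
    off-diagonal entries are omega2 (shared random intercept only).
    """
    return [[omega2 + tau2 if j == k else omega2 for k in range(m)]
            for j in range(m)]


def within_eu_correlation(omega2, tau2):
    """Correlation between two different measurements of the same EU."""
    return omega2 / (omega2 + tau2)


cov = marginal_covariance(3, omega2=0.4, tau2=0.1)
rho_within = within_eu_correlation(omega2=0.4, tau2=0.1)
```

When the random intercept variance dominates the error variance, the within-EU correlation approaches one, which is the situation in which treating the measurements as independent is least justified.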

Random effects model

This model, sometimes referred to as the Repeated Measures Random Effects Model (REM²),¹³⁸ tries to generalize the â vs a analysis described in MIL-HDBK-1823A for SHM systems.⁴⁹ Moreover, it takes a step forward with respect to the LMM developed by Kabban et al.,⁵⁷ in the sense that it allows for a random intercept and a random slope at the same time. Every crack-sensor combination will produce a series of data which can be fitted using a line with its own slope and intercept. Therefore, the method computes the joint distribution of these parameters⁴⁹

â_{ij} = β_{0 i} + β_{1 i} (a - \bar{a}) + ε_{i j}; i = 1, \dots, n; j = 1, \dots, m

(15)

The subscript “i” denotes a certain crack-sensor combination, and “j” indicates a specific reading coming from that crack-sensor pair. Therefore, $â_{ij}$ represents the jth measurement response (for example, it could be a scalar value representing a certain damage index) of the ith crack-sensor combination. The regression coefficients $β_{0}$ and $β_{1}$ utilized in the conventional â vs a analysis are replaced by $β_{0i}$ and $β_{1i}$, highlighting once again that they are unique for every sensor-damage pair. Analogously, the error term $ε$ becomes $ε_{ij}$, which also depends on the specific sensor-damage reading. Moreover, the slope coefficient is not directly multiplied by the flaw length, but by the difference between the crack length and the sample mean of the crack lengths, $\bar{a}$, related to the entire dataset. The POD formula is given by equation (16)

\mathrm{POD}(a) = \Pr(â > a_{th}) = 1 - Φ_{\mathrm{Norm}}(z)

(16)

where the variable $z$ is given by the following equation

z = \frac{a_{th} - [μ_{β_{0}} + μ_{β_{1}} (a - \bar{a})]}{{[σ_{β_{0}}^{2} + {(a - \bar{a})}^{2} σ_{β_{1}}^{2} + 2 (a - \bar{a}) σ_{β_{0}} σ_{β_{1}} ρ + σ_{ε}^{2}]}^{\frac{1}{2}}}

(17)

In the numerator, $a_{th}$ is the detection threshold value in the response, while $μ_{β_{0}}$ and $μ_{β_{1}}$ are the means of the intercepts (referred to the crack size equal to $\bar{a}$ for all the crack/sensor pairs in the dataset) and of the slopes, respectively. In the denominator, instead of simply having the standard deviation $τ$, we have an expression containing several terms. Specifically, $σ_{β_{0}}$, $σ_{β_{1}}$ and $σ_{ε}$ represent the standard deviations of the intercepts (referred to the crack size equal to $\bar{a}$), of the slopes, and of the error term (for every flaw/sensor combination), respectively. Finally, $ρ$ is the measure of the correlation existing between the slopes and the intercepts. Figure 5 shows an example of this method with the aid of synthetic data.

Figure 5.

REM method using 10 specimens, each one with 20 measurements. The red dots indicate noise measurements.

The computation of the corresponding lower bounds can be achieved with an MLE approach, but Bayesian methods with weakly informative priors are also an option. The interested reader can delve into the work of Meeker et al. for further details about these procedures.⁴⁹
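Equations (16) and (17) translate directly into a few lines of code once the REM parameters have been estimated. The sketch below only evaluates the point estimate of the POD and does not compute any confidence bound. The function name, the argument names and the numerical values are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def rem_pod(a, a_bar, a_th, mu_b0, mu_b1, sd_b0, sd_b1, rho, sd_eps):
    """Point estimate of POD(a) from equations (16) and (17)."""
    d = a - a_bar
    mean_response = mu_b0 + mu_b1 * d
    variance = (sd_b0 ** 2 + d ** 2 * sd_b1 ** 2
                + 2.0 * d * sd_b0 * sd_b1 * rho + sd_eps ** 2)
    z = (a_th - mean_response) / sqrt(variance)
    return 1.0 - NormalDist().cdf(z)


# Hypothetical parameter estimates for a damage-index response.
params = dict(a_bar=5.0, a_th=1.0, mu_b0=1.0, mu_b1=0.2,
              sd_b0=0.1, sd_b1=0.05, rho=0.3, sd_eps=0.05)
pod_curve = [(a, rem_pod(a, **params)) for a in (3.0, 4.0, 5.0, 6.0, 7.0)]
```

In this toy setting the threshold equals the mean intercept, so the POD is exactly 0.5 at the mean crack length and moves away from 0.5 on either side.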

Comparison between length at detection and random effects model methods

This section discusses the main differences between the LaD and the REM methods. Since the REM method is a generalization of the LMM approach, the latter is not considered in this analysis.

Both the LaD and REM models are valid statistical tools to evaluate POD curves for SHM applications. The LaD approach has been used in particular to evaluate CVM performance in aerospace structures. Thanks to the Federal Aviation Administration (FAA) research program in SHM started in 2011, the use of this approach has recently started to be accepted in the U.S. by major original equipment manufacturers (OEMs) and airline operators such as Boeing and Delta.¹³⁹ Although the LaD offers a relatively simple approach, it discards some information and thus does not exploit the full potential of the specific SHM application. It also requires an assumption about the distribution of the crack lengths at detection (not always easy to verify), and different distribution choices can lead to significantly different results. On the other hand, the REM uses the whole dataset, which also makes the model more robust against departures from the model assumptions. More importantly, it can be compatible with a model-assisted approach, which makes it very attractive for future applications. A study by O'Connor tried to quantify the difference between these two statistical methods.¹⁴⁰ From a qualitative point of view, the study points out again that it is difficult to justify the use of a certain distribution (normal or lognormal) in the LaD method. Nevertheless, when few observations are available (fewer than 10) the LaD method seems to be more appropriate since it may not be possible to fit a five-parameter REM. Therefore, since the LaD is also lighter from a computational point of view, it might be preferable in certain engineering applications. In reference 140 the two methods are also compared quantitatively, using as reference the a₉₀ values computed for different datasets.¹⁴⁰ The LaD seemed to overestimate the a₉₀ when the normal distribution approximation of the crossing lengths was not appropriate. However, this results in a conservative prediction, which can be acceptable from an engineering point of view. The two methods showed comparable results except for situations where the $σ_{β_{1}}$ value was high. The LaD and REM models were derived for a single parameter describing the damage, which is in many cases the crack length. However, in real applications more quantities could affect the signal response, and hence a vector rather than a scalar value should be considered. In this case, more complex formulations of these models should be developed.

Previous SHM studies tended to neglect data dependency, but the literature now highlights that this is not the right way to proceed. Although these statistical methods are relatively new in the SHM field, there are already several case studies where they are leveraged. Recently Kessler et al. made use of both LaD and REM approaches to develop POD curves.¹³⁸ The authors used a 4-point bend test to grow a crack starting from an Electrical Discharge Machined (EDM) notch on aluminum bars. They monitored the crack evolution with a carbon nanotube (CNT) sensor, a sensor type with great potential for aerospace applications.¹⁴¹ A recent study, using the methodology proposed by Meeker,⁴⁹ made use of a Bayesian approach to derive POD curves for different case studies.¹⁴² Typically, in the majority of SHM systems based on ultrasound techniques, the relationship between the damage index and the damage size is not linear, which goes against the assumption made by Meeker.⁴⁹ However, linearity can be restored by applying a logit transformation to the damage index.

Multivariate-probability of detection

A single parameter may not be sufficient to describe the defect characteristics satisfactorily. In evaluating the corrosion present in aircraft structures, Bode et al. derived POD curves as a function of both defect size and percent corrosion.¹⁴³ Lee et al. created an M-POD surface using a multivariate log-logistic regression model based on hit/miss detection, where both the defect length and depth are considered as parameters in an ECT application.¹⁴⁴ In a similar ECT study, Hoppe developed an M-POD as a function of crack length, l, and depth, d, but this time extending the classical â vs a method.¹⁴⁵ In 2012 Aldrin et al., along the lines of the previous work of Hoppe,¹⁴⁵ found that including the crack depth in addition to the crack length reduced the model uncertainty by about 20%.¹⁴⁶ The same authors leveraged a physics-based model (VIC-3D^©) to consider several parameters and thus to reduce the variability and the required number of samples. Another case study regarding the ECT of fastener sites for fatigue cracks¹⁴⁷ revealed that the calibrated physics-based model, which considered multiple parameters rather than the crack length alone, performed better than the classic â vs a method.¹⁴⁸ Pavlović et al.¹⁴⁹ developed an M-POD for an ultrasonic inspection of a cast iron component. With this approach it was possible to compute several POD curves as a function of the desired variable while holding the other parameter values fixed. Yusa and Knopp pointed out that the M-POD in Pavlović et al.¹⁴⁹ was based on 12 coefficients which were not easy to compute, and that it is unlikely to have a uniform variance.¹⁵⁰ Therefore, they proposed a multi-parameter approach where even the variance is a function of the parameters instead of being constant. Alternatively, Gao et al. present a linear mixed-effect model describing the response of a vibrothermography test as a function of the vibration amplitude, pulse length, trigger force and crack length.¹⁵¹

Only recently have M-POD models for SHM systems with permanently mounted sensors been studied. M-POD models are particularly attractive for SHM applications because they provide frameworks to include the extra variability sources typical of SHM systems. To bring this approach into the SHM world, the aid of numerical simulations seems to be unavoidable. For SHM, M-POD often requires the use of a model-assisted approach.

Model-assisted probability of detection

Model-Assisted Probability of Detection (MAPOD) curves have their roots in NDE but at the same time provide a framework suitable for SHM studies.⁴⁹ An extensive review of MAPOD studies can be found in the Pacific Northwest report written by Meyer et al. in 2014.¹⁵² This research field was pioneered by Thompson, who led the MAPOD Working Group at Iowa State University from 2003 to 2010.¹⁵³ One objective of a MAPOD study is to reduce the amount of experimental data required to generate a reliable POD by gathering additional information through a physics-based model.¹⁵⁴ There are two main MAPOD variants: the transfer function approach (XFN) and the full model-assisted methodology (FMA).^46,154

The XFN exploits the relationship existing between the output signal of real flaws and that of synthetically produced flaws, which are easier and less costly to realize.¹⁵⁵ Using the XFN approach, starting from an existing fully empirical POD curve for a certain technique, it is possible to transfer these results to another similar configuration. The underlying transfer function may be computed by exploiting a physics-based model or through specific laboratory tests.¹⁵⁶

The FMA approach aims to predict the signal strength of a certain NDE/SHM technique as a function of several parameters and flaw properties, capturing all the variability sources, combining the information provided by physics-based models with empirical knowledge¹⁵⁶ such as experimental noise.¹⁵⁷ The first attempts were made on ultrasonic testing methods but the FMA concept is general and can be applied to other sensing techniques.⁴⁶

Thompson concluded that the XFN and the FMA approaches were just two sides of the same coin.¹⁵⁶ In 2008 a unified approach for MAPOD was proposed in the form of a protocol,¹⁵⁴ and later on the methodology was included in MIL-HDBK-1823A.¹⁶

Gianneo et al., taking the work of Pavlović et al.¹⁴⁹ as reference, leveraged MAPOD in a study concerning GLW for SHM systems with lightweight material.^83,85 From the M-POD curve (called in the paper “master” POD) the authors derived several conventional POD curves as a function of single parameters such as the flaw size, the angle with respect to the PZT sensors and the Lamb wave mode (A₀ or S₀). The remaining parameters, on the other hand, were treated as random variables. When numerical models are employed, their success in capturing all the variability sources strongly depends on the known unknowns.¹⁵⁸ Prior knowledge of the relevant variables is essential to obtain variability data from experiments and to integrate them into the numerical models, for example in the form of noise signals. For instance, Memmolo et al. decided to use a MAPOD approach for a GLW-based SHM technique.¹⁵⁹ The variability sources were considered by adding random noise to the FEM model output and by randomly choosing parameters related to the damage, such as its morphology and its position in the structure. Tschoke et al. studied the feasibility of MAPOD to produce POD maps in an automotive component made of Carbon Fiber Reinforced Polymers (CFRP), obtaining promising results.⁹⁰ This can be considered an M-POD since the additional parameter of the damage location is considered. Similarly, Leung and Corcoran evaluated the POD spatial distribution and combined this information with the probability of defect location.⁸⁷
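The general idea of adding random noise to a model output and drawing nuisance parameters at random can be sketched as a small Monte Carlo loop. The function `model_response` is a hypothetical toy stand-in for a physics-based model such as an FEM run, and the uniform angle distribution, noise level, threshold and number of draws are assumptions chosen only for illustration. This is not the procedure of any of the cited studies.

```python
import random


def model_response(size, angle_deg):
    """Hypothetical stand-in for a physics-based model (e.g. an FEM run)."""
    # Toy trend: response grows with damage size and weakens off-axis.
    return 0.08 * size * (1.0 - 0.3 * abs(angle_deg) / 90.0)


def mapod_curve(sizes, threshold, noise_sd, n_draws=2000, seed=0):
    """Estimate POD at each size as the fraction of simulated detections."""
    rng = random.Random(seed)
    curve = []
    for size in sizes:
        hits = 0
        for _ in range(n_draws):
            angle = rng.uniform(-90.0, 90.0)  # nuisance parameter drawn at random
            noise = rng.gauss(0.0, noise_sd)  # stand-in for experimental noise
            if model_response(size, angle) + noise > threshold:
                hits += 1
        curve.append(hits / n_draws)
    return curve


sizes = [1.0, 2.0, 4.0, 6.0, 8.0, 10.0]
pod = mapod_curve(sizes, threshold=0.3, noise_sd=0.05)
```

Each POD value requires `n_draws` model evaluations, which is affordable for a toy function but quickly becomes the bottleneck when every evaluation is a full numerical simulation.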

Metamodels

In general, the computational effort increases with the number of variability-related parameters considered in the M-POD. High-dimensional problems require many model evaluations, which can be prohibitive. A possible solution to lessen the computational burden is to leverage metamodels. Metamodels, sometimes referred to as surrogate models, are essentially simplified versions of the original physics-based model.¹⁶⁰ For example, CIVA^161,162 is a software package that supports metamodels for model-assisted POD studies in both NDE and SHM applications. Moreover, metamodels can be used for other purposes, such as sensitivity analysis through the evaluation of Sobol indices or the derivation of nonparametric POD curves.^160,163

Miorelli et al. show that in the CIVA software metamodels can be derived using the Output Space Filling Criterion or the Support Vector Regression algorithm. This kind of solution is particularly important in MAPOD studies. Engineers often make assumptions about the probability distributions of the parameters related to variability sources. These assumptions are difficult to verify, and a huge number of simulations is required to explore all the possible parameter combinations. Relying on conventional physics-based models leads to computational times that are unfeasible for practical applications. Dominguez et al. developed an algorithm to generate beams of POD curves and derive confidence bounds.¹⁶⁴ They generated a database to develop a surrogate model. The process is computationally expensive, but if beams of POD curves must be produced, the procedure soon becomes convenient.

Bayesian methods

Bayesian statistics can be regarded as another useful tool to manage the large amount of data required.¹⁶⁵ It provides a mathematical framework to take advantage of prior knowledge for inference and decision making.¹⁶⁶ In this case the prior concerns the quantity and type of damage present in the structure. The prior belief can then be updated once new experimental evidence is available. One may wonder why an approach that seems to fit this problem so nicely has not been considered in the past. The answer is that it can be very time consuming from a computational point of view. Nevertheless, recent progress in high-speed computing has made it possible to easily implement Markov Chain Monte Carlo techniques. These algorithms, combined with the use of physics-based models, allow the derivation of the likelihood required in Bayes’ formula. Even with a poorly informative prior, or with no prior at all, this approach can be applied simply by considering a uniform distribution for the prior. In this way the posterior will not be affected by the prior belief and relies entirely on the likelihood. The likelihood could be derived experimentally but also by exploiting information from physics-based models.¹⁶⁷ Although Bayesian statistics has already been leveraged in the field of NDE to develop POD curves,^168–171 to the best of the authors' knowledge its use in SHM reliability studies is limited. The Bayesian approach has the benefit of exploiting the full response of the measuring system, in contrast with conventional methods where the only information considered is whether or not a certain threshold is exceeded.¹⁷² Therefore, this is a promising field of research for SHM that should be further investigated.
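The role of the uniform prior can be illustrated with a toy grid-based update, which avoids Markov Chain Monte Carlo altogether. The example infers the mean of a Gaussian response with known standard deviation. The parameter being inferred, the grid, the data and the informative prior are all hypothetical and are not taken from the cited studies.

```python
from math import exp


def grid_posterior(data, grid, sigma, prior=None):
    """Posterior of a Gaussian mean over `grid` (known sigma).

    With prior=None a uniform prior is used, so the posterior is
    proportional to the likelihood alone.
    """
    if prior is None:
        prior = [1.0] * len(grid)
    unnorm = []
    for mu, p in zip(grid, prior):
        log_like = sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
        unnorm.append(p * exp(log_like))
    total = sum(unnorm)
    return [u / total for u in unnorm]


def mode(grid, posterior):
    """Grid value with the highest posterior probability."""
    return max(zip(posterior, grid))[1]


data = [2.1, 2.6, 1.9, 2.4]                    # hypothetical observations
grid = [1.5 + 0.05 * k for k in range(31)]     # candidate means from 1.5 to 3.0
flat = grid_posterior(data, grid, sigma=0.3)
informative = [exp(-0.5 * ((mu - 2.6) / 0.1) ** 2) for mu in grid]
informed = grid_posterior(data, grid, sigma=0.3, prior=informative)
```

With the uniform prior the posterior mode coincides with the sample mean of the data (2.25), whereas the informative prior centered at 2.6 pulls the mode towards its own center.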

Fusion of probability of detection curves

An SHM system is made of several sensor-damage combinations. Moreover, different types of sensors may coexist in the same structure with the aim of providing complementary information. Therefore, for the same structure several POD curves are expected to be produced, each one related to a certain measuring technique. Ameyaw et al. applied the concept of POD curves to vibration-based fault detection and isolation (FDI).^56,173,174 It was found that, depending on the sensor type, position and damage location, different POD curves are generated. Therefore, it was considered reasonable to develop a strategy to combine different POD curves related to different sensors. In this way all the available information is utilized, and the system reliability is expected to improve. Ameyaw et al. proposed a methodology in which, rather than fusing all the POD curves into a single POD curve, several belief values are computed using the Bayesian Combination Rule (BCR).^56,173,174 In this approach, all the possible sensor combinations are considered. For example, a certain damage could be detected (i.e., signal higher than the threshold value) only by certain sensors (each sensor with its own POD curve). By applying the BCR for each possible detection/missed detection combination, it is possible to derive a corresponding number of belief curves as a function of the damage size.

Nevertheless, it is often not desirable to fuse the curves, as this dilutes the available information. It would be more appropriate to apply sensor fusion at a lower level and derive a single POD curve using a single damage index that combines features extracted from different sensors. Several fusion algorithms exist in the literature^11,175–177 and therefore it is reasonable to think that more than one strategy may be applied.

Localization and sizing metrics

Probability of localization

The second phase of an SHM system aims to localize damage, and this section analyses the main progress made in quantifying the localization performance. Localization is only relevant to unknown damage location (UDL) SHM systems. Known damage location (KDL) SHM systems, sometimes also referred to as hot-spot monitoring, do not require any localization as the damage position is known. However, it must be clarified that hot-spot monitoring is not NDE. Even if they share the fact that the damage location is known, hot-spot monitoring systems cannot be treated using the conventional NDE methods because of all the considerations regarding data correlation discussed in the previous sections. Nevertheless, damage localization remains an essential component of the SHM paradigm because (1) it is not always possible to identify hot spots where damage is likely to occur in the structure, and (2) unexpected events such as impacts, or unknown failure mechanisms, are always possible. Moreover, SHM systems can produce information beyond mere damage existence. Therefore, additional metrics are required to reach adequate reliability standards.

Aldrin et al. claim in their study that such a metric should consist of an error with its corresponding uncertainty-related confidence bounds.²³ They gave an example with a potential candidate, the so-called Normalized Localization Accuracy (NLA)

NLA = \sqrt{\frac{1}{N_{p}} \sum_{i = 1}^{N_{p}} {(\frac{ε_{i}^{p}}{p_{i}^{'}})}^{2}}

(18)
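As a minimal sketch of how equation (18) could be evaluated, the function below takes the per-estimate location errors and the corresponding normalizing lengths and returns the NLA. The function name, argument names and numerical values are illustrative assumptions and do not come from Aldrin et al.

```python
import math


def normalized_localization_accuracy(errors, lengths):
    """Equation (18): root mean square of the errors, each divided by its
    normalizing length factor.

    errors:  location errors, one per estimate
    lengths: normalizing length factors, one per estimate
    """
    if len(errors) != len(lengths) or len(errors) == 0:
        raise ValueError("errors and lengths must be non-empty and equally long")
    squared_ratios = [(e / p) ** 2 for e, p in zip(errors, lengths)]
    return math.sqrt(sum(squared_ratios) / len(squared_ratios))


# Two estimates with errors of 3 and 4 length units, both normalized by 10.
nla = normalized_localization_accuracy([3.0, 4.0], [10.0, 10.0])
```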

The terms in equation (18) are the number of location estimates, $N_{p}$ , the error of the i^th estimation, $ε_{i}^{p}$ , related to the location, $p$ , and a normalizing length factor, $p_{i}^{'}$ . The resulting NLA is then used to derive the confidence bounds (with a specified level of confidence, for instance 95%) around the damage location estimate.

Gagar et al. investigated the location accuracy of Hsu-Nielsen and fatigue crack AE sources using broadband piezoelectric sensors. The authors considered the cumulative frequency of the error margin as the measurand and plotted it against the error to obtain a probability curve capable of reflecting the system performance.¹⁷⁸ In 2011 Flynn et al. proposed a novel damage localization algorithm, in the field of Guided-Wave (GW) propagation, based on the Rayleigh Maximum-Likelihood Estimate (RMLE).¹⁷⁹ The authors recognized the need for a statistical tool to compare the performance of their algorithm against other state-of-the-art methodologies. After clarifying that the peak sharpness around the damage location in an image cannot be considered a reliable metric, two approaches were proposed. The first idea was to produce a density map of the localization probability density function (LPDF). Although this method provides useful insights about the localization performance, its qualitative nature makes it unsuitable for SHM, where quantitative metrics are required to make decisions. The second approach has its foundations in the ROC curve. By analogy, Flynn et al. introduced the localizer operating characteristic (LOC) curve, whose points measure the likelihood of predicting a damage location inside a certain area around the true location.¹⁷⁹ Therefore, every damage location is expected to show its own LOC, and the global algorithm performance is assessed by averaging multiple LOCs.

Mallardo et al. in 2012 employed a genetic algorithm (GA) to solve an optimization problem regarding the OSP for impact localization in smart composite panels.⁹⁶ The evaluation of the fitness function, crucial in every GA, was achieved through the computation of the cumulative distribution function (CDF), which is obtained by integrating the probability density function (PDF). Here the PDF describes the probability of locating a certain damage with respect to different values of the error distance (the distance between the true and computed locations). The CDF turned out to be a reliable metric, capable of being used for the computation of the fitness function.⁹⁶ Moriot et al. drew on the LOC and the CDF methodologies and developed a Probability of Localization (POL) curve,^180,181 defined as the probability of locating the damage inside a tolerance circle of radius $ϵ$ . With that in mind, equation (19) represents the POL mathematical formulation as a function of $ϵ$ .

POL(ϵ) = \frac{1}{K} \sum_{j = 1}^{K} H(ϵ - AEL_{j})

(19)

where K is the number of experiments and AEL is the absolute error of localization, that is, the Euclidean distance between the computed location and the actual location of the flaw. In equation (20), $(x_{a}, y_{a})$ are the real damage location coordinates whereas $({\hat{x}}_{a}, {\hat{y}}_{a})$ are their corresponding estimates

AEL = \sqrt{{({\hat{x}}_{a} - x_{a})}^{2} + {({\hat{y}}_{a} - y_{a})}^{2}}

(20)

The symbol H represents the Heaviside step function. In this way, only the cases where $ϵ > AEL_{j}$ are counted, that is, only the cases where the estimated location falls inside a circle with radius $ϵ$ centered at the point of coordinates ( $x_{a}$ , $y_{a}$ ). Figure 6 illustrates the results using a synthetic dataset.

Figure 6.

POL computed according to the methodology of Moriot et al.¹⁸¹
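A minimal sketch of the POL computation is given below, assuming two-dimensional coordinates stored as NumPy arrays. The function names and the small dataset are illustrative and are not the data behind Figure 6. The strict inequality mirrors the statement that only cases with a tolerance radius larger than the AEL are counted, which corresponds to taking H(0) = 0.

```python
import numpy as np


def absolute_localization_error(estimates, truths):
    """Euclidean distance between each estimated and true damage location."""
    estimates = np.asarray(estimates, dtype=float)
    truths = np.asarray(truths, dtype=float)
    return np.linalg.norm(estimates - truths, axis=1)


def probability_of_localization(ael, radii):
    """Fraction of the K experiments whose AEL is strictly below each radius."""
    ael = np.asarray(ael, dtype=float)
    radii = np.asarray(radii, dtype=float)
    # Rows index the tolerance radii, columns index the experiments.
    inside = radii[:, None] > ael[None, :]
    return inside.mean(axis=1)


# Four experiments with the true damage at the origin (illustrative values).
truths = np.zeros((4, 2))
estimates = np.array([[3.0, 4.0], [0.0, 1.0], [6.0, 8.0], [0.0, 0.0]])
ael = absolute_localization_error(estimates, truths)
pol = probability_of_localization(ael, radii=[0.5, 2.0, 5.0, 20.0])
```

Because the POL is an empirical proportion, it is non-decreasing in the radius and reaches 1 once the radius exceeds the largest observed error.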

The same authors introduced the concept of Model-Assisted Probability of Localization (MAPOL) as a tool to generate synthetic data and build POL curves. Although this methodology represents a step forward in deriving a reliable localization metric in analogy with common POD curves, it has the drawback that its confidence bounds have no statistical meaning, because the POL is not the result of any regression task. This lack of uncertainty evaluation capability makes the approach inappropriate for many applications where decisions are made according to an acceptable risk.

Yue and Aliabadi studied a hierarchical approach for determining the reliability of SHM systems using guided waves.¹⁸² The third level of this methodology concerns damage localization and its performance metrics. They proposed to use the concepts of trueness and precision^183,184 to quantify the accuracy of the damage location estimates. The trueness, typically associated with a systematic error, is defined similarly to the AEL

Trueness = \sqrt{{(\bar{x} - x_{d})}^{2} + {(\bar{y} - y_{d})}^{2}}

(21)

where $(\bar{x}, \bar{y})$ denote the mean coordinates of the estimated locations whereas $(x_{d}, y_{d})$ are simply the true damage position coordinates. On the other hand, precision, which is usually compromised by random errors, is computed through the area of the ellipse linked to the covariance matrix of the estimated damage locations.¹⁸²

Precision = π a b \sqrt{χ_{2, 95\%}^{2}}

(22)

where a and b are the smallest and largest eigenvalues of the covariance matrix, and $χ_{2, 95\%}^{2}$ is the value of the chi-square distribution with two degrees of freedom at the 95% confidence level. An example is provided in Figure 7, where a synthetic dataset was employed.

Figure 7.

Location accuracy estimation with trueness and precision according to Yue and Aliabadi.¹⁸²
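The two quantities can be sketched as follows. The code transcribes equations (21) and (22) literally, taking a and b as the eigenvalues of the sample covariance matrix, as stated in the text. The use of the unbiased sample covariance, the function names and the data are assumptions made for illustration and are not the dataset behind Figure 7. For two degrees of freedom the chi-square quantile has the closed form -2 ln(1 - 0.95), so no statistics library is needed.

```python
import math

import numpy as np

# 95% quantile of the chi-square distribution with two degrees of freedom.
CHI2_2DOF_95 = -2.0 * math.log(1.0 - 0.95)


def trueness(estimates, true_location):
    """Equation (21): distance between the mean estimate and the true location."""
    mean_xy = np.mean(np.asarray(estimates, dtype=float), axis=0)
    return float(np.linalg.norm(mean_xy - np.asarray(true_location, dtype=float)))


def precision(estimates):
    """Equation (22) as written, with a and b the covariance eigenvalues."""
    cov = np.cov(np.asarray(estimates, dtype=float), rowvar=False)
    a, b = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
    return float(math.pi * a * b * math.sqrt(CHI2_2DOF_95))


# Four estimated locations scattered around the origin; true damage at (3, 4).
estimates = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]])
t = trueness(estimates, true_location=(3.0, 4.0))
p = precision(estimates)
```

A small trueness with a large precision value indicates estimates that are centered on the damage but widely scattered, whereas the opposite indicates a tight cluster with a systematic offset.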

The same authors developed a probabilistic framework based on Bayes’ law to quantify the probability of correctly determining the damage location inside a selected area.¹⁸² Leung and Corcoran developed the interesting concept of Probability of Damage Location (PDL) maps⁸⁷ that can be mathematically expressed as

PDL(i) = \frac{P_{f} (i)}{\sum_{k = 1}^{n} P_{f} (k)}

(23)

where the numerator represents the probability of the damage being present at the i^th location, and the denominator is the sum of all these probabilities among the n number of discretized locations considered in the analysis.
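Equation (23) is a normalization over the discretized locations, which can be sketched in a few lines. The input values are illustrative, and how the individual probabilities are obtained is outside the scope of this sketch.

```python
import numpy as np


def probability_of_damage_location(p_f):
    """Equation (23): normalize the per-location damage probabilities so that
    the resulting PDL map sums to one over the n discretized locations."""
    p_f = np.asarray(p_f, dtype=float)
    if np.any(p_f < 0):
        raise ValueError("probabilities must be non-negative")
    total = p_f.sum()
    if total <= 0:
        raise ValueError("at least one location must have non-zero probability")
    return p_f / total


# Four discretized locations (illustrative values only).
pdl = probability_of_damage_location([0.05, 0.15, 0.0, 0.30])
```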

There has been an increasing interest in the development of many different localization metrics. In the future, much effort should be devoted to defining a common methodology that lays the foundations for an accepted standard.

Probability of sizing

In the third phase of an SHM system, the main objective is to characterize the damage previously detected and localized. Although the previous two phases do not leave room for any misinterpretation, damage characterization or identification can be confusing. Depending on the specific application, it may be necessary to quantify the damage size, or to classify different damage shapes or different damage types. Taking composite materials as an example, being able to classify matrix cracking, delamination, fiber breakage, and fiber pull-out is as important as determining the actual damage size.

The Probability of Sizing (POS) could be regarded as the probability of correctly sizing a damage or a defect. In other words, it describes the accuracy of estimating the size of a defect.¹⁸⁵ Attempts to evaluate the sizing accuracy of a certain measurement technique have already been made in the past.¹⁸⁶ For example, in Automated Ultrasonic Testing (AUT) researchers evaluate the sizing performance using the so-called safety Limit against Under Sizing (LUS). The LUS metric, also known as 95% LUS, can be thought of as the lower 95% uncertainty bound of the linear regression model where the true size of a certain flaw (usually evaluated with destructive testing) is plotted against the value given by AUT.^187,188

Annis et al. recommend being particularly cautious in the use of the LUS metric. Indeed, the assumptions upon which the LUS is based, such as the linearity of the response and the homoscedasticity of the variance, are not necessarily true.⁵⁵ More generally, some authors, in analogy with the “â vs a” analysis, perform a regression of the measured versus the actual damage sizes.^189–192 For example, Lee et al. quantify the reliability of sizing results for axial outside diameter stress corrosion cracks, found near the top of the tube sheet in steam generator tubes.¹⁴⁴ They compute the coefficient of determination r ² of the linear regression between the true size, estimated by destructive examination, and the measured size, obtained by eddy current testing (ECT). The r ² score is then used as a reference to estimate the sizing performance of the ECT technique. Ginzel et al. revisited the equations originally developed by Ermolov in 1972¹⁹³ in order to predict the size of flaws given by ultrasonic methods.¹⁹⁴ The authors pointed out that the sizing accuracy depends on many parameters, including the measuring technique, the material, the structure layout, the defect orientation, etc.

Nath et al. proposed to assess the reliability of the Time-of-Flight Diffraction (TOFD) inspection method in terms of POD and POS curves.^189,190,195 Specifically, POS curves are developed similarly to POD curves using the â vs a method but replacing the signal response with the measured defect size as the â value. Then, the decision threshold was set either arbitrarily to a certain value or equal to the maximum difference between the measured flaw size â (the depth in that specific case) and the actual flaw size. Although Nath et al. claimed to have developed POS curves, the way such curves were built does not comply with the definition of POS.¹⁸⁵ Indeed, curves built in this way represent the probability that the estimated defect size is greater than a certain size and not the probability of correctly sizing the defect itself. Alternatively, by performing several inspections on a series of representative damages, it is possible to develop a probability density function for the damage severity, which is used to obtain an upper bound on the damage size.^196,197
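The regression-based assessment of sizing performance, such as the r ² score used by Lee et al., can be sketched with an ordinary least-squares fit of the measured size against the true size. The choice of the measured size as the response variable, the function name and the data are illustrative assumptions and are not taken from the cited studies.

```python
import numpy as np


def sizing_regression(true_size, measured_size):
    """Fit measured = slope * true + intercept and return (slope, intercept, r2)."""
    x = np.asarray(true_size, dtype=float)
    y = np.asarray(measured_size, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)          # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)     # total variation
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)


# Hypothetical flaw sizes from destructive examination and from inspection.
true_size = [1.0, 2.0, 3.0, 4.0]
measured_size = [1.1, 1.9, 3.2, 3.8]
slope, intercept, r2 = sizing_regression(true_size, measured_size)
```

A high r ² only indicates a strong linear relationship. A slope far from one or a non-zero intercept would still reveal systematic over- or under-sizing, which is one reason why r ² alone is a limited sizing metric.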

Aldrin et al. claim that a sizing metric should consist of an error with its corresponding uncertainty-related confidence bounds.²³ They gave an example with a potential candidate, the so-called Normalized Quantification Accuracy (NQA), which is a metric analogous to the NLA (see Probability of localization) but related to sizing. In 2014, in an attempt to better formalize the current sizing and localization metrics, the Characterization Error (CE), ê, was introduced. It is the difference between the estimated damage characteristic (location, size, depth, width, etc.), â, and the actual damage state, a.⁴⁵ This new metric is likely to be developed upon the mathematical framework provided in MIL-HDBK-1823A, but it is expected to be more complicated than traditional POD studies, requiring both engineering and statistical expertise to be applied. Poor characterization results could be attributed to a low signal-to-noise ratio, measurements close to the saturation level, ill-posed inversion problems, or failure mechanisms which are independent of the defect size.¹⁹⁸
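As a minimal sketch of the CE definition, the code below computes ê = â - a for a set of characterizations and summarizes it with a mean, a standard deviation and empirical percentile bounds. The percentile-based bounds are an assumption chosen for simplicity and do not represent the statistical framework of MIL-HDBK-1823A; the function names and data are illustrative.

```python
import numpy as np


def characterization_error(estimated, actual):
    """CE: estimated damage characteristic minus the actual damage state."""
    return np.asarray(estimated, dtype=float) - np.asarray(actual, dtype=float)


def summarize_error(errors, level=0.95):
    """Mean, sample standard deviation and empirical bounds of the error."""
    errors = np.asarray(errors, dtype=float)
    tail = (1.0 - level) / 2.0 * 100.0
    lower, upper = np.percentile(errors, [tail, 100.0 - tail])
    return {
        "mean": float(errors.mean()),
        "std": float(errors.std(ddof=1)),
        "lower": float(lower),
        "upper": float(upper),
    }


# Hypothetical estimated and actual damage sizes.
ce = characterization_error([1.1, 1.9, 3.2, 3.8], [1.0, 2.0, 3.0, 4.0])
summary = summarize_error(ce)
```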

Although defect characterization, and hence sizing, represents the third fundamental level of SHM,³² to the best of the authors’ knowledge there are no specific case studies which have attempted to develop such a metric for SHM systems.

Discussion and perspectives

In this article, the evolution of POD and the development of localization and sizing metrics have been described. This section summarizes the most relevant studies in the field (as shown in Table A1 in the Appendix) and discusses future perspectives in SHM reliability metrics.

The first observation arising from the “Field” column of Table A1 is the progressive shifting from NDE toward SHM studies, confirming the growth of the SHM field.

The “Metric” column shows that most of the studies are related to POD curves and only a few of them focus on localization and sizing. There is still a high heterogeneity in metrics for localization and sizing, and there are no unique, well-established definitions with an accepted standard. Moreover, the relationship between these metrics should be analyzed because SHM systems may have a hierarchical approach where damage detection, localization and sizing are performed in sequence. The final decision regarding the reliability of an SHM system could be based on all these metrics.

Sequential data analysis proved to be effective in addressing the important challenge of dealing with serially correlated time series data in SHM due to varying EOCs, defect morphology and sensor drift.

Statistical methodologies such as the LaD, LMM, and REM are becoming the new standards in the field. Although the LaD is a relatively simple approach, the LMM and the REM are more complex and require a sound understanding of advanced statistical tools.

The use of M-POD in SHM requires the aid of numerical models to compute POD curves as a function of several parameters. MAPOD approaches are therefore becoming fundamental tools for the derivation of M-POD. The biggest challenge is capturing all the variability sources of the system. As the number of parameters considered increases, the computational time rises tremendously due to the curse of dimensionality. Metamodels are good candidates to solve this challenge since they can reduce the simulation time by many orders of magnitude, allowing the production of families of POD curves.

When multiple sensing systems are employed, POD fusion methodologies can be used. This topic is still in its infancy and it is not yet clear whether it makes sense to fuse different POD curves, since doing so may dilute the available information. However, fusing the information from different sensors at a lower level could produce more effective damage indices, thus improving POD curves.

There are no methodologies available in the literature to handle hit/miss data in SHM because all the POD methodologies for SHM stem from the â vs a case. When hit/miss data are present in SHM they are erroneously treated with conventional methodologies used in NDE. Therefore, this topic needs further investigation.

The “Sensing” column showcases all the sensing techniques used in reliability assessment studies. A multitude of measurement techniques have been used, but at the same time others are unexpectedly rare. For instance, although acoustic emissions¹⁹⁹ and distributed optical fiber sensors²⁰⁰ have proved to be promising technologies in the field of SHM, reliability studies concerning these techniques in terms of POD curves are still limited to a few studies.

The lack of specific reliability metrics is one of the major impediments to the validation and certification of SHM systems. The “Objective” column highlights that there are many studies trying to assess the reliability of a specific system, but only a few studies focus on the development of new reliability metrics. This discrepancy suggests that more effort should be devoted to the derivation of common standard procedures to evaluate the performance of SHM systems in terms of detection, localization and sizing.

Even though an SHM study should be conducted with a real representative structure rather than simple specimens (as is typically done in NDE), this is not entirely reflected in the “Material and Structure” column of Table A1. Despite a few exceptions,^88,119 most of the analyzed literature uses simplified structural components or even specimens to develop POD curves.

The situation highlights two requirements for a POD study in SHM that pull in opposite directions: the need to reproduce a test structure as similar as possible to reality, which is capable of capturing all the variability sources, and the need for many such structures to obtain a statistically relevant study. The number of SHM studies leveraging numerical models for the development of POD curves is increasing, as shown in the “Numerical Analysis” column. However, it is often difficult to completely replace experimental data, since the model may not be able to represent the real structure accurately or to capture all the variability sources. A hybrid approach where experiments and numerical simulations are both exploited seems to be the most promising strategy.⁸⁸ Bayesian statistics could be used to integrate these two kinds of information. To the best of the authors’ knowledge this has only been done for NDE POD studies and should be the object of further investigation for SHM systems as well, where the issue of having a large amount of data is even more acute.

The column title “Damage and its Estimation” of Table A1 emphasizes that POD curves are plotted as a function of the estimated damage size rather than the true damage size. This point has never been addressed because it was implicitly assumed that the measurement technique used to assess the true damage size was much more accurate than the value provided by the SHM system. Although this assumption can hold, in many cases it does not, or it is simply impossible to prove. Therefore, this topic should be further investigated in order to include the crack length estimation uncertainty in the current SHM methods used to derive POD curves.

The concluding remarks of this review article are summarized below:

• The SHM field is growing faster than the development of its reliability metrics. Further research is needed to develop statistical methods capable of quantifying the performance of SHM systems.

• It is not possible to use conventional NDE POD curves in SHM.

• POD curves are usually developed assuming zero uncertainty in the damage size, which is not always acceptable.

• Sequential data analysis can be used to study SHM systems under varying EOCs and to address problems where sensor drift is present. It is a promising research field which deserves further investigation.

• Statistical models to generate POD in presence of dependent data are available (LaD and REM) but few studies use them to produce POD curves.

• Multiple parameters can be considered using M-POD. In M-POD studies the use of MAPOD is fundamental. The curse of dimensionality can be addressed using metamodels.

• Fusing data from several sensor sources may improve the POD and ROC curves.

• Capturing all the variability sources is the key to obtain reliable numerical models and meaningful experiments.

• Bayesian methods can be used to combine experimental and numerical data in POD studies for SHM. Recent studies showed that this approach is promising.

• Although recent studies have developed metrics for localization and sizing, there is no common, well-accepted standard as there is for POD curves. Research is needed to produce a protocol capable of unifying these scattered efforts, using statistical tools capable of producing confidence intervals.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Francesco Falcetelli

Appendix

References

Derriso

DeSimio

McCurry

, et al. Industrial Age non-destructive evaluation to Information Age structural health monitoring. Struct Health Monit 2014; 13: 591–600.

Farrar

Worden

. An introduction to structural health monitoring. Philosophical Trans R Soc A: Math Phys Eng Sci 2007; 365: 303–315.

SAE International . Aerospace Recommended Practice: Guidelines for Implementation of Structural Health Monitoring on Fixed Wing Aircraft, ARP6461, 2013.

Balageas

Fritzen

C-P

Güemes

(eds) Structural health monitoring. London; Newport Beach, CA: ISTE; 2006.

Boller

Chang

F-K

Fujino

(eds) Encyclopedia of structural health monitoring. Chichester, West Sussex, UK: John Wiley; 2009.

Aliabadi

Sharif-Khodaei

(eds) Structural health monitoring for advanced composite structures. New Jersey: World Scientific; 2017.

Roach

. Validation and Verification Processes to Certify SHM Solutions for Commercial Aircraft Applications. SAND2013-7867C. Albuquerque, New Mexico: Sandia National Labs, 2013.

Achenbach

. Structural health monitoring - What is the prescription? Mech Res Commun 2009; 36: 137–142.

Roach

Rice

Swindell

. Convergence of Multiple Statistical Methods for Calculating the Probability of Detection from SHM Sensor Networks. In: Structural Health Monitoring 2017. Lancaster, PA: DEStech Publications, Inc.; 2017. Epub ahead of print 28 September 2017. DOI: 10.12783/shm2017/14209.

10.

Roach

Rice

. Wide Area Monitoring of Aircraft Structures Using Acousto-Ultrasonic Sensor Networks. China: Xiamen, https://www.osti.gov/servlets/purl/1398324 (2016, accessed 12 October 2020).

11.

Farrar

Worden

. Structural Health Monitoring: A Machine Learning Perspective. Chichester, West SussexHoboken, NJ: Wiley, 2012.

12.

Roach

. Does the Maturity of Structural Health Monitoring Technology match user Readiness? In: In Proceedings of the International Workshop on SHM. Stanford, CA, USA, 2011.

13.

Seaver

Chattopadhyay

Papandreou-Suppapola

, et al. Workshop on transitioning structural health monitoring technology to military platforms. J Intell Mater Syst Structures 2013; 24: 2063–2073.

14.

Cawley

. Structural health monitoring: closing the gap between research and industrial deployment. Struct Health Monit 2018; 17: 1225–1244.

15.

Cawley

. A development strategy for structural health monitoring applications. J Nondestructive Eval Diagn Prognostics Eng Syst 2021; 4: 041012.

16.

Department of Defense Handbook . Nondestructive Evaluation System Reliability Assessment, 2009.

17.

Lindgren

Buynak

Aldrin

, et al. Model-assisted Methods for Validation of Structural Health Monitoring Systems. In: Proceedings of the 7th International Workshop on Structural Health Monitoring. Stanford, CA; 2009, pp. 2188–2195.

18.

Aldrin

Medina

Santiago

, et al. Demonstration Study for Reliability Assessment of SHM Systems Incorporating Model-Assisted Probability of Detection Approach. Burlington, VT, pp. 1543–1550.

19.

Lindgren

Buynak

. The need and requirements for validating damage detection capability. In: Proceedings of the 8th International Workshop on SHM. DEStech Publications, Inc.; 2011, pp. 2444–2451.

20.

Chang

. The Need of SHM Quantification for Implementation, https://hal.inria.fr/hal-01010064 (2014, accessed 27 April 2021).

21.

Güemes

Fernandez-Lopez

Pozo

, et al. Structural Health Monitoring for Advanced Composite Structures: A Review. J Compos Sci 2020; 4: 13.

22.

Kabban

Derriso

. Certification in structural health monitoring systems. In: Proceedings of the 8th International Workshop on SHM. Stanford, CA, USA, pp. 2429–2436.

23.

Aldrin

Medina

Lindgren

, et al. Model-Assisted Probabilistic Reliability Assessmnt for Structural Health Monitoring Systems. AIP, 1965–1972.

24.

Medina

Aldrin

. Value Assessment Approaches for Structural Life Management. In: Boller

Chang

F-K

Fujino

(eds). Encyclopedia of Structural Health Monitoring. Chichester, UK: John Wiley & Sons, p. shm189.

25.

Department of Defense . Procedures for Performing a Failure Mode, Effects, and Criticality Analysis. Washington, DC: MIL-STD-1629A, 1980.

26.

Aldrin

Medina

Lindgren

, et al. Protocol for Reliability Assessment of Structural Health Monitoring Systems Incorporating Model-assisted Probability of Detection (MAPOD) Approach. In: Proceedings of the 8th international workshop on structural health monitoring. Stanford; 2011, pp. 2452–2459.

27.

Environmental Conditions and Test Procedures for Airborne Equipment. RTCA Paper No. 111-04/SC135-645, Washington, DC, 2005.

28.

Department of Defense Test Method Standard for Environmental Engineering Considerations . 2000.

29.

Department of Defense Interface Standard Requirements for the Control of Electromagnetic Interference Characteristics of Subsystems and Equipment. 1999.

30.

Kessler

. Certifying a Structural Health Monitoring System: Characterizing Durability, Reliability and Longevity. In: Proceedings of the 1st International Forum on Integrated Systems Health Engineering and Management in Aerospace. Napa, CA; 2005.

31.

Janapati

Kopsaftopoulos

, et al. Damage detection sensitivity characterization of acousto-ultrasound-based structural health monitoring techniques. Struct Health Monit 2016; 15: 143–161.

32.

Rytter

. Vibrational Based Inspection of Civil Engineering Structures. PhD Thesis, Dept. Of Building Technology and Structural Engineering. Aalborg , Denmark: Aalborg University, 1993.

33.

Doebling

Farrar

Prime

, et al. Damage Identification and Health Monitoring of Structural and Mechanical Systems from Changes in Their Vibration Characteristics: A Literature Review. LA--13070-MS, p. 249299.

34.

Derriso

McCurry

Schubert Kabban

. A novel approach for implementing structural health monitoring systems for aerospace structures. In: Structural Health Monitoring (SHM) in Aerospace Structures. Elsevier, pp. 33–56.

35.

Department of Defense Standard Practice. Aircraft Structural Integrity Program (ASIP). MIL-STD-1530c, United States Air Force, 2005.

36.

Monaco

Memmolo

Ricci

, et al. Guided waves based SHM systems for composites structural elements: statistical analyses finalized at probability of detection definition and assessment. In: Kundu

(ed) San Diego. CA, p. 94380M.

37.

Casella

Berger

. Statistical Inference. 2nd ed. Australia; Pacific Grove, CA: Thomson Learning, 2002.

38.

Annis

. False Positives, https://statistical-engineering.com/false-positives/(accessed 22 December 2020).

39.

Berens

. NDE reliability data analysis. In: Metals handbook Volume 17: Nondestructive evaluation and quality control. Metals Park, Oh: American Society for Metals; 1989, pp. 689–701.

40.

Kanzler

Müller

. How Much Information Do We Need? A Reflection of the Correct Use of Real Defects in POD-Evaluations. Minneapolis, MN, 2008.

41.

Forsyth

. On the independence of multiple inspections and the resulting probability of detection. In: AIP Conference Proceedings . Montreal, Canada. AIP, pp. 2159–2166.

42.

Wright

. How to Implement a PoD into a Highly Effective Inspection Strategy. Burlington, ON: Canada: NDT.net, https://www.ndt.net/events/NDTCanada2016/app/content/Paper/44_Wright.pdf (2016, accessed 29 December 2020).

43.

Petrin

Annis

Vukelich

. A recommended methodology for quantifying NDE/NDI based on aircraft engine experience. AGARD LECTURE SERIES 1993; 190.

44.

Achenbach

. Quantitative nondestructive evaluation. Int J Sol Structures 2000; 37: 13–27.

45.

Aldrin

Annis

Sabbagh

, et al. Assessing the Reliability of Nondestructive Evaluation Methods for Damage Characterization. Baltimore, MD, 2078, p. 2071.

46.

Meeker

WQRB

. Thompson’s Contributions to Model Assisted Probability of Detection. Burlington, VT, pp. 83–94.

47.

Gandossi

Simola

. Derivation and use of probability of detection curves in the nuclear industry. Insight - Non-Destructive Test Condition Monit 2010; 52: 657–663.

48.

Kanzler

Müller

Pitkänen

. Combining of Different Data Pools for Calculating a Reliable POD for Real Defects. Idaho: Boise, 1924–1932.

49.

Meeker

Roach

Kessler

. Statistical Methods for Probability of Detection in Structural Health Monitoring. In: Structural Health Monitoring 2019. DEStech Publications, Inc.; 2019. Epub ahead of print 15 November 2019. DOI: 10.12783/shm2019/32095.

50.

Annis

Gandossi

Martin

. Optimal sample size for probability of detection curves. Nucl Eng Des 2013; 262: 98–105.

51.

Gandossi

Annis

. ENIQ Report No. 47. Luxembourg: Publications Office, 2012.Influence of sample size and other factors on Hit/Miss Probability of Detection Curves

52.

Annis

Gandossi

. Probability of Detection Curves: Statistical Best-Practices. ENIQ Report No 41. Luxembourg: Publications Office of the European Union, https://publications.jrc.ec.europa.eu/repository/bitstream/111111111/15527/1/reqno_jrc56672_jrc56672_eniq_report_41.pdf%5B1%5D.pdf (2010, accessed 12 September 2020).

53.

Koh

Y-M

Meeker

. Methods for Planning a Statistical POD Study. Denver, CO, pp. 1725–1732.

54.

Aldrin

Annis

Sabbagh

, et al. Best practices for evaluating the capability of nondestructive evaluation (NDE) and structural health monitoring (SHM) techniques for damage characterization. In: AIP Conference Proceeding, Minneapolis, MN, 2016. Epub ahead of print 2016. DOI: 10.1063/1.4940646.

55.

Annis

Aldrin

Sabbagh

. What is Missing in Nondestructive Testing Capability Evaluation?. Mater Eval; 73.

56.

Ameyaw

Rothe

Söffker

. A novel feature-based probability of detection assessment and fusion approach for reliability evaluation of vibration-based diagnosis systems. Struct Health Monit 2020; 19: 649–660.

57.

Schubert Kabban

Greenwell

DeSimio

, et al. The probability of detection for structural health monitoring systems: Repeated measures data. Struct Health Monit 2015; 14: 252–264.

58.

Swets

. Measuring the accuracy of diagnostic systems. Science 1988; 240: 1285–1293.

59.

Fawcett

. An introduction to ROC analysis. Pattern Recognition Lett 2006; 27: 861–874.

60.

Rouhan

Schoefs

. Probabilistic modeling of inspection results for offshore structures. Struct Saf 2003; 25: 379–399.

61.

Schoefs

Clément

Nouy

. Assessment of ROC curves for inspection of random fields. Struct Saf 2009; 31: 409–419.

62.

Fawcett

. ROC Graphs: Notes and Practical Considerations for Data Mining Researchers. Technical Report HPL-2003–4. Palo Alto, CA: HP Laboratories, 2003.

63.

Meeker

Jeng

S-L

Chiou

C-P

, et al. Improved Methodology for Predicting POD of Detecting Synthetic Hard Alpha Inclusions in Titanium. In: Thompson

Chimenti

(eds) Review of Progress in Quantitative Nondestructive Evaluation. Boston, MA: Springer US, pp. 2021–2028.

64.

Oehlert

. A Note on the Delta Method. The Am Statistician 1992; 46: 27–29.

65.

Cox

. Delta Method. In: Armitage

Colton

(eds). Encyclopedia of Biostatistics. Chichester, UK: John Wiley & Sons, p. b2a15029.

66.

Annis

. How Hit/miss Models Work, https://statistical-engineering.com/how-hit-miss-models-work/(accessed 14 December 2020).

67.

Virkkunen

Koskinen

Papula

, et al. Comparison of â Versus a and Hit/Miss POD-Estimation Methods: A European Viewpoint. J Nondestructive Eval 2019; 38: 89.

68.

Knopp

Grandhi

Zeng

, et al. Considerations for statistical analysis of nondestructive evaluation data: hit/miss analysis. E-Journal Adv Maintenance 2012; 4: 105–115.

69.

Standard Practice for Probability of Detection Analysis for Hit/Miss Data. ASTM E2862-18.

70.

Harding

. Statistical Analysis of Probability of Detection Hit/Miss Data for Small Data Sets. In: AIP Conference Proceedings. Bellingham. Washington (USA): AIP, pp. 1838–1845.

71.

Annis

Knopp

. Comparing the Effectiveness of a90/95 Calculations. In: AIP Conference Proceedings. Portland, OR: AIP, pp. 1767–1774.

72.

Spencer

. The Calculation and Use of Confidence Bounds in POD Models. In: AIP Conference Proceedings. Portland, OR: AIP; 1791-1798.

73.

Cheng

RCH

Iles

. Confidence Bands for Cumulative Distribution Functions of Continuous Random Variables. Technometrics 1983; 25: 77–86.

74.

Cheng

RCH.

lies

. One-Sided Confidence Bands for Cumulative Distribution Functions. Technometrics 1988; 30: 155–159.

75.

Spencer

Meeker

. Distinguishing between Uncertainty and Variability in Nondestructive Evaluation. Burlington, VT, pp. 1725–1732.

76. Meeker, Thompson, et al. Physical Model Assisted Probability of Detection in Nondestructive Evaluation. San Diego, CA, pp. 1541–1548.

77. Berens, Hovey. Statistical Methods for Estimating Crack Detection Probabilities. In: Bloom, Ekvall (eds) Probabilistic Fracture Mechanics and Fatigue Methods: Applications for Structural Design and Maintenance. West Conshohocken, PA: ASTM International, pp. 79–94.

78. Forsyth. Structural Health Monitoring and Probability of Detection Estimation. Minneapolis, MN, 2004.

79. Müller, Janapati, Banerjee, et al. On the performance quantification of active sensing SHM systems using model-assisted POD methods. In: Proceedings of the 8th International Workshop on Structural Health Monitoring. Stanford, CA: DEStech Publications, Inc.; 2011, pp. 2417–2428.

80. Janapati, Kopsaftopoulos, Roy, et al. Sensor network configuration effect on detection sensitivity of an acousto-ultrasound-based Active SHM system. In: Proceedings of the 9th International Workshop on Structural Health Monitoring. Stanford, CA: DEStech Publications, Inc.; 2013, pp. 2147–2156.

81. Pado, Ihn J-B, Dunne. Understanding Probability of Detection (POD) in Structure Health Monitoring systems. In: Proceedings of the 9th International Workshop on Structural Health Monitoring. Stanford, CA: DEStech Publications, Inc.; 2013, pp. 2107–2114.

82. Hayo, Frankenstein, Boller, et al. Approach to the Technical Qualification of a SHM System in terms of Damage Detection in Aerospace Industry. In: Proceedings of the International Workshop Smart Materials, Structures & NDT in Aerospace, https://www.ndt.net/article/ndtcanada2011/papers/76_Hayo_Rev2.pdf (2011, accessed 10 April 2021).

83. Gianneo, Carboni, Giglio. A Preliminary Study of Multi-Parameter POD Curves for a Guided Waves Based SHM Approach to Lightweight Materials. Minneapolis, MN, p. 030018.

84. Sohn. Effects of environmental and operational variability on structural health monitoring. Philosophical Trans R Soc A: Math Phys Eng Sci 2007; 365: 539–560.

85. Gianneo, Carboni, Giglio. Feasibility study of a multi-parameter probability of detection formulation for a Lamb waves-based structural health monitoring approach to light alloy aeronautical plates. Struct Health Monit 2017; 16: 225–249.

86. Mandache, Genest, Khan, et al. Considerations on structural health monitoring reliability. In: Proceedings of the International Workshop Smart Materials, Structures & NDT in Aerospace. Montreal, Quebec, https://www.ndt.net/article/ndtcanada2011/papers/17_Mandache_Rev1.pdf (2011, accessed 26 April 2021).

87. Leung MSH, Corcoran. Evaluating the probability of detection capability of permanently installed sensors using a structural integrity informed approach. J Nondestructive Eval 2021; 40: 82.

88. Liu, Dobson, Cawley. Efficient generation of receiver operating characteristics for the evaluation of damage detection in practical structural health monitoring applications. Proc R Soc A: Math Phys Eng Sci 2017; 473: 20160736.

89. Howard, Cegla. Detectability of corrosion damage with circumferential guided waves in reflection and transmission. NDT E Int 2017; 91: 108–119.

90. Tschoke, Mueller, Memmolo, et al. Feasibility of model-assisted probability of detection principles for structural health monitoring systems based on guided waves for fibre-reinforced composites. IEEE Trans Ultrason Ferroelect, Freq Contr 2021: 1.

91. Tan, Zhang. Computational methodologies for optimal sensor placement in structural health monitoring: A review. Struct Health Monit 2020; 19: 1287–1308.

92. Flynn, Todd. A Bayesian approach to optimal sensor placement for structural health monitoring with application to active sensing. Mech Syst Signal Process 2010; 24: 891–903.

93. Azarbayejani, El-Osery, Choi, et al. A probabilistic approach for optimal sensor allocation in structural health monitoring. Smart Mater Structures 2008; 17: 055019.

94. Markmiller JFC, Chang F-K. Sensor network optimization for a passive sensing impact detection technique. Struct Health Monit 2010; 9: 25–39.

95. Mallardo, Sharif Khodaei, Aliabadi. A Bayesian approach for sensor optimisation in impact identification. Materials 2016; 9: 946.

96. Mallardo, Aliabadi, Khodaei. Optimal sensor positioning for impact localization in smart composite panels. J Intell Mater Syst Structures 2013; 24: 559–573.

97. Yan, Laflamme, Leifsson. Computational framework for dense sensor network evaluation based on model-assisted probability of detection. Mater Eval 2020; 78: 573–583.

98. Chen C-D, Chiu Y-C, Huang Y-H, et al. Assessments of structural health monitoring for fatigue cracks in metallic structures by using Lamb waves driven by piezoelectric transducers. J Aerospace Eng 2021; 34: 04020091.

99. Grooteman. Damage detection and probability of detection for a SHM system based on optical fibres applied to a stiffened composite panel. In: Proceedings of the 25th International Conference on Noise and Vibration Engineering, ISMA2012 in conjunction with the 4th International Conference on Uncertainty in Structural Dynamics, USD. Leuven, Belgium: Katholieke Universiteit Leuven; 2012, pp. 3317–3330.

100. Tabjula, Kanakambaran, Kalyani, et al. Outlier analysis for defect detection using sparse sampling in guided wave structural health monitoring. Struct Control Health Monit 2021; 28. DOI: 10.1002/stc.2690. Epub ahead of print March 2021.

101. Liu, Chang F-K. Generating Damage Probability-Of-Detection Curves in Structural Health Monitoring Transducer Networks. US8069011B2.

102. Worden, Manson, Fieller NRJ. Damage detection using outlier analysis. J Sound Vibration 2000; 229: 647–667.

103. Roberts. Novelty detection using extreme value statistics. IEE Proc - Vis Image, Signal Process 1999; 146: 124–129.

104. Markou, Singh. Novelty detection: a review-part 1: statistical approaches. Signal Process 2003; 83: 2481–2497.

105. Markou, Singh. Novelty detection: a review-part 2. Signal Process 2003; 83: 2499–2521.

106. Cobb, Fisher, Michaels. Model-assisted probability of detection for ultrasonic structural health monitoring. In: European-American Workshop on Reliability of NDE. Berlin, Germany, https://www.ndt.net/article/reliability2009/Inhalt/th2a2.pdf (2009, accessed 16 April 2021).

107. Memmolo, Maio, Boffa, et al. Damage detection tomography based on guided waves in composite structures using a distributed sensor network. Opt Eng 2015; 55: 011007.

108. Yue, Khodaei, Aliabadi. Damage detection in large composite stiffened panels based on a novel SHM building block philosophy. Smart Mater Structures 2021; 30: 045004.

109. Shook, Millwater, Enright, et al. Simulation of Recurring Automated Inspections on Probability-of-Fracture Estimates. Struct Health Monit 2008; 7: 293–307.

110. Mariani, Heinlein, Cawley. Compensation for temperature-dependent phase and velocity of guided wave signals in baseline subtraction for structural health monitoring. Struct Health Monit 2020; 19: 26–47.

111. Raghavan, Cesnik CES. Effects of Elevated Temperature on Guided-wave Structural Health Monitoring. J Intell Mater Syst Structures 2008; 19: 1383–1398.

112. Clarke, Simonetti, Cawley. Guided wave health monitoring of complex structures by sparse array systems: Influence of temperature changes on performance. J Sound Vibration 2010; 329: 2306–2322.

113. Michaels. A methodology for structural health monitoring with diffuse ultrasonic waves in the presence of temperature variations. Ultrasonics 2005; 43: 717–731.

114. Croxford, Wilcox, Drinkwater, et al. Strategies for guided-wave structural health monitoring. Proc R Soc A: Math Phys Eng Sci 2007; 463: 2961–2981.

115. Michaels. Detection, localization and characterization of damage in plates with an in situ array of spatially distributed ultrasonic sensors. Smart Mater Structures 2008; 17: 035035.

116. Harley, Moura. Scale transform signal processing for optimal ultrasonic temperature compensation. IEEE Transactions Ultrasonics, Ferroelectrics, Frequency Control 2012; 59: 2226–2236.

117. Fendzi, Rébillat, Mechbal, et al. A data-driven temperature compensation approach for Structural Health Monitoring using Lamb waves. Struct Health Monit 2016; 15: 525–540.

118. Yue, Aliabadi. A scalable data-driven approach to temperature baseline reconstruction for guided wave structural health monitoring of anisotropic carbon-fibre-reinforced polymer structures. Struct Health Monit 2020; 19: 1487–1506.

119. Mariani, Cawley. Change detection using the generalized likelihood ratio method to improve the sensitivity of guided wave structural health monitoring systems. Struct Health Monit 2020: 147592172098183.

120. Mariani, Heinlein, Cawley. Location Specific Temperature Compensation of Guided Wave Signals in Structural Health Monitoring. IEEE Trans Ultrason Ferroelectrics, Frequency Control 2020; 67: 146–157.

121. Guided Ultrasonics Ltd., https://www.guided-ultrasonics.com/ (accessed 4 October 2021).

122. Mariani. Signal Processing, WO2020058663.

123. Lai. Sequential analysis: some classical problems and new challenges. Stat Sinica 2001; 11: 303–351.

124. Mariani, Liu, Cawley. Improving sensitivity and coverage of structural health monitoring using bulk ultrasonic waves. Struct Health Monit 2020: 147592172096512.

125. Mariani, Rendu, Urbani, et al. Causal dilated convolutional neural networks for automatic inspection of ultrasonic signals in non-destructive evaluation and structural health monitoring. Mech Syst Signal Process 2021; 157: 107748.

126. Oord AVD, Dieleman, Zen, et al. WaveNet: A Generative Model for Raw Audio. arXiv:1609.03499 [cs], http://arxiv.org/abs/1609.03499 (2016, accessed 6 October 2021).

127. Roach, Rackow, DeLong, et al. Use of Composite Materials, Health Monitoring, and Self-Healing Concepts To Refurbish Our Civil and Military Infrastructure. SAND2007-5547. Albuquerque, New Mexico 87185 and Livermore, California 94550: Department of Energy - Sandia National Laboratories, https://core.ac.uk/download/pdf/71318799.pdf (2007, accessed 12 September-November 2020).

128. Roach. Real time crack detection using mountable comparative vacuum monitoring sensors. Smart Structures Syst 2009; 5: 317–328.

129. Sbarufatti, Corbetta, San Millan, et al. Model-Assisted Performance Qualification of a Distributed SHM System For Fatigue Crack Detection on a Helicopter Tail Boom. Bilbao, Spain, EWSHM, https://www.ndt.net/search/docs.php3?id=19881&msgID=0&rootID=0 (2016, accessed 8 October 2021).

130. Sbarufatti, Giglio. Performance qualification of an on-board model-based diagnostic system for fatigue crack monitoring. J Am Helicopter Soc 2017; 62: 1–10.

131. Roach, Rice, Neidigk, et al. Establishing the Reliability of SHM Systems Through the Extrapolation of NDI Probability of Detection Principles. In: Structural Health Monitoring. DEStech Publications; 2015. Epub ahead of print 2015. DOI: 10.12783/SHM2015/330.

132. Anderson, Darling. A Test of Goodness of Fit. J Am Stat Assoc 1954; 49: 765–769.

133. Meeker, Hahn, Escobar. Statistical Intervals: A Guide for Practitioners and Researchers. Second edition. Hoboken, NJ: Wiley, 2017.

134. Roach. Calculating Probability of Detection for SHM Systems Using One-Sided Tolerance Intervals – Applications & Limitations. SAND2015-3998PE. Sandia National Labs, https://www.osti.gov/servlets/purl/1258205 (2015, accessed 3 April 2021).

135. Roach, Swindell. Generating Viable Data to Accurately Quantify the Performance of SHM Systems. In: Structural Health Monitoring 2019. DEStech Publications, Inc.; 2019. Epub ahead of print 15 November 2019. DOI: 10.12783/shm2019/32220.

136. Krishnamoorthy, Mathew. Statistical Tolerance Regions: Theory, Applications, and Computation. Hoboken, NJ: Wiley, 2009.

137. Aitken. IV.-On least squares and linear combination of observations. Proc R Soc Edinb 1936; 55: 42–48.

138. Kessler, Dunn, Swindell, et al. Detection Sensitivity Analysis for a Potential Drop (PD) Structural Health Monitoring (SHM) System. In: Proceedings of the 12th International Workshop on Structural Health Monitoring. Stanford, CA, USA: DEStech Publications, Inc.; 2019. Epub ahead of print 10 September 2019. DOI: 10.12783/shm2019/32219.

139. Swindell, Doyle, Roach. Integration of Structural Health Monitoring Solutions onto Commercial Aircraft via the Federal Aviation Administration Structural Health Monitoring Research Program. Atlanta, GA, USA, p. 070001.

140. O’Connor. Quantifying Method Differences in Predicting the Probability of Detection for Structural Health Monitoring Applications. Master of Science (MS) in Statistics. Iowa State University, 2019, https://lib.dr.iastate.edu/cgi/viewcontent.cgi?article=1268&context=creativecomponents (accessed 1 January 2021).

141. Kessler, Thomas, Borgen, et al. Carbon Nanotube Appliques for Fatigue Crack Diagnostics. In: Structural Health Monitoring. DEStech Publications; 2015. Epub ahead of print 2015. DOI: 10.12783/SHM2015/207.

142. Mishra, Yadav, Chang. Reliability of Probability of Detection (POD) of Fatigue Cracks for Built-in Acousto-Ultrasound Technique as "in-situ" NDE. In: Structural Health Monitoring 2019. DEStech Publications, Inc.; 2019. Epub ahead of print 15 November 2019. DOI: 10.12783/shm2019/32506.

143. Bode, Ashbaugh, Boyce, et al. Corrosion structured experiment. In: AIP Conference Proceedings. Brunswick, Maine: AIP, pp. 1779–1786.

144. Lee, Park, Kim, et al. Evaluation of ECT reliability for axial ODSCC in steam generator tubes. Int J Press Vessels Piping 2010; 87: 46–51.

145. Hoppe. A Parametric Study of Eddy Current Response for Probability of Detection Estimation. Kingston (Rhode Island), pp. 1895–1902.

146. Aldrin, Sabbagh, Murphy, et al. Demonstration of Model-Assisted Probability of Detection Evaluation Methodology for Eddy Current Nondestructive Evaluation. Burlington, VT, pp. 1733–1740.

147. Aldrin, Knopp, Lindgren, et al. Model-assisted probability of detection evaluation for eddy current inspection of fastener sites. In: AIP Conference Proceedings. Chicago, IL: AIP, pp. 1784–1791.

148. Aldrin, Knopp, Sabbagh. Bayesian Methods in Probability of Detection Estimation and Model-Assisted Probability of Detection Evaluation. Denver, CO, pp. 1733–1740.

149. Pavlović, Takahashi, Müller. Probability of detection as a function of multiple influencing parameters. Insight 2012; 54: 606–611.

150. Yusa, Knopp. Evaluation of probability of detection (POD) studies with multiple explanatory variables. J Nucl Sci Tech 2016; 53: 574–579.

151. Gao, Meeker, Mayton. Detecting cracks in aircraft engine fan blades using vibrothermography nondestructive evaluation. Reliability Eng Syst Saf 2014; 131: 229–235.

152. Meyer, Crawford, Lareau, et al. Review of Literature for Model Assisted Probability of Detection. PNNL--23714, p. 1183633.

153. Model-Assisted POD Working Group, https://static.cnde.iastate.edu/mapod/index.htm (accessed 1 March 2021).

154. Thompson, Chimenti. A Unified Approach to the Model-Assisted Determination of Probability of Detection. In: AIP Conference Proceedings. Golden, CO: AIP, pp. 1685–1692.

155. Smith. POD Transfer Function Approach. Palm Springs, CA: MAPOD Working Group Meeting, https://static.cnde.iastate.edu/mapod/2005%20February/POD%20Transfer%20Function%20Approach_Smith.pdf (accessed February-June 2005-2021).

156. Thompson, Brasche, Forsyth, et al. Recent Advances in Model-Assisted Probability of Detection. Berlin, Germany: NDT.net, https://www.ndt.net/article/reliability2009/Inhalt/we1a1.pdf (2009, accessed 1 March 2021).

157. Gallina, Paćko, Ambroziński. Model assisted probability of detection in structural health monitoring. In: Advanced Structural Damage Detection: From Theory to Engineering Applications. Wiley, pp. 57–72.

158. Austin, Ziehl, et al. Development and validation of Acoustic Emission structural health monitoring for aerospace structures. In: Proceedings of the 9th International Workshop on Structural Health Monitoring. Stanford, CA, USA: DEStech Publications, Inc.; 2013, pp. 2123–2129.

159. Memmolo, Ricci, Maio, et al. Model assisted probability of detection for a guided waves based SHM technique. In: Kundu (ed). Las Vegas, NV, p. 980504.

160. Foucher, Fernandez, Leberre, et al. New Tools in CIVA for Model Assisted Probability of Detection (MAPOD) to Support NDE Reliability Studies, 2018, p. 12.

161. CIVA, https://www.extende.com/ (accessed 6 October 2021).

162. Foucher, Lonne, Toullelan, et al. An overview of validation campaigns of the CIVA simulation software, 9.

163. Spencer, Thompson, Chimenti. Nonparametric POD Estimation for Hit/Miss Data: A Goodness of Fit Comparison for Parametric Models. San Diego, CA, pp. 1557–1564.

164. Dominguez, Reboud, Dubois, et al. A New Approach of Confidence in POD Determination Using Simulation. Denver, CO, pp. 1749–1756.

165. Mahadevan, Rebba. Validation of reliability computational models using Bayes networks. Reliability Eng Syst Saf 2005; 87: 223–232.

166. Meeker, Escobar. Introduction to the Use of Bayesian Methods for Reliability Data. In: Statistical methods for reliability data. New York, NY: Wiley; 1998, pp. 343–368.

167. Thompson, Diaz, Shull, et al. (eds). NDE simulations: critical tools in the integration of NDE and SHM. San Diego, California, USA, p. 729402.

168. Jenson, Dominguez, Willaume, et al. A Bayesian Approach for the Determination of POD Curves from Empirical Data Merged with Simulation Results. Denver, CO, pp. 1741–1748.

169. Syed Akbar Ali, Kumar, Rao, et al. Bayesian synthesis for simulation-based generation of probability of detection (PoD) curves. Ultrasonics 2018; 84: 210–222.

170. Forsyth, Leemans. Bayesian Approaches to Using Field Inspection Data in Determining the Probability of Detection, 62.

171. Kanzler, Mueller, Ewert, et al. Bayesian Approach for the Evaluation of the Reliability of Non-Destructive Testing Methods. In: 18th World Conference on Nondestructive Testing. Durban, South Africa: NDT.net, p. 6.

172. Thompson, Chimenti. A Bayesian Approach to the Inversion of NDE and SHM Data. Kingston (Rhode Island), pp. 679–686.

173. Ameyaw, Rothe, Söffker. Probability of Detection (POD)-Oriented View to Fault Diagnosis for Reliability Assessment of FDI Approaches. In: Volume 8: 30th Conference on Mechanical Vibration and Noise. Quebec City, Quebec, Canada: American Society of Mechanical Engineers (ASME), p. V008T10A041.

174. Ameyaw, Rothe, Söffker. Adaptation and Implementation of Probability of Detection (POD)-based Fault diagnosis in elastic structures through vibration-based SHM approach. In: 9th European Workshop on Structural Health Monitoring. Manchester, UK: NDT.net, https://www.ndt.net/article/ewshm2018/papers/0029-Ameyaw.pdf (2018, accessed 12 September 2020).

175. Hall, Llinas. An introduction to multisensor data fusion. Proc IEEE 1997; 85: 6–23.

176. Kralovec, Schagerl. Review of structural health monitoring methods regarding a multi-sensor approach for damage assessment of metal and composite structures. Sensors 2020; 20: 826.

177. Eleftheroglou, Zarouchas, Loutas, et al. Structural health monitoring data fusion for in-situ life prognosis of composite structures. Reliability Eng Syst Saf 2018; 178: 40–54.

178. Gagar, Irving, Jennions, et al. Development of probability of detection data for structural health monitoring damage detection techniques based on acoustic emission. In: Proceedings of the 8th International Workshop on Structural Health Monitoring. Stanford, CA; 2011, pp. 1391–1398.

179. Flynn, Todd, Wilcox, et al. Maximum-likelihood estimation of damage location in guided-wave structural health monitoring. Proc R Soc A: Math Phys Eng Sci 2011; 467: 2575–2596.

180. Moriot, Quaegebeur, Le Duff, et al. Characterization of the robustness of SHM imaging techniques using the absolute error of localization. In: Proceedings of the 8th European Workshop on Structural Health Monitoring. Bilbao, https://www.ndt.net/events/EWSHM2016/app/content/Paper/279_Moriot.pdf (2016, accessed 12 April 2020).

181. Moriot, Quaegebeur, Le Duff, et al. A model-based approach for statistical assessment of detection and localization performance of guided wave-based imaging techniques. Struct Health Monit 2018; 17: 1460–1472.

182. Yue, Aliabadi. Hierarchical approach for uncertainty quantification and reliability assessment of guided wave-based structural health monitoring. Struct Health Monit 2020: 147592172094064.

183. ISO 5725-1:1994. Accuracy (Trueness and Precision) of Measurement Methods and Results — Part 1: General Principles and Definitions.

184. ISO 5725-2:2019. Accuracy (trueness and precision) of measurement methods and results — Part 2: Basic method for the determination of repeatability and reproducibility of a standard measurement method.

185. Visser. Great Britain, Health and Safety Executive. POD/POS Curves for Non-destructive Examination. Sudbury: HSE Books, 2002.

186. Førli. Nordtest Report. NT TECHN REPORT 394. Nordtest Report, 1998.

187. Ducharme, Rigault, Strijdonk, et al. Automated Ultrasonic Phased Array Inspection of Fatigue Sensitive Riser Girth Welds with a Weld Overlay Layer of Corrosive Resistant Alloy. NDT.net, 2012, https://www.ndt.net/article/ndtnet/2012/1_Ducharme.pdf.

188. Spencer, Todorov, White, et al. Advanced Technologies and Methodology for Automated Ultrasonic Testing Systems Quantification. EWI Project No. 50454GTH. Washington, DC: U.S. Department of Transportation Pipeline and Hazardous Materials Safety Administration, https://primis.phmsa.dot.gov/matrix/FilGet.rdm?fil=6729&s=E95FE952E286468B9FCB6FDA258E65E3 (accessed 29 April 2011).

189. Nath, Balasubramaniam, Krishnamurthy, et al. Reliability assessment of manual ultrasonic time of flight diffraction (TOFD) inspection for complex geometry components. NDT E Int 2010; 43: 152–162.

190. Nath. Effect of variation in signal amplitude and transit time on reliability analysis of ultrasonic time of flight diffraction characterization of vertical and inclined cracks. Ultrasonics 2014; 54: 938–952.

191. Barrett, Smith, Modarres. A multivariate model to assess the probability of detection and sizing of defects in aluminum panels using eddy current inspections. Eng Fail Anal 2018; 94: 182–194.

192. Schneider CRA, Rudlin. Review of statistical methods used in quantifying NDT reliability. Insight - Non-Destructive Test Condition Monit 2004; 46: 77–79.

193. Ermolov. The reflection of ultrasonic waves from targets of simple geometry. Non-Destructive Test 1972; 5: 87–91.

194. Ginzel, Kanters. Ermolov Sizing Equations Revisited. NDT.net; 7, https://www.ndt.net/article/v07n01/ginzel/ginzel.htm (2002, accessed 2 March 2021).

195. Nath. Estimates of Probability of Detection and Sizing of Flaws in Ultrasonic Time of Flight Diffraction Inspections for Complex Geometry Components With Grooved Surfaces. J Nondestructive Eval Diagn Prognostics Eng Syst 2021; 4: 021003.

196. Beard, Liu C-C, Chang F-K. Design of a robust SHM system for composite structures. In: Davis, Henderson, McMickell (eds). San Diego, CA, p. 652709.

197. Brennan, de Leeuw. The Use of Inspection and Monitoring Reliability Information in Criticality and Defect Assessments of Ship and Offshore Structures. In: Volume 2: Structures, Safety and Reliability. Estoril, Portugal: ASMEDC, pp. 921–925.

198. Aldrin, Annis, Sabbagh, et al. Case Study on NDE Characterization Metrics for Optimization, Validation and Quality Control. Boise, Idaho, pp. 845–855.

199. Saeedifar, Zarouchas. Damage characterization of laminated composites using acoustic emission: A review. Composites B: Eng 2020; 195: 108039.

200. Di Sante. Fibre optic sensors for structural health monitoring of aircraft composite structures: recent advances and applications. Sensors 2015; 15: 18666–18713.

201. Servais, Ibarra-Castanedo, Maldague, et al. Probability of detection for in field thermal non destructive testing of aircraft composite structures. In: Proceedings of the 2010 International Conference on Quantitative InfraRed Thermography. QIRT Council; 2010. Epub ahead of print 2010. DOI: 10.21611/qirt.2010.124.

202. Duan, Servais, Genest, et al. ThermoPoD: A reliability study on active infrared thermography for the inspection of composite materials. J Mech Sci Tech 2012; 26: 1985–1991.

203. Kurz, Jüngert, Dugan, et al. Reliability considerations of NDT by probability of detection (POD) determination using ultrasound phased array. Eng Fail Anal 2013; 35: 609–617.

204. Junyan, Yang, Fei, et al. Study on probability of detection (POD) determination using lock-in thermography for nondestructive inspection (NDI) of CFRP composite materials. Infrared Phys Tech 2015; 71: 448–456.

205. Roach. Use of Comparative Vacuum Monitoring Sensors for Automated, Wireless Health Monitoring of Bridges and Infrastructure. In: Proceedings of the 9th International Conference on Bridge Maintenance, Safety and Management, https://www.osti.gov/servlets/purl/1525921 (2018, accessed 12 October 2020).

206. Heinlein, Cawley, Vogt. Validation of a procedure for the evaluation of the performance of an installed structural health monitoring system. Struct Health Monit 2019; 18: 1557–1568.

207. Calmon, Chapuis, Jenson, et al. The Use of Simulation in POD Curves Estimation: An Overview of the IIW Best Practices Proposal, 7.

208. Huthwaite. Accelerated finite element elastodynamic simulations using the GPU. J Comput Phys 2014; 257: 687–707.

Probability of detection, localization, and sizing: The evolution of reliability metrics in Structural Health Monitoring
