Abstract
Prognostics and health management (PHM) is becoming increasingly important as engineering structures and systems grow more complex. Many of these systems lack accurate physical models to describe their degradation, especially in unpredictable scenarios. To meet safety regulations, robust prognostic models are needed to transform sensor data into reliable predictions about a system’s remaining useful life (RUL). This study presents the adaptive hidden semi-Markov model (AHSMM), a novel probabilistic approach that enhances RUL prediction accuracy, uncertainty quantification (UQ), and reliability assessment compared to a long short-term memory (LSTM) model. A key contribution is an in-house experimental campaign involving glass fiber-reinforced polymer specimens subjected to fatigue loading and multiple impact events at different locations and time intervals. Unlike traditional models that rely on data from similar damage histories, the AHSMM is trained exclusively on unimpacted specimens and tested on multiply impacted ones, showcasing its adaptability to previously unseen conditions. The study also introduces a new prognostic performance measure tailored to AHSMM and develops a conditional reliability analysis for both AHSMM and LSTM predictions. Results demonstrate that AHSMM consistently outperforms LSTM across all evaluation metrics. It achieves a 24% lower RMSE over the full lifetime and superior UQ, with an average coverage of 0.79 compared to 0.17 for LSTM. Conditional reliability analysis further shows that AHSMM provides more accurate and stable reliability estimates as data accumulates. By capturing the degradation process and adapting to evolving conditions, AHSMM strengthens prognostic robustness. This study highlights the need for robust PHM models that can handle real-world uncertainties and contribute to advancements in the aerospace, automotive, and defense industries.
Introduction
Prognostics and health management (PHM) plays a crucial role in ensuring operational safety, improving the reliability and availability of engineering assets, and reducing maintenance costs. A critical parameter in achieving these objectives is the prediction of remaining useful life (RUL), which is an output of the prognostics models. 1 Prognostics models are typically classified into three main types: physics-based, data-driven, and hybrid models.2,3
Physics-based models describe the underlying degradation process using mathematical equations. 4 However, these models often face limitations, as they tend to assume specific operational conditions and are typically applicable only to simple systems or components. In practice, most engineering systems and structures are much more complex, and physical models may not be available. 5
Data-driven models, on the other hand, use historical degradation data from similar systems to make predictions. 6 Supervised data-driven models, such as deep learning (DL) models, including long short-term memory (LSTM) networks and convolutional neural networks, have demonstrated high accuracy in various applications. However, these models are highly sensitive to the quality and quantity of training data, as well as to operational variability, unexpected phenomena, and environmental uncertainties common in real-world scenarios. Such challenges can lead to poor generalization and unreliable predictions.7–9
In contrast, unsupervised data-driven models, such as stochastic models and Bayesian filters, offer a promising alternative for robust prognostics, particularly when labeled failure data are scarce or unavailable.10,11 Additionally, stochastic models and Bayesian filters inherently provide uncertainty quantification (UQ), whereas UQ must be introduced separately in DL models through various mechanisms. 12 The probabilistic nature of stochastic models and Bayesian filters allows for RUL predictions with associated confidence levels.
Hybrid models seek to combine the strengths of both physics-based and data-driven approaches, aiming to improve prediction performance while reducing computational complexity. However, the practical implementation of these models remains challenging, limiting their widespread adoption. 13
Due to the limitations of physics-based models, especially for complex systems, PHM has increasingly shifted its focus to data-driven approaches. In structural health monitoring (SHM), for example, data are typically collected using nondestructive testing techniques or networks of permanently attached sensors that gather real-time information about the degradation process.14,15 However, the degradation of a structural component is heavily influenced by operational conditions such as environmental factors and load variations. Since it is impossible to account for all potential conditions during training, it becomes crucial to develop robust prognostics models that can maintain their performance even under unforeseen operational or environmental conditions. 16
In industries such as aerospace, automotive, and defense, composite materials are widely used due to their desirable specific strength and stiffness, but they also present unique challenges for prognostics. These materials are not only vulnerable to impact damage, which can significantly reduce their load-bearing capacity, but their behavior is also complex and difficult to model, with no readily available physics-based models. Though considerable research has focused on diagnostic techniques to identify damage mechanisms,17,18 fewer studies have directly addressed RUL prediction, particularly under realistic operational conditions. Most existing literature focuses on residual strength or damage progression estimation,19–23 leaving a gap in RUL prediction research, which is critical for effective maintenance planning and mission operations.
Several studies have explored data-driven prognostics for composites under controlled conditions, where training and testing data are similar. For example, nonhomogeneous hidden semi-Markov models (NHHSMMs) have been used successfully to predict the RUL of composite structures based on acoustic emission (AE) 24 and digital image correlation (DIC) 25 data from carbon fiber-reinforced polymer (CFRP) open-hole coupons. Similarly, Gaussian process regression (GPR) has been employed for RUL prediction using health indicators (HIs) derived from AE and Lamb waves, yielding reliable results under fixed operational conditions.26,27
While these studies provide valuable insights, their applicability in real-world scenarios, where operational conditions often vary, is limited. A few studies have attempted to predict RUL under variable conditions. For instance, a physics-based Particle Filter model was used to estimate the RUL of composites subjected to impact tests, based on electromechanical behavior. 19 However, this model is limited to monitoring electrical resistance in CFRP materials and does not account for fatigue loading, a key factor in load-bearing applications. Other research efforts have introduced Bayesian frameworks that correlate stiffness degradation with RUL,28,29 but these models focus mainly on fatigue life estimation rather than addressing variable operational conditions. In the study by Cheng et al., 30 a progressive damage model for residual strength and fatigue life of impacted glass fiber-reinforced polymer (GFRP) composite laminates after different loading conditions was developed. It was observed that the impacts had a greater effect on the fatigue strength of the composites when under compressive loads and that the loading sequence significantly affects damage evolution. Similarly, in the study by Zhao et al., 31 a model to predict the fatigue strength after impact of composite coupons was used. The model showed an average error of approximately 15%. The model parameters were fitted from the experimental data, and the only monitoring performed was the size of the impact damage via C-scan. These studies did not rely on experimental parameters, and no SHM systems were employed, making it less appealing for more complex systems where failure data may be unavailable. On the other hand, in the study by Zarouchas et al., 32 in situ high-speed impacts were performed on CFRP open hole coupons during tensile fatigue loading, and AE and DIC were used to monitor the behavior. 
Even though the two monitoring systems efficiently monitored the degradation process, no predictions of the fatigue life or strength were performed.
In recent years, there have been efforts to enhance the robustness of data-driven prognostics models under varying operational conditions and unexpected events. For example, a study examined RUL prediction of single stiffened composite panels using AE-based HIs, comparing the performance of GPR and Bayesian neural networks. Both models exhibited similar results, with GPR showing faster training times. 33 The same study also explored the use of strain-based HIs and GPR for predicting RUL, 34 while a similarity learning hidden semi-Markov model (HSMM) was employed to improve performance. 35 A further study attempted to upscale this methodology to multistiffened composite panels using GPR and LSTM. 36 The performance of both models was only average, and it was noted that regression models are reliant on the input data. Another study introduced an adaptive NHHSMM, trained on open-hole composite specimens subjected to fatigue loading and tested on specimens that experienced an impact during fatigue loading. 37 These studies highlight the importance of prognostic models capable of modeling the degradation process rather than regressing on the SHM data.
The need for robust prognostics in composite materials is particularly urgent for two reasons. First, composites are vulnerable to impact damage, which can severely compromise their load-bearing capacity. Second, composites are susceptible to impact fatigue, which occurs when multiple impacts further degrade their structural integrity. While impact fatigue has been extensively reviewed, 38 it is often studied separately from fatigue loading, even though these two types of fatigue typically occur together, exacerbating the operational challenges of composite structures. The combined effect of these factors is further amplified when operational conditions vary, underscoring the importance of accurately estimating their impact on RUL.
This article presents a methodology for robust prognostic modeling of composite structures, focusing on model performance under complex and unpredictable scenarios. The adaptive HSMM (AHSMM) is proposed, incorporating an adaptation mechanism similar to that in the study by Eleftheroglou et al. 37 Unlike prior studies that considered only single-impact events, this work introduces an experimental campaign involving multiple impacts applied at different locations during fatigue loading. This setup better reflects operational conditions in defense and aerospace applications, where systems are exposed to repeated and varied impact events.
To capture these events, SHM sensors are employed. From the sensor data, statistical features are extracted to derive a degradation signal, which is then used as input to the AHSMM for predicting the RUL of multiply impacted composite coupons. The complete framework is illustrated in Figure 1.

Prognostic framework using SHM data and AHSMM for RUL prediction of multiimpacted composite structures. SHM: structural health monitoring; RUL: remaining useful life; AHSMM: adaptive hidden semi-Markov model.
The data division strategy trains the model on unimpacted specimens and evaluates it on multiply impacted ones. This approach is crucial for assessing the model’s ability to adapt to previously unseen damage conditions.
Accordingly, this study makes three key contributions:
A new prognostic performance measure tailored to the AHSMM model.
A new case study for prognostics through an experimental campaign with multiple impacts in different locations in GFRP composite structures.
A conditional reliability analysis for direct evaluation of RUL prediction uncertainty from both AHSMM and LSTM models.
The remainder of the article is structured as follows: “Methodologies” section presents the basic principles of the methodologies employed for the RUL estimations and conditional reliability, and “Case study: unexpected impact events” section describes the case study that is used to evaluate the methodologies. “Results and discussion” section is split into two sections, presenting the main results of the article and a discussion about the need for robust prognostics. Finally, the conclusions are presented in “Conclusions” section.
Methodologies
This section outlines the methodology used to develop the AHSMM model and the LSTM model, which is used as a comparison point for the RUL and conditional reliability estimations. The AHSMM is an adaptive extension of the well-established hidden semi-Markov model (HSMM), 39 while the adaptive extension of the model is based on the methodology used in the study by Eleftheroglou et al. 37 The LSTM is based on the architecture presented in the study by Asif et al. 40
Adaptive hidden semi-Markov model
The HSMM is an extension of the hidden Markov model (HMM) 41 with the inclusion of a variable sojourn time for each damage state. For each damage state, a number of observations are emitted depending on the sojourn time of the given damage state. For this study, the sojourn time of each damage state is given by a Weibull distribution. The parameters of the HSMM are described below:
N: number of damage states. The set of damage states is denoted as
M: number of distinct observation symbols. The observation process is modeled with a Gaussian distribution, therefore, the space consists of all the real numbers.
λ: transition rate function. It describes the degradation process that follows a rate function, denoted by λ. For this study, the λ function corresponds to the Weibull distribution, which is commonly used in reliability theory.
To train the HSMM, the Expectation–Maximization algorithm is used. Details of this procedure can be found in the study by Shun-Zheng et al. 42 However, some assumptions are made to use this model for prognostics.
Initial state: the model starts in the first damage state. Therefore, the engineering system is assumed to always start as good as new.
Transitions: only left-to-right transitions are allowed. This means that the model can only transit to a neighboring state or stay in the current state. Therefore, it is assumed that the engineering system does not recover health and that it has to go through all the states before reaching failure.
Final state: the final state of the HSMM is observable, and it represents failure. In the final state, only one observation value is emitted.
Finally, Figure 2 presents a representation of the HSMM incorporating the assumptions made for prognostics.

HSMM representation. HSMM: hidden semi-Markov model.
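The three prognostic assumptions above (start in the first damage state, left-to-right transitions only, a single observable failure state) can be illustrated by simulating one degradation history. The following is a minimal sketch, not the authors' implementation: the function name `simulate_hsmm`, the per-state Weibull and Gaussian parameters, and the failure observation value are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_hsmm(n_states, weibull_shape, weibull_scale, means, stds, failure_obs, rng):
    """Simulate one history from a left-to-right HSMM (illustrative sketch).

    States 0 .. n_states-2 are hidden damage states with Weibull sojourn
    times and Gaussian observations; state n_states-1 is the observable
    failure state and emits exactly one value. All parameters are assumed.
    """
    states, observations = [], []
    for s in range(n_states - 1):
        # Sojourn time: scaled Weibull draw, rounded up so that every
        # damage state is visited and emits at least one observation.
        sojourn = int(np.ceil(weibull_scale[s] * rng.weibull(weibull_shape[s])))
        sojourn = max(sojourn, 1)
        states.extend([s] * sojourn)
        observations.extend(rng.normal(means[s], stds[s], size=sojourn))
    # Final state: observable failure, a single emitted value.
    states.append(n_states - 1)
    observations.append(failure_obs)
    return np.array(states), np.array(observations)

# Hypothetical four-state example (three damage states plus failure).
rng = np.random.default_rng(0)
states, obs = simulate_hsmm(
    n_states=4,
    weibull_shape=[2.0, 2.0, 2.0],
    weibull_scale=[20.0, 15.0, 10.0],
    means=[0.0, 1.0, 2.0],
    stds=[0.1, 0.1, 0.1],
    failure_obs=5.0,
    rng=rng,
)
print(len(states), states[0], states[-1])
```

Because the loop visits the states in order and never returns to an earlier one, the simulated state sequence is non-decreasing, which is the "no health recovery" assumption expressed in code.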
As previously mentioned, the AHSMM is an adaptive extension of the HSMM. The overall adaptation process is illustrated as a flowchart in Figure 3. During the testing phase, the adaptation is triggered once a transition from damage state

Flowchart of the adaptation process.
A ratio between
For RUL prediction, a time-dependent prognostic measure is used, following the framework of previous studies.39,43 This formulation provides a probability distribution for the RUL that evolves with time spent in the current damage state.
The variables in Equations (4)–(6) are defined as follows:
τ: Time already spent in the current state i; thus,
The resulting RUL expression gives a probability distribution per time step, and 95% confidence intervals are computed using the cumulative distribution function (CDF). Since the AHSMM is a stochastic model, the uncertainty reflected in the pdf corresponds to the aleatoric uncertainty with the added uncertainty propagation that comes from the prognostic measure. Aleatoric uncertainty refers to the inherent variability or randomness in a system or process that cannot be reduced, even with more information or data.
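As an illustration of how a 95% interval can be read off a CDF, the sketch below takes a discretized RUL distribution at one time step and returns its mean together with the 2.5% and 97.5% quantiles. It assumes an equal-tailed interval and a discrete grid of RUL values; the function name `rul_interval` and the example distribution are hypothetical and do not reproduce the AHSMM prognostic measure itself.

```python
import numpy as np

def rul_interval(rul_grid, pmf, level=0.95):
    """Mean and equal-tailed interval of a discretized RUL distribution."""
    rul_grid = np.asarray(rul_grid, dtype=float)
    pmf = np.asarray(pmf, dtype=float)
    pmf = pmf / pmf.sum()                 # normalize defensively
    cdf = np.cumsum(pmf)
    alpha = (1.0 - level) / 2.0
    # First grid point whose cumulative probability reaches each quantile.
    lo_idx = min(np.searchsorted(cdf, alpha), len(cdf) - 1)
    hi_idx = min(np.searchsorted(cdf, 1.0 - alpha), len(cdf) - 1)
    mean = float(np.sum(rul_grid * pmf))
    return mean, float(rul_grid[lo_idx]), float(rul_grid[hi_idx])

# Hypothetical RUL distribution: a discretized bell shape, in cycles.
grid = np.arange(0, 60_000, 500)
weights = np.exp(-0.5 * ((grid - 30_000) / 8_000) ** 2)
mean, lower, upper = rul_interval(grid, weights)
print(mean, lower, upper)
```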
Conditional reliability is defined as the probability that the system remains operational at a future time
Long short-term memory
The LSTM has been widely used in prognostics, given its high accuracy in predicting RUL, since it can learn temporal features in time-series data. The LSTM is a supervised model, and it maps the HI values to the labels (regression), which are RUL values in the prognostics case.
Figure 4 presents the architecture used for this article. The architecture uses only two LSTM layers because little training data and only one feature are available. The model’s hyperparameters are shown in Table 1 and are chosen based on a random search using Keras Tuner. The input for the LSTM consists of windows of 10 samples. This choice is driven by the need for the LSTM to receive inputs of consistent length and to be utilized in an online manner, which requires the data to be windowed; therefore, the dense layer needs to have 10 units.
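The windowing step can be sketched as follows. This is an assumption-laden illustration rather than the authors' preprocessing code: it assumes overlapping windows with a stride of one sample, a single HI feature, and one RUL target per sample in the window (one reading of why the dense layer has 10 units). The name `make_windows` is hypothetical.

```python
import numpy as np

def make_windows(hi, rul, window=10):
    """Slice one HI history and its RUL labels into fixed-length windows.

    Returns X with shape (samples, window, 1), the layout expected by
    recurrent layers, and y with shape (samples, window).
    """
    hi = np.asarray(hi, dtype=float)
    rul = np.asarray(rul, dtype=float)
    if len(hi) != len(rul):
        raise ValueError("hi and rul must have the same length")
    if len(hi) < window:
        raise ValueError("history is shorter than one window")
    view = np.lib.stride_tricks.sliding_window_view
    X = view(hi, window)[..., np.newaxis].copy()   # add the feature axis
    y = view(rul, window).copy()
    return X, y

# Hypothetical history of 25 samples with a linearly decreasing RUL label.
hi = np.linspace(0.0, 1.0, 25)
rul = np.arange(24, -1, -1)
X, y = make_windows(hi, rul)
print(X.shape, y.shape)
```

In online use, only the most recent window would be passed to the trained network at each new sample.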

LSTM architecture for prognostics. LSTM: long short-term memory.
LSTM hyperparameters.
MSE, mean squared error.
To account for uncertainty, the Monte Carlo (MC) dropout technique is utilized, which accounts for epistemic uncertainty. 44 Epistemic uncertainty is the uncertainty that arises from a lack of knowledge or information about a system or process, and can potentially be reduced with more data or better understanding. For this study, the number of forward passes used to account for uncertainty is set to 100, a value commonly used in the literature. As for the AHSMM, the conditional reliability for the LSTM is calculated as shown in Equation (7). It is worth noting that the pdf of RUL for the LSTM is defined as a Gaussian distribution determined from the samples taken with the MC dropout technique.
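The step from MC dropout samples to a reliability value can be sketched as below. Two assumptions are made and should be checked against Equation (7): the Gaussian is fitted by the sample mean and standard deviation, and conditional reliability is taken as the probability that the RUL exceeds a given horizon (the survival function of the RUL pdf). The 100 forward passes are replaced here by random draws, since no trained network is available in this sketch; all names and numbers are hypothetical.

```python
import math
import numpy as np

def gaussian_from_samples(samples):
    """Fit a Gaussian to RUL samples (for example, MC dropout outputs)."""
    samples = np.asarray(samples, dtype=float)
    return float(samples.mean()), float(samples.std(ddof=1))

def conditional_reliability(mu, sigma, horizon):
    """P(RUL > horizon) under N(mu, sigma^2); assumed reading of Equation (7)."""
    z = (horizon - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 - math.erf(z))

# Stand-in for 100 stochastic forward passes of a dropout-enabled network.
rng = np.random.default_rng(0)
passes = rng.normal(loc=20_000.0, scale=3_000.0, size=100)

mu, sigma = gaussian_from_samples(passes)
for horizon in (10_000, 20_000, 30_000):   # cycles ahead of the current time
    print(horizon, conditional_reliability(mu, sigma, horizon))
```

The same `conditional_reliability` logic would apply to the AHSMM by replacing the Gaussian survival function with one minus the CDF of its RUL distribution.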
Case study: unexpected impact events
The proposed model is evaluated in a case study involving (multiple) impacts during constant amplitude tensile fatigue tests. An experimental campaign is conducted on GFRP specimens subjected to both fatigue loading and multiple impacts. The case study focuses on creating unexpected events that can occur during operation and evaluates the capabilities of the model to handle short-term unexpected events similar to the study by Eleftheroglou et al. 37
Specimen specifications and experiment definition
GFRP coupons were manufactured from SL90–T0/EGL, with a nominal length of 400 mm and a width of 45 mm. An eight-ply quasi-isotropic layup of

GFRP coupon depiction and dimensions. GFRP: glass fiber-reinforced polymer.
The fatigue tests comprised constant amplitude tension-tension fatigue where the load was fixed at 50% of the average tensile failure load, the load ratio at 0.1, and the frequency at 4 Hz. The tests were run in an MTS hydraulic machine with a 250 kN load capacity. The fatigue was interrupted every 500 cycles, for 2 s, allowing pictures to be taken at the maximum load so that damage progression and failure could be inspected visually during postprocessing. Additionally, the entirety of the fatigue loading was monitored using Vallen VS900-M AE sensors, paired with a four-channel AMSY-6 acquisition system and an external pre-amplifier of 34 dB. To reduce the external noise and hits not associated with degradation, a 50-dB threshold was set.
Two types of tests were conducted: fatigue testing (until failure) of unimpacted specimens, and fatigue testing with in situ impacts at various time intervals, while the specimens were loaded with the mean value of the fatigue loading. Impacts were performed via an in-house gas gun, which is shown in Figure 6. As shown in Figure 7, the hemispherical aluminum projectile is attached to a plastic cylinder that matches the barrel’s radius. The length of the projectile (including the aluminum tip) was 35 mm, and the diameter was 25 mm. The total weight of the projectile is approximately 16 g. The purpose of this shape and design is to reduce spinning effects inside the barrel and ensure a precise impact location. The impact pressure ranged from 1.2 to 1.5 bar. To protect the AE sensors, they were removed during the impact and re-attached before restarting the fatigue.

Impact gas gun setup.

Projectile.
The goal of the impact(s) is to simulate unexpected events during fatigue testing, which can commonly occur during the operation of a structure. These events affect structural integrity and the degradation process, resulting in changes in the fatigue life. Additionally, the impacts present a challenge that is used to evaluate the robustness of the models and their ability to adapt their predictions to the altered degradation process. The fatigue life of unimpacted specimens is provided in Table 2, while Table 3 specifies the impact conditions and reports the lifetimes for impacted specimens. Failure time is derived during postprocessing; thus, lifetimes are reported as the time the picture is captured. The total collapse was the result of the specimens snapping in two, while from visual inspection, failure is considered at the time separation initiates, which is accompanied by a significant stiffness drop observed from the load-displacement data.
Unimpacted specimen fatigue lifetime.
Impacted specimen information and lifetime.
Impacts were performed at the location of the hole unless specified otherwise. In Table 3, simultaneous impacts are denoted with “×” (e.g., “10k × 2” indicates two consecutive impacts at cycle 10,000), whereas sequential impacts are separated by commas (e.g., “5k, 10k, 15k” represent impacts occurring at 5000, 10,000, and 15,000 cycles, respectively). The total cycles column reflects the lifetime of each specimen under the specified conditions.
As observed in Tables 2 and 3, impact damage does not necessarily reduce fatigue life; in some cases, impacted specimens outlast unimpacted ones. This outcome may result from stress redistribution around the drilled hole in the quasi-isotropic layup. Additionally, impact-induced delaminations or fiber breakage could modify local stress fields, potentially reducing stress concentrations.
Data preprocessing and feature extraction
The AE data collected during the tests were used to extract informative features for the RUL estimation. In the first step the low-level AE features, including amplitude, duration, rise time, energy, counts, and so on, are windowed using a nonoverlapping 500-cycle window since it has been demonstrated that cumulative features are not always representative of the degradation and suffer from poor prognosability (dependent on running time). 25 The window size was chosen arbitrarily based on the cycles that the photos were taken. In the second step, since the low-level features are not sensitive enough to degradation, statistical features were calculated for each window in both the time and frequency domains in an attempt to identify degradation-sensitive features.45,46 The mathematical formulation of these statistical features is demonstrated in Table 4, where
Statistical features.
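The two preprocessing steps (nonoverlapping 500-cycle windows, then per-window statistics) can be sketched as follows. The statistics computed here (mean, standard deviation, RMS, peak) are generic examples and are not claimed to match the entries of Table 4; the function name `windowed_features` and the sample hits are hypothetical.

```python
import numpy as np

def windowed_features(cycles, values, window=500):
    """Group AE hits into nonoverlapping cycle windows and summarize each.

    cycles: fatigue cycle at which each hit was recorded.
    values: one low-level AE feature per hit (for example, amplitude).
    """
    cycles = np.asarray(cycles)
    values = np.asarray(values, dtype=float)
    window_idx = (cycles // window).astype(int)
    rows = []
    for w in np.unique(window_idx):
        v = values[window_idx == w]
        rows.append({
            "window_start": int(w * window),
            "n_hits": int(v.size),
            "mean": float(v.mean()),
            "std": float(v.std()),
            "rms": float(np.sqrt(np.mean(v ** 2))),
            "peak": float(np.max(np.abs(v))),
        })
    return rows

# Hypothetical hits: (cycle, amplitude in dB).
rows = windowed_features([0, 100, 499, 500, 900], [51.0, 52.0, 53.0, 54.0, 56.0])
for row in rows:
    print(row)
```

Windows that contain no hits are simply absent from the output in this sketch; a real pipeline would need a rule for such gaps.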
Apart from the statistical features in the time and frequency domains, principal component analysis (PCA) is performed in both domains, including all primary (low-level AE features recorded by the AE system) and secondary features (statistical features extracted in the previous step). The PCA model, similar to the study by Loukopoulos et al., 47 is created using only a healthy subset of the data, which spans the period up to the first impact, or the first 10,000 cycles if no impact is performed (chosen arbitrarily), and then the entire dataset is transformed into the principal component space. The cumulative squared reconstruction residual (Q) and Hotelling’s T² are calculated and, to improve prognosability, the natural logarithm is applied.
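A minimal sketch of this PCA-based feature construction is given below, using only NumPy. It fits the PCA on a healthy subset, projects the full history, and computes the squared reconstruction residual Q and Hotelling's T² per window before taking the natural logarithm. The number of retained components, the standardization step, and the synthetic data are assumptions; any accumulation of Q over time is left out.

```python
import numpy as np

def fit_pca(healthy, n_components):
    """Fit a PCA model on healthy data (rows: windows, columns: features)."""
    mean = healthy.mean(axis=0)
    std = healthy.std(axis=0, ddof=1)
    std = np.where(std == 0, 1.0, std)            # guard constant features
    Z = (healthy - mean) / std
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components].T              # (features, k)
    variances = S[:n_components] ** 2 / (len(healthy) - 1)
    return mean, std, components, variances

def q_and_t2(X, mean, std, components, variances):
    """Squared reconstruction residual Q and Hotelling's T^2 per row of X."""
    Z = (X - mean) / std
    scores = Z @ components
    residual = Z - scores @ components.T
    q = np.sum(residual ** 2, axis=1)
    t2 = np.sum(scores ** 2 / variances, axis=1)
    return q, t2

# Synthetic stand-in: 50 healthy windows followed by 50 drifting windows.
rng = np.random.default_rng(0)
healthy = rng.normal(size=(50, 5))
degraded = rng.normal(size=(50, 5)) + np.linspace(0, 3, 50)[:, None]
history = np.vstack([healthy, degraded])

model = fit_pca(healthy, n_components=2)
q, t2 = q_and_t2(history, *model)
log_q, log_t2 = np.log(q), np.log(t2)             # log applied as in the text
print(log_q[:3], log_t2[:3])
```

Because the model only sees healthy data, both statistics tend to grow as the history departs from the healthy subspace, which is what makes them candidate degradation features.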
The suitability of the extracted features was evaluated using the sum of Spearman rank monotonicity (Equation (8)), prognosability (Equation (9)), and trendability (Equation (10)). 48
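For orientation, the three criteria can be sketched with definitions that are common in the feature-selection literature. These are assumptions and may differ in detail from Equations (8)–(10): monotonicity is taken as the mean absolute Spearman correlation between each history and time, prognosability as the exponential of the negative ratio between the spread of final values and the mean feature range, and trendability as the smallest absolute correlation between resampled histories.

```python
import numpy as np

def _ranks(x):
    # Ordinal ranks; ties are broken by order of appearance.
    return np.argsort(np.argsort(x)).astype(float)

def spearman(x, y):
    return float(np.corrcoef(_ranks(x), _ranks(y))[0, 1])

def monotonicity(histories):
    """Mean absolute Spearman correlation between feature and time."""
    return float(np.mean([abs(spearman(h, np.arange(len(h)))) for h in histories]))

def prognosability(histories):
    """exp(-std of final values / mean absolute start-to-end change)."""
    finals = np.array([h[-1] for h in histories], dtype=float)
    spans = np.array([abs(h[-1] - h[0]) for h in histories], dtype=float)
    return float(np.exp(-np.std(finals) / np.mean(spans)))

def _resample(h, n):
    h = np.asarray(h, dtype=float)
    return np.interp(np.linspace(0, 1, n), np.linspace(0, 1, len(h)), h)

def trendability(histories, n=100):
    """Smallest absolute Pearson correlation between any two histories."""
    res = [_resample(h, n) for h in histories]
    corrs = [abs(np.corrcoef(res[i], res[j])[0, 1])
             for i in range(len(res)) for j in range(i + 1, len(res))]
    return float(min(corrs))

def fitness(histories):
    """Sum of the three criteria, as used to rank candidate features."""
    return monotonicity(histories) + prognosability(histories) + trendability(histories)

# Hypothetical feature histories of different lengths.
rng = np.random.default_rng(0)
histories = [np.linspace(0, 1, n) + rng.normal(0, 0.05, n) for n in (60, 80, 100)]
print(monotonicity(histories), prognosability(histories), trendability(histories))
```

Each criterion lies between 0 and 1 under these definitions, so the summed score ranges from 0 to 3.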
As demonstrated in Figure 8, the raw features (amplitude, duration, frequency, and rise-time/amplitude) consistently exhibit poor performance across all evaluation metrics. Among the extracted features,

Comparison of raw and top two best features based on monotonicity, trendability, and prognosability metrics.
It is also worth highlighting that when compared to prognostic feature optimization studies,34,46,49 the prognostic feature shows only average performance in the combined metrics. Further feature optimization was deliberately not pursued, in order to demonstrate that the proposed prognostic model can provide good results even with average prognostic features; optimization of the degradation feature is outside the scope of the article.
The behavior of

Degradation histories of train and test set using feature
In addition, Figure 10 shows the degradation histories of selected test specimens, along with the corresponding times of impact. Impact times are indicated with circles, and a larger circle denotes that two impacts occurred within the same cycle. For better visual clarity, only four representative degradation histories are shown. It can be observed that the impact event does not necessarily alter the behavior of the degradation feature, providing a clear indication of damage initiation.

Test degradation histories with marked times of impacts.
Results and discussion
The proposed AHSMM was trained exclusively on data from specimens that were not subjected to in situ impacts. As a result, unexpected impact events were excluded from the training process and only introduced during testing. The same training strategy was applied to the LSTM model, which is used here for performance comparison. A Q-Q plot of the RUL predictions from both models is shown in Figure 11. The plot reveals that both models exhibit reduced accuracy during the early life cycles. However, as the specimens approach end-of-life, the AHSMM aligns more closely with the ideal distribution than the LSTM, highlighting the benefits of its adaptiveness. Given this performance distinction, only the RUL predictions from the AHSMM are presented in the following analysis for clarity.

Q-Q plot of AHSMM and LSTM predictions for all test samples. AHSMM: adaptive hidden semi-Markov model; LSTM: long short-term memory.
To illustrate the model’s performance, three RUL plots are presented in Figure 12, corresponding to the specimens with the shortest, average, and longest fatigue life within the test set. The comprehensive summary of results across all test specimens is provided in Tables 5, 6, and 7.

RUL predictions with the AHSMM: (a) left outlier, (b) inlier, and (c) right outlier. (a) RUL prediction for the test specimen with the shortest lifetime (SP-19). (b) RUL prediction for the test specimen with average lifetime (SP-16). (c) RUL prediction for the test specimen with the longest lifetime (SP-24). RUL: remaining useful life; AHSMM: adaptive hidden semi-Markov model.
RMSE value comparison between AHSMM and LSTM.
RMSE value comparison between AHSMM and LSTM for the second half of the lifetime.
Coverage comparison between AHSMM and LSTM.
Unexpected events typically, but not always, lead to a reduction in the overall fatigue life of a specimen. This effect is particularly evident in Figure 12(a), which corresponds to the specimen with the shortest lifetime. Initially, the model significantly overestimates the true RUL; however, after adaptation, the AHSMM refines its predictions to accurately estimate the true RUL. Meanwhile, the specimen with the longest lifetime (Figure 12(c)) shows consistency between the predicted and actual RUL, as its lifespan falls within the range learned from the training data. Similarly, the predicted RUL for the specimen with an average lifetime (Figure 12(b)) aligns closely with the true RUL during the latter part of its life cycle.
Two key evaluation metrics are used: root mean squared error (RMSE) and coverage. RMSE quantifies the error between the true and predicted mean RUL, while coverage assesses UQ by measuring the proportion of true RUL values that fall within the predicted confidence intervals. Specifically, if
Table 5 compares RMSE values between AHSMM and LSTM. The AHSMM achieves lower RMSE values for all tested specimens except SP-21, demonstrating its superior predictive performance. The adaptive module within AHSMM significantly enhances accuracy as new data become available. In particular, when evaluating performance over the second half of the lifetime of each specimen (Table 6), the AHSMM shows a substantial reduction in RMSE, highlighting the model’s capability to refine its estimates as more degradation data are collected.
Coverage results, shown in Table 7, further emphasize the robustness of AHSMM. The AHSMM achieves an average coverage of 0.79, significantly outperforming the LSTM, which only achieves 0.17. This indicates that the AHSMM provides more reliable UQ, ensuring that the true RUL values are better captured within the confidence intervals.
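The two metrics can be written compactly as follows. This is a generic sketch of RMSE and interval coverage as described in the text, with inclusive interval bounds assumed; the function names and the toy numbers are illustrative and are not taken from Tables 5 to 7.

```python
import numpy as np

def rmse(true_rul, pred_mean):
    """Root mean squared error between true and predicted mean RUL."""
    true_rul = np.asarray(true_rul, dtype=float)
    pred_mean = np.asarray(pred_mean, dtype=float)
    return float(np.sqrt(np.mean((true_rul - pred_mean) ** 2)))

def coverage(true_rul, lower, upper):
    """Fraction of time steps whose true RUL lies inside [lower, upper]."""
    true_rul = np.asarray(true_rul, dtype=float)
    inside = (true_rul >= np.asarray(lower)) & (true_rul <= np.asarray(upper))
    return float(inside.mean())

# Toy example with four time steps (values in arbitrary cycle units).
true_rul = [10, 8, 6, 4]
pred_mean = [12, 8, 5, 4]
lower = [9, 9, 5, 0]
upper = [11, 10, 7, 3]
print(rmse(true_rul, pred_mean), coverage(true_rul, lower, upper))
```

Restricting both inputs to the second half of a specimen's history gives the late-life RMSE variant reported separately in the text.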
To further analyze model performance from a conditional reliability perspective, Figure 13 presents reliability comparisons between AHSMM and LSTM at different time steps. At the beginning of the degradation cycle (Figure 13(a)), the conditional reliability estimates for both models exhibit higher uncertainty. However, as more data become available, AHSMM adapts effectively, providing more precise reliability estimates, as shown in Figure 13(b) and (c). In contrast, LSTM struggles to maintain accurate reliability estimates.

Conditional reliability comparison between AHSMM and LSTM for SP-17. (a) Conditional reliability at t = 0 cycles. (b) Conditional reliability at t = 30,000 cycles. (c) Conditional reliability at t = 45,000 cycles. AHSMM: adaptive hidden semi-Markov model; LSTM: long short-term memory.
Discussions
This study underscores the need for robust prognostic models capable of handling unexpected operational conditions. The results demonstrate that state-of-the-art machine learning models like LSTM suffer a significant decrease in prognostic performance when unexpected events occur, particularly for specimens with significantly shorter lifetimes than those observed in the training set. This limitation is inherent in models that rely heavily on training data distributions, which may not adequately account for rare or unexpected events.
The need for robust and adaptive prognostic models becomes even more apparent in applications operating under harsh and dynamic environments, such as aerospace or defense applications. For example, civil aircraft typically operate under predefined conditions for each flight cycle, but unexpected weather phenomena or bird strikes can alter these conditions, requiring real-time adaptation of RUL predictions to enhance condition-based maintenance planning. This requirement is even more critical for military aircraft, where rapid maneuvering and unpredictable conditions highlight the need for real-time awareness of structural health and mission readiness.
To establish a robust framework for PHM, adaptive models such as AHSMM are essential. A crucial component of such a framework is UQ, which is categorized into aleatoric and epistemic uncertainty. Aleatoric uncertainty stems from the inherent randomness in data, while epistemic uncertainty arises due to a lack of knowledge or model limitations. 50 As discussed in “Methodologies” section, AHSMM effectively captures and propagates aleatoric uncertainty, whereas LSTM models address only epistemic uncertainty using techniques such as MC Dropout.
Results from AHSMM demonstrate that the confidence intervals generated by the model generally encompass the true RUL, although with wide intervals. This aligns with findings in other stochastic prognostic models, highlighting the importance of uncertainty management. Effective uncertainty management 51 can minimize the impact of uncertainties on RUL predictions, thereby improving decision-making robustness. Future research should explore advanced techniques for uncertainty management to refine confidence intervals and enhance the accuracy of RUL predictions, ultimately strengthening the robustness of PHM frameworks.
Conclusions
In this article, an AHSMM is proposed to address the challenging task of RUL prediction of composite structures subjected to unexpected impact events. A case study was conducted using GFRP open-hole coupons subjected to tensile fatigue loading. The model was trained on data from unimpacted specimens and evaluated on cases involving multiple in situ impacts. This approach allowed for a realistic assessment of the model’s adaptability to unforeseen damage scenarios, addressing a critical gap in prognostics research.
The AHSMM successfully predicted the RUL of impacted specimens with an RMSE of 17k cycles. Notably, the RMSE improved to 12k cycles when evaluating the second half of the specimens’ lifetimes. This reduction underscores the model’s capacity to refine its predictions as additional data become available. In contrast, the LSTM model showed a higher RMSE of 23k cycles, which decreased to 20k cycles when focusing on the second half of the fatigue life. From a UQ perspective, the AHSMM outperformed the LSTM model by achieving higher coverage. This highlights that the AHSMM not only enhances prediction accuracy but also offers more reliable confidence intervals.
The comparative analysis between AHSMM and LSTM demonstrated the advantages of an adaptive approach in capturing the degradation process beyond simple regression on SHM data. While LSTM models showed sensitivity to data variations and exhibited limited generalization, AHSMM provided more consistent RUL predictions with well-calibrated reliability estimates. Furthermore, the conditional reliability analysis highlighted the importance of integrating UQ into prognostic models, particularly for decision-making in safety-critical applications.
The findings highlight the necessity of developing robust prognostic models that can support maintenance planning while maintaining sufficient reliability. However, as a data-driven model, AHSMM still relies on the quality of SHM data. Despite its robustness, noisy or highly variable data can significantly impact performance. Another concern is the wide confidence intervals resulting from aleatoric uncertainty, which could lead to conservative maintenance decisions. Since decision-makers often use the lower bound of confidence intervals for planning, managing uncertainty more effectively is essential for improving the model’s practical applicability.
Overall, the results underscore the necessity of developing prognostic models that can adapt to changing operational conditions without requiring extensive retraining. The findings contribute to advancing PHM frameworks for composite materials, particularly in aerospace and other high-reliability industries. Future work will focus on expanding the model’s applicability to more complex structures and operational scenarios, including variable loading conditions and real-time adaptation strategies.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was partially funded by the Netherlands Ministry of Defence through the Exploratory Research Program. The code and full experimental data will not be made publicly available; however, the discretized experimental data can be shared upon request by contacting the corresponding author.
