Abstract
In this work, we demonstrate a data-based Machine Learning (ML) framework that can capture the strain evolution during Shape Memory Alloy (SMA) actuation with a minimum of four state variable inputs. These inputs describe the thermomechanical stress-strain state of the SMA and changes in the stress state during thermal actuation. Furthermore, we identify the physics-based constraints needed to incorporate partial phase transformation and make predictions beyond the training regime. The ML framework uses a recurrent neural network (RNN) model to capture the nonlinear strain rate variation during phase transformation. Physics-based constraints are introduced in the RNN model to include the activation of thermoelasticity while unloading away from transformation surfaces within the thermoelastic domain. The framework is trained using experimental thermal cycle responses of a NiTiHf SMA at different stress levels. The framework can then accurately predict actuation responses undergoing complete and partial phase transformations, and also extrapolate responses for new thermal cycling that involves complex thermomechanical loading paths. An uncertainty quantification analysis using ensemble training of the NN model is also presented to show the variability of the framework.
Keywords
1. Introduction
Shape Memory Alloys (SMAs) are unique active materials with the ability to recover from large deformations through solid-solid phase transformation between austenite and martensite phases. The austenite phase is stable at high temperatures, and the martensite phase is stable at lower temperatures. Depending on the stress state, the martensite phase can be present either in a twinned or detwinned state. At zero stress, the martensite is in a twinned state, and at non-zero stress, it is in the detwinned state. Phase transformation can be induced by applying mechanical, thermal or combined thermo-mechanical loads (Lagoudas, 2008). The transformation between austenite and detwinned martensite results in the generation/recovery of inelastic strain, called transformation strain. When the SMA is constrained, the recovery of transformation strain can generate high stress and produce a higher work output compared to other active materials. This makes SMAs ideal for designing compact and lightweight solid-state actuators (Benafan et al., 2014; Jani et al., 2014; Simiriotis et al., 2021; Stroud and Hartl, 2020). SMAs are used in various applications in fields such as biomedical, aerospace, industrial, and wind energy (Antonucci et al., 2021; Balasubramanian et al., 2021; Bansiddhi et al., 2008; Calkins and Mabe, 2010; Costanza and Tata, 2020; Elahinia et al., 2012; Hartl and Lagoudas, 2007; Hernandez et al., 2018; Karakalas et al., 2020; Morgan, 2004; Rajput et al., 2022; Stroud and Hartl, 2020).
SMAs exhibit a variety of hysteresis behaviors depending on the thermo-mechanical loading path they undergo (Lagoudas, 2008). The Shape Memory Effect (SME) is observed when an SMA is deformed in the twinned martensitic phase, unloaded, and then heated above the austenite finish temperature: the original shape is recovered as the material transforms back to the parent austenitic phase. Pseudoelastic behavior is observed as the recovery of strain through stress-induced phase transformation at temperatures where austenite is stable. An actuation response is produced when SMAs at constant stress are taken through a cooling-heating cycle, resulting in the recovery of transformation strain. Depending on the loading conditions, SMAs can show partial phase transformation behavior when the thermal range is inadequate for a full transformation (Karakalas et al., 2020; Wang et al., 2021a, 2021b; You and Guo, 2022). A complete transformation cycle is obtained when an SMA is cooled below the martensitic finish temperature and heated above the austenitic finish temperature. Such cycles with complete phase transformations are referred to as “major cycle responses” and the responses with partial phase transformations due to insufficient cooling or heating are referred to as “minor cycle responses.”
Predicting the actuation response of SMAs efficiently and accurately is key for designing SMA-based actuators and sensors. The minor cycle responses in particular need special modeling treatment, as they involve a complex history of microstructural states with mixed phases. For example, a minor cycle initiated on the cooling branch of the major cycle starts from a microstructure in which martensite is nucleating in the austenite phase, whereas a minor loop initiated on the heating branch starts from the opposite microstructural state. The minor cycle responses are of considerable relevance in aerospace, wind turbine, and control applications, where the SMA actuator can undergo partial transformation due to partial reversal of the loading. Many works have addressed the modeling of partial phase transformation, including phenomenological approaches (Alsawalhi and Landis, 2022; Brocca et al., 2002; Chemisky et al., 2011; Karakalas et al., 2019a, 2019b; Karakalas and Lagoudas, 2020; Lagoudas et al., 2006; Müller and Bruhns, 2006; Paiva and Savi, 2006; Savi and Paiva, 2005; Scalet et al., 2021; Wang et al., 2021a), empirical modeling efforts (Adeodato et al., 2022; Gorbet et al., 1998; Khan and Lagoudas, 2002), and models for control schemes (Jayender et al., 2008; Williams and Elahinia, 2008). Advanced phenomenological models have significantly improved the modeling of 1D SMA hysteretic thermomechanical behavior (Auricchio et al., 2007; Brinson, 1993; DeCastro et al., 2007; Shaw and Churchill, 2009; Shirani et al., 2017; Shu et al., 1997; Sittner et al., 2000; Song, 2020), particularly through the use of internal state variables that accurately capture the stress-strain-temperature responses. These models offer strong physical interpretability and have been instrumental in enhancing our understanding of SMA behavior. However, they may still require specific modeling assumptions and can be difficult to calibrate.
In contrast, data-driven Machine Learning (ML) models provide an alternative approach by leveraging experimental data to identify governing relationships without predefined constitutive assumptions. While ML models may not inherently offer the same physical interpretability as phenomenological models, they present advantages in flexibility and adaptability to complex material responses.
Data-driven ML approaches have become popular because they can model complex behaviors with good accuracy and provide a computationally cost-effective alternative to traditional physical models. Methods such as neural networks, which can be trained directly on experimental responses, can efficiently predict the behavior of a specific material sample. There has been considerable interest in neural network-based models of SMA behavior for actuation control applications, where a feedback neural network model is used for the time-series modeling of SMA actuation in the control algorithm (Asua et al., 2010; Damle et al., 1995; Hmede et al., 2022; Narayanan and Elahinia, 2016; Nikdel and Badamchizadeh, 2015; Song et al., 2003; Tai and Ahn, 2012; Wang and Song, 2014). In these works, the neural network model is continuously trained in situ from the displacement measurements, and the predictions are limited to the SMA actuation in the immediate time step. Neural network-based approaches have been proposed for the modeling of hysteresis behavior (Kilicarslan et al., 2011; Li et al., 2017; Mohammadi Nia and Moradi, 2022), actuation (Asua et al., 2010; Narayanan and Elahinia, 2014; Song et al., 2003), and cyclic actuation evolution (Owusu-Danquah et al., 2022) for phase transforming materials. These past studies do not introduce additional physics-based constraints or formulations for phase transforming materials, and as a result they are limited to modeling the hysteresis behavior present in the training data.
The focus of this work is to model actuation responses in order to predict the extent of actuation displacement, which is important for the design of SMA-based actuators. NiTiHf-based High Temperature SMAs, which are ideal for many actuation applications because of their high transformation temperatures (Karaca et al., 2014), are chosen as the material system. A recurrent static neural network model is proposed to capture the hysteresis response of these materials with a minimal number of state variables. In a static neural network, learning takes place only during the training phase and does not evolve dynamically with feedback. The model is trained using the major cycle responses, with physics-based constraints at the initiation of minor cycle branches from the major cycle to extend its capabilities to predicting the minor cycle responses. A comparison of results from neural networks with and without such constraints showed a visible improvement in the prediction of minor cycle behavior. Therefore, to correctly predict all possible states using limited training data, physics laws need to be imposed. Building on the work of Liu et al. (2023), which demonstrated that as few as three internal variables are sufficient to describe the evolution of elasto-viscoplastic behavior using a recurrent neural network, this study validates the finding by simulating the complex hysteretic response of SMAs. Unlike Liu et al. (2023), where internal variables were identified from microscopic responses but lacked physical interpretability, the present work shows that SMA actuation can be accurately learned using physically measurable state variables (temperature, stress, and macroscopic strain) without explicitly predicting internal variables. The model can therefore be used for real-time predictions from sensor measurements, enhancing its practical applicability.
The authors have also successfully applied the same algorithm to capture magnetic shape memory alloy actuation (Tian et al., 2024) and to evaluate internal stresses in real time in a NiTi SMA torque tube actuation application.
The paper is structured as follows. Section 2 gives a brief theory of SMA actuation behavior, explaining the major and minor cycle responses. Section 3 discusses the experimental responses of the chosen NiTiHf SMA. Section 4 presents the details of the data-based model used in the study. Section 5 discusses the capability and predictions of the resulting model. Section 6 presents the key conclusions of the work.
2. SMA actuation behavior
2.1. Major cycle actuation response in SMA
A typical major cycle actuation behavior of SMA is as shown in Figure 1(a). The material is in the pure austenite phase at high temperature, represented by point A in the diagram, and in the pure martensite phase at low temperature, represented by point B in the diagram. The heating and cooling of the SMA between states A and B at a constant stress produces a major cycle actuation response of the SMA. The major cycle actuation responses of the SMA are characterized by transformation temperatures (

Actuation response and the corresponding loading path in a typical phase diagram: (a) representative strain-temperature response and calculation of transformation temperatures using the tangent lines approach and (b) loading path shown on a typical stress-temperature phase diagram for SMAs.
2.2. Minor cycle actuation response in SMA
Minor cycle actuation responses are created due to incomplete heating or cooling, resulting in partial phase transformation in either direction. Referring to the loading path in Figure 1(b), partial phase transformation is when the end states (A), (B) or both lie between the start and finish transformation temperatures in the phase diagram. Figure 2 schematically shows minor cycle responses with respect to a major cycle response. In Figure 2(a), the upper minor cycle response occurs when only the end of heating (“

Different types of minor cycle responses on corresponding major cycle responses: (a) upper and lower minor cycles and (b) inner minor cycle.
3. Experimental investigation
To train and validate the machine learning model, a complete material characterization test series, along with additional minor cycle cases, was carried out on the selected SMA. Dog-bone shaped specimens were cut from a Ni50.3Ti29.7Hf20 sheet. Thermal cycles were applied to the material to induce complete phase transformation under various constant stress levels. The experiments were conducted using an MTS Insight testing machine equipped with an MTS load cell with a 30 kN load capacity. Strain was measured using a high-temperature Epsilon extensometer of 1-inch gage length attached to the specimen. Temperature was measured using a K-type thermocouple attached to the specimen. The specimen and the grips were enclosed inside an insulated Thermcraft thermal chamber. The specimen was heated by raising the temperature of the air inside the chamber, and cooled by passing liquid nitrogen through the chamber. The major thermal cycling responses were obtained at six stress levels (43, 93, 144, 192, 294, and 393 MPa) with heating and cooling between 80°C and 200°C.
Following the experimental procedure, the actuation responses of the selected SMA were obtained; they represent typical behavior for all SMAs. Figure 3(a) shows the experimental responses of major cycles at different stress levels. Noise in the experimental responses was removed using a moving average filter. Figure 3(b) and (c) show the upper and lower minor cycle responses at a stress level of 395 MPa, where each minor cycle response has five repetitions. Figure 3(b) shows the upper minor cycle responses obtained with a temperature variation of 75°C from the reference phase of martensite at 100°C, and Figure 3(c) shows the lower minor cycle responses obtained with a temperature variation of 46°C from the reference phase of austenite at 200°C.

Experimental actuation responses in Ni50.3Ti29.7Hf20 SMA: (a) major actuation cycles at different stresses, (b) upper minor cycles, and (c) lower minor cycles at 395 MPa on the corresponding major cycle. Cycles 1–5 are the five repetitions of the minor cycle, in order.
An increase in strain is observed in the subsequent upper minor cycles compared to the first minor cycle. In the lower minor cycles, however, the corresponding shift in strain is minimal. This difference between upper and lower minor cycles could be due to the difference in the microstructures at these two locations. In the upper minor cycle, harder austenite phases are nucleated in a softer martensite matrix, which can induce larger strains in the matrix through stress concentrations. In the lower minor cycles, softer martensite phases are nucleated in a harder austenite matrix, resulting in smaller strains. Because of this difference, the upper minor cycles are more sensitive to localized high straining than the lower minor cycles. As a result, the responses in subsequent upper minor cycles are significantly shifted compared to the initial response.
4. Machine learning framework with physical constraints
Machine learning methods can be classified as supervised or unsupervised learning. In supervised learning, the task is to construct a function that maps inputs to targets using the input-target pairs provided in a training set. In unsupervised learning, the targets are not identified a priori, and the model must discover the naturally occurring patterns in the training data on its own. A common example is clustering, where the algorithm groups the training data into categories based on some measure of similarity. For the modeling of SMA actuation, the goal is a model that fits the available experimental behavior and can also extend predictions to new loading conditions. In this scenario, supervised learning methods are used with a set of training data. Neural networks allow the complexity of a model to be controlled by varying the number of layers and nodes, which makes it possible to tune a model that works best for SMA responses.
4.1. Description of the machine learning model for thermal actuation
The thermal actuation response of SMA at a fixed stress
Assuming a steady-state loading condition, the effects of the loading rate can be neglected and the change in strain
An ML model is formulated to fit the slope (

Machine learning framework for predicting the SMA response showing input, target, training, and prediction.
Input and output in the ML model for capturing the actuation response.
At any given point along the
4.2. Physical constraints from transformation surfaces
To model the partial transformation responses, additional physics based constraints are needed to include the physics of transformation surfaces. In the SMA actuation, the partial transformation branches initiate at the reversal points of thermal loading in the major cycles where complete phase transformation is not achieved. At these reversal points, the SMA is unloading away from the transformation surface, and during that step only thermoelasticity is operating. Additional constraints are needed in the
The hysteresis response in the thermal cycling of the SMA occur due to the different strain rates

Figure describing extra constraints to improve minor cycle response predictions.
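The idea of constraining the model at reversal points can be pictured with a minimal sketch, which is an illustration under stated assumptions and not the authors' implementation. For each state on a major-cycle branch, it adds one extra training sample with the thermal loading direction flipped and the target slope set to a thermoelastic value. The function name `reversal_constraints`, the ±1 encoding of the loading direction, the input ordering, and the constant `alpha` (standing in for a thermoelastic strain-temperature slope at fixed stress) are all hypothetical.

```python
import numpy as np

def reversal_constraints(stress, temperature, strain, direction, alpha):
    """Build extra (input, target) pairs for thermal-loading reversal points.

    direction: +1 for heating, -1 for cooling (assumed encoding) of the
               major-cycle branch the states were taken from.
    alpha:     assumed constant thermoelastic slope d(strain)/dT.
    """
    temperature = np.asarray(temperature, dtype=float)
    strain = np.asarray(strain, dtype=float)
    n = temperature.size
    inputs = np.column_stack([
        np.full(n, float(stress)),       # constant applied stress
        strain,                          # strain reached on the major cycle
        temperature,                     # temperature at the reversal
        np.full(n, -float(direction)),   # loading direction after reversal
    ])
    # Immediately after reversal only thermoelasticity is assumed active.
    targets = np.full(n, float(alpha))
    return inputs, targets

# Hypothetical states on a cooling branch (direction = -1) at 300 MPa.
X_c, y_c = reversal_constraints(300.0, [150.0, 140.0, 130.0],
                                [0.5, 1.0, 2.0], direction=-1, alpha=1e-3)
```

Samples built this way would simply be appended to the major-cycle training set, so the network sees the same state with both loading directions but different target slopes.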
4.3. Implementation of the ML model using a neural network
The implementation of the presented ML framework involves: (1) processing experimental data and normalization of the data, (2) training the NN model

Neural network architecture used for modeling actuation responses.
4.3.1. Noise removal and normalization of data
Noise in the experimental response may reduce the accuracy of the neural network model. To remove it, the actuation responses are processed using the “smoothdata” function in MATLAB (2019) with a Gaussian-weighted moving average filter. The filtering is done separately for the heating and cooling portions of the response to obtain accurate average values at the end points of cooling. In addition, the data are sampled at temperature steps of 1°C to remove additional irregularities. The training data are then generated from the refined experimental response.
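The preprocessing described above can be sketched in Python with NumPy as a rough analogue of the MATLAB step; it is not a port of `smoothdata`. The function names, the window length of 11 samples, and the choice of a Gaussian standard deviation equal to one fifth of the window are assumptions made for illustration. The second function is meant to be called once per branch (heating or cooling), so that the two branches are filtered separately.

```python
import numpy as np

def gaussian_moving_average(y, window=11):
    """Gaussian-weighted moving average with edge padding (assumed settings)."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    offsets = np.arange(-half, half + 1)
    sigma = window / 5.0                      # assumed kernel width
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    padded = np.pad(y, half, mode="edge")     # repeat end values at the edges
    return np.convolve(padded, weights, mode="valid") / weights.sum()

def smooth_and_resample(temperature, strain, window=11, step=1.0):
    """Smooth one monotonic branch and resample it on a 1 degC grid."""
    temperature = np.asarray(temperature, dtype=float)
    strain = np.asarray(strain, dtype=float)
    order = np.argsort(temperature)           # np.interp needs increasing x
    t_sorted = temperature[order]
    s_smooth = gaussian_moving_average(strain[order], window)
    t_grid = np.arange(np.ceil(t_sorted[0]),
                       np.floor(t_sorted[-1]) + step / 2, step)
    return t_grid, np.interp(t_grid, t_sorted, s_smooth)

# Synthetic cooling branch from 200 degC to 80 degC (placeholder data only).
t_raw = np.linspace(200.0, 80.0, 601)
t_grid, s_grid = smooth_and_resample(t_raw, 0.01 * t_raw)
```

Filtering each branch on its own avoids averaging across the turning point, where the heating and cooling strains differ.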
Normalization of the input and target data is used in machine learning models to make the data on a similar scale. In the current approach, in addition to the conventional linear normalization, a nonlinear transformation with an exponential function is carried out. First, the strain rate
where,
The normalization parameters for the ML model.
The neural network model
In the case of
The algorithm for generating training data is summarized in Algorithm 1, where the input and target data are generated. Within each major cycle, for the stress, temperature, strain, and temperature change, the target derivative is estimated and stored. In addition, the data corresponding to the constraints in Section 4.2 are also stored.
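As a hedged illustration of this kind of data generation (Algorithm 1 itself is not reproduced here), the sketch below builds input and target arrays for one monotonic branch of a major cycle. The input ordering (stress, strain, temperature, loading direction), the ±1 direction encoding, and the forward-difference estimate of the strain-temperature slope are assumptions; the function name `branch_training_data` is hypothetical.

```python
import numpy as np

def branch_training_data(stress, temperature, strain):
    """Inputs [stress, strain, temperature, direction] and targets
    d(strain)/dT for one branch sampled in loading order."""
    temperature = np.asarray(temperature, dtype=float)
    strain = np.asarray(strain, dtype=float)
    d_temp = np.diff(temperature)
    slope = np.diff(strain) / d_temp      # forward-difference target
    direction = np.sign(d_temp)           # +1 heating, -1 cooling (assumed)
    inputs = np.column_stack([
        np.full(slope.size, float(stress)),
        strain[:-1],
        temperature[:-1],
        direction,
    ])
    return inputs, slope

# Placeholder cooling branch at a single stress level (values illustrative).
X, y = branch_training_data(300.0, [200.0, 199.0, 198.0], [0.0, 0.1, 0.3])
```

Arrays from every branch and stress level can then be stacked with `np.vstack` and `np.concatenate` to form the full training set, with any constraint samples appended in the same format.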
4.3.2. Training and prediction
The
For a given thermal loading path, the responses are predicted using the trained

The strain at 195°C versus applied stress for the Ni50.3Ti29.7Hf20 SMA. A linear fit captures the variation of the strain versus the applied stress at 195°C.
A comparison of the responses from the neural network model at different stress levels is shown in Figure 8. The predictions follow the experiments very closely for all six stress levels used for training. Although the presented neural network model

Experimental major cycle responses from the Ni50.3Ti29.7Hf20 SMA reproduced using
5. Results and discussion
Partial phase transformation responses in the Ni50.3Ti29.7Hf20 SMA are simulated using
5.1. Prediction of minor cycles without applying constraints
First the

Predictions of partial cycles in the Ni50.3Ti29.7Hf20 SMA from
The particular predictions of partial transformation in Figure 9 can be improved by also including the experimental minor cycle responses from Section 3 in the training. The challenge, however, is that the partial transformation response differs for each combination of heating and cooling endpoints, as described in Section 2. Although adding the particular experiments in Figure 9 can improve the prediction along the same path, it is not sufficient to map all combinations of partial transformation loops. To map the partial transformation region completely, one would have to run many experiments covering all combinations of partial transformations, which could be expensive. Instead, this work explores the improvement in predicting partial phase transformation obtained with simple constraints.
5.2. Prediction of partial cycles applying constraints
The predictions of partial phase transformation responses with

Predictions of minor cycle responses using the
5.3. Ensemble response and uncertainty quantification
Each training of the neural network model

Prediction of minor cycle responses in Ni50.3Ti29.7Hf20 SMA at 395 MPa: (a) upper minor cycle and (b) lower minor cycle. The Confidence Interval (CI) showing the uncertainty due to the variability in the training of
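An ensemble band of this kind can be computed, in one possible form, from the predictions of several independently trained networks on the same thermal path. The sketch below assumes a normal approximation with a 95% multiplier (`z = 1.96`) and a sample standard deviation across ensemble members; the source does not state how the interval was constructed, so both choices, and the function name `ensemble_band`, are assumptions.

```python
import numpy as np

def ensemble_band(predictions, z=1.96):
    """Mean and approximate confidence band across ensemble members.

    predictions: array of shape (n_models, n_points), the strain predicted
                 by each trained network along the same temperature path.
    """
    predictions = np.asarray(predictions, dtype=float)
    mean = predictions.mean(axis=0)
    std = predictions.std(axis=0, ddof=1)   # spread between trainings
    return mean, mean - z * std, mean + z * std

# Two hypothetical ensemble members evaluated at two points.
mean, lower, upper = ensemble_band([[1.0, 2.0], [3.0, 4.0]])
```

A wider band at a given temperature indicates that the prediction there depends more strongly on the random initialization of the training.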
5.4. Simulating complex thermal paths
A specific training of

Additional actuation responses in Ni50.3Ti29.7Hf20 SMA calculated using the
Inner partial cyclic responses with partial heating and partial cooling are simulated for 300 MPa. The cycles are repeated seven times to simulate the evolution over multiple cycles. Cycles initiated during cooling (Figure 13) and during heating (Figure 14) in the major cycle are shown. The evolution of the inner cycles is simulated for three different cycle sizes by varying the temperature range. The inner cycles evolve significantly during the initial repetitions; the difference then decreases over succeeding cycles until a steady cyclic response is reached.
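A repeated inner-cycle simulation of this kind can be sketched as two steps: build the temperature history, then integrate a strain-temperature slope along it. The sketch below uses explicit Euler integration with 1°C steps, which is an assumption about the prediction scheme; `slope_fn` is a hypothetical stand-in for the trained network, and the constant slope in the example is a placeholder, not a material property. The 135°C to 170°C range and the 200°C starting temperature follow the values quoted in the text.

```python
import numpy as np

def inner_cycle_path(t_start, t_low, t_high, n_cycles, step=1.0):
    """Cool from t_start to t_low, then heat to t_high and cool back
    to t_low, repeated n_cycles times."""
    def ramp(a, b):
        n = int(round(abs(b - a) / step))
        return np.linspace(a, b, n + 1)
    path = [ramp(t_start, t_low)]
    for _ in range(n_cycles):
        path.append(ramp(t_low, t_high)[1:])   # partial heating
        path.append(ramp(t_high, t_low)[1:])   # partial cooling
    return np.concatenate(path)

def integrate_strain(slope_fn, stress, strain0, temperatures):
    """Explicit Euler integration of d(strain)/dT along a temperature path."""
    strain = np.empty(len(temperatures))
    strain[0] = strain0
    for i in range(len(temperatures) - 1):
        d_temp = temperatures[i + 1] - temperatures[i]
        direction = np.sign(d_temp)            # +1 heating, -1 cooling
        rate = slope_fn(stress, strain[i], temperatures[i], direction)
        strain[i + 1] = strain[i] + rate * d_temp
    return strain

# Seven inner cycles between 135 and 170 degC after cooling from 200 degC.
path = inner_cycle_path(200.0, 135.0, 170.0, n_cycles=7)
# Placeholder slope function: constant thermoelastic-like slope.
strain = integrate_strain(lambda s, e, T, d: 1e-3, 300.0, 0.0, path)
```

Because the current strain is fed back as an input at every step, any cycle-to-cycle drift of the inner loops arises from the learned slope function itself and not from an explicit history variable.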

Validation for inner minor cycles with initiation during cooling in Ni50.3Ti29.7Hf20 SMA at 300 MPa. The corresponding temperature history is shown on top of each response. Thermal hysteresis is studied between (a) 135°C–170°C, (b) 145°C–170°C, and (c) 155°C–170°C. The presented ML model can predict complex thermal responses involving lower inner minor cycling.

Validation for inner minor cycles with initiation during heating in Ni50.3Ti29.7Hf20 SMA at 300 MPa. The corresponding temperature history is shown on top of each response. Thermal hysteresis is studied between (a) 170°C–140°C, (b) 165°C–135°C, and (c) 160°C–140°C. The presented ML model can predict complex thermal responses involving upper inner minor cycling.
The inner cycles evolve differently depending on whether they are initiated during cooling (Figure 13) or during heating (Figure 14). In Figure 13, where the initiation is during cooling, the strain shifts higher in successive cycles. This upward shift is consistent with the experimental observations on the evolution of inner cycles initiated during cooling reported in Amengual et al. (1996). In Figure 14, where the initiation is during heating, the strain shifts lower in successive cycles. The current model does not include the corresponding evolution of the microstructure and transformation surface during partial phase transformation, and therefore cannot explain this change of direction. In the minor cycle experiments (Section 3), where only partial heating or only partial cooling is considered, the strain is only seen to increase with successive cycles. Therefore, the higher magnitude and different direction of shift in these predictions could be an artifact in the
The current analysis showed that the
6. Summary
In this work, a data-based and physics-constrained neural network framework is proposed to predict the experimentally observed nonlinear actuation responses of Shape Memory Alloys (SMAs). The framework takes four state variables as input to describe the rate of thermal actuation, which are current stress, current strain, current temperature and thermal loading direction. Further, we proposed physics-based constraints in the ML framework to capture complex minor cycle responses in the SMA actuation resulting from partial phase transformation. With the introduction of constraints, the framework captured the upper and lower minor cycle responses close to the experiments, and their effectiveness is discussed.
Actuation responses in a high-temperature NiTiHf SMA are considered for modeling and for testing predictions. The neural network model is targeted to fit the nonlinear variation of strain rate in the SMA response during thermally induced transformation. The strain rates at the reversal points of the minor loop are specified as physical constraints. These constraints capture the activation of thermoelasticity when the loading direction changes away from the transformation surfaces. The constrained model is then trained using the data from major actuation cycles at different stress levels. The following conclusions are drawn from the model’s capabilities.
As expected, the
The model with physical constraints gains the capability to predict beyond the training responses. The resulting model can accurately predict both major and minor cycles even though the training uses only the major cycle experimental data.
An uncertainty quantification of the predictions is performed using ensemble training of the model. The average response of the ensemble matched the experiments, and the confidence interval for the predictions is estimated from the ensemble variations.
The resulting model provides an efficient computational platform to simulate phase transformation behavior under complex thermal loading paths.
Two perspectives for the extension of the presented framework are: (i) accounting for phase transformation behavior under coupled thermomechanical loading and (ii) accounting for the effects of plastic deformation such as Transformation Induced Plasticity (TRIP) under cyclic loading. For the first perspective, the current approach must be modified to account for stress-driven phase transformation, with additional physical constraints of thermoelasticity at the initiation of partial transformation during mechanical loading. For the second perspective, additional internal state variables and physical constraints accounting for the microstructural changes during phase transformation will be required.
In the realm of physics-based modeling, identifying the material parameters through calibration with experimental data is crucial, whereas in the ML approaches, the governing relationships are learned directly from the data. However, an inherent drawback of the ML approaches is their inability to predict beyond the training data. This can be resolved by introducing physics-based constraints which can improve the predictive capability in the ML approaches.
Appendix 1. Weights and biases in the F ML neural network model
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Authors acknowledge the financial support provided by the NASA Aeronautics’ University Leadership Initiative (ULI) project entitled “Adaptive Aerostructures for Revolutionary Civil Supersonic Transportation” (Grant No. NNX17AJ96A).
Data availability statement
Data sharing not applicable to this article as no datasets were generated or analyzed during the current study.
