Abstract
In image-guided radiotherapy, tumor motion must be tracked in real time in order to deliver the prescribed uniform dose to moving tumors in the thoracic region while minimizing the additional dose received by the surrounding healthy tissues. Several correlation models have been proposed in recent years to provide tumor position information as a function of time in radiotherapy with external surrogates. However, developing an accurate correlation model is still a challenge. In this study, we propose an adaptive neuro-fuzzy based correlation model that employs several data clustering algorithms for constructing the antecedent parameters, in order to avoid over-fitting and to achieve appropriate tumor motion tracking performance compared with the conventional models. To begin, a comparative assessment is done between seven neuro-fuzzy correlation models, each constructed using a unique data clustering algorithm. Then, the constructed models are combined within an adaptive sevenfold synthetic model, since our tumor motion database has high degrees of variability and each model has its own intrinsic properties in motion tracking. In the proposed sevenfold synthetic model, the best model is selected adaptively at pre-treatment. The model is also updated during treatment for each patient using an automatic model selectivity subroutine. We tested the efficacy of the proposed synthetic model on twenty patients (divided equally into control and worst groups) treated with the CyberKnife Synchrony system. Compared to the CyberKnife model, the proposed synthetic model resulted in 61.2% and 49.3% reductions in tumor tracking error in the worst and control groups, respectively. These results suggest that the proposed model selection program in our synthetic neuro-fuzzy model can significantly reduce tumor tracking errors. Numerical assessments confirmed that the proposed synthetic model is able to track tumor motion in real time with high accuracy during treatment.
Introduction
The goal of successful radiotherapy is to increase the clinical benefits by killing as many cancerous cells as possible while minimizing the adverse effects on the normal tissues surrounding the tumor. To this end, adequate targeting accuracy is needed to deliver the prescribed 3-dimensional (3D) uniform dose to the tumor volume while sparing normal tissues from additional dose. 1 However, tumor localization during treatment planning is a significant challenge when a tumor located in the thoracic region of the patient's body moves, mainly due to breathing. 2 –4 Semi-irregular respiratory motion causes rotation, translation, roto-translation, and even deformation of tumors. Therefore, under- or overdosage might occur in the tumor volume, and the surrounding normal tissues are not protected against additional dose during treatment. 5,6 As a remedy, it might be necessary to draw a larger margin around the tumor that covers the tumor volume and its range of motion, known as the internal target volume, but this exposes normal tissues to high dose with serious side effects.
Several efforts have been made to compensate for the effects of intrafractional motion error on prescribed dose distribution during the irradiation session, including breath holding, respiratory gating, and real-time tumor tracking. 7 –12 In breath holding technique, the goal is immobilization of the tumor by asking the patient to prepare a reproducible deep inspiration breath-hold level. Breath holding technique requires the patient’s cooperation and therefore its implementation is limited for patients with breathing impairment. 13,14 As an alternative approach, respiratory-gated radiotherapy was proposed to save normal surrounding tissue against high dose by irradiating the therapeutic beam only in a predefined phase of the breathing cycle, mainly at the end of expiration. 15 –17 In the third proposed strategy, known as real-time tumor tracking radiotherapy, the tumor volume is continuously irradiated by therapeutic beam while the beam position is changing along with tumor motion in a synchronized fashion. In this method, a proper interplay between continuous reposition of the beam and target motion may remarkably minimize the amount of additional dose received by the normal nearby tissue. 18
The 2 latter strategies have been proven effective for targeting dynamic tumors and further increasing the treatment quality by compensating intrafractional motion error. 19 Both of these techniques strongly depend on obtaining accurate tumor position information in real time as a main component over a course of treatment. 20 –22
Several in-room motion monitoring strategies have been proposed for acquiring tumor position, ranging from continuous X-ray imaging of the tumor site to the use of external surrogates. 21,22,23,24 The choice of strategy depends highly on the feasibility, accuracy, and safety of in-room motion monitoring. For some tumors with sufficient contrast, fluoroscopy as a continuous X-ray imaging system can precisely trace tumor position in real time. When the tumor is not observable, a fiducial marker is implanted inside or near the tumor volume as an internal indicator that represents the tumor position in fluoroscopy images. However, the additional imaging dose received by patients in this method may increase the side effects and must be limited according to the as low as reasonably achievable principle. 12,25 To overcome this problem, a number of motion prediction methods were proposed to estimate the tumor position based on the correlation between the motion of external surrogates (located on the patient body surface) and the internal motion of the tumor. 20,26 –29 This way, tumor motion is tracked reasonably well with minimum imaging dose. However, compared with direct continuous X-ray imaging, an uncertainty error inherent to correlation models is introduced during motion prediction. Various deterministic and nondeterministic correlation models have been developed to minimize this uncertainty error during tumor tracking in radiotherapy with external surrogates, which has made the approach very promising for clinical applications. Among these models, a number of available external–internal correlation models were comprehensively described in our recent comparative study. 1 This solution has been implemented in Cyberknife for real-time tumor motion monitoring in the Synchrony respiratory tracking module.
The system provides tumor tracking by means of a model that correlates external infrared markers placed on patient thorax skin and implanted clips inside or near the tumor volume detected by stereoscopic X-ray imaging system. 30 ,31 In this strategy, the correlation model must be first configured using synchronized external–internal motion data known as training data set at pretreatment step. After configuration, the correlation model is ready to trace tumor motion in real time using only external markers motion data during treatment. Furthermore, the model performance is checked as needed during treatment using new arrival external–internal motion data in synchronized form, obtained by corresponding monitoring devices that are optical tracking system (external motion data) and intermittent X-ray imaging (internal motion data).
Apart from the above-mentioned strategies in tumor motion monitoring, it is worth mentioning some ideas presented in the literature for tracking tumor motion without ionizing radiation. 32,33 A miniature, implantable radiofrequency coil has been developed 34 that can be tracked magnetically in 3 dimensions from outside the patient body. A new technology based on nonionizing electromagnetic fields, using small wireless transponders implanted in human tissue, 35 has been developed at Calypso Medical Technologies, Inc (Seattle, Washington). The Calypso System uses "GPS for the Body" technology to safely guide radiation delivery during prostate cancer treatment. One or more wireless transponders (Beacon) are permanently placed in a patient’s prostate or prostatic bed. Each transponder emits a unique radio frequency signal to the Calypso System, which then determines the exact location and motion information about the tumor target. Moreover, real-time 3D ultrasound is another noninvasive alternative that avoids X-ray imaging for tracking the tumor motion. 36,37 Ultrasound imaging is a method of obtaining images from inside the body using high-frequency sound waves. These sound waves are reflected and recorded as real-time images, so they can show movement of internal tissues and organs without involving X-rays. Ultrasound images have advantages in visualizing certain soft tissues such as the liver, unlike X-ray imagers, which can only visualize bony structures or implanted metal markers. It should be noted that these strategies are not as widely available as ionizing methods, owing to implementation constraints (Calypso) and to limitations in extracting precise and reliable motion information (ultrasound).
In our recent work, 38 three fuzzy logic-based correlation models using different clustering algorithms were assessed comparatively. Some drawbacks of that previous study 38 remain open to investigation, and the current work addresses them to obtain better correlation model performance using an optimized version of the adaptive neuro-fuzzy inference system (ANFIS). Since most conventional linear systems do not usually achieve the desired performance, for reasons such as data imperfection, high degrees of variability in the database, linguistic knowledge about a system, or time-varying characteristics that make conditions difficult, neuro-fuzzy–based models may be valuable tools for achieving proper performance. 39 Neuro-fuzzy–based systems combine the human-like reasoning style of fuzzy systems with the learning process of adaptive neural networks. 40,41
In this study, a synthetic ANFIS is proposed for tumor motion tracking to address challenging issues that remained from our former study by (1) modifying the ANFIS structure in its antecedent-consequent parts, (2) testing additional available data clustering algorithms, (3) improving uncertainty error reduction in tumor motion tracking, and (4) evaluating model run time quantitatively as an important clinical issue. Our proposed synthetic ANFIS is a combination of several modified ANFIS models that differ in their data clustering algorithms. It includes an adaptive model selectivity subroutine that is responsible for choosing the best ANFIS model during treatment. In contrast, a simple nonsynthetic (or conventional) ANFIS is based on one specific clustering method, without any flexible option to adapt itself to each treatment situation.
In the proposed modified model, in order to avoid the overparameterization issue that generally arises in the identification of nonlinear dynamic systems, a fusion of least squares and constrained estimation methods is proposed to obtain the accurate consequent parameters required for configuring the ANFIS structure. 41,42
In ANFIS, one of the key components of the fuzzy system structure is data clustering, which undertakes the generation of membership functions. These functions play an important role in the accuracy of the model output. 41 Conceptually, data clustering partitions the data set into several groups with similar properties in order to create homogeneous clusters or groups for better data analysis. 43,44
Seven ANFIS correlation models, configured by 7 data clustering algorithms that behave intrinsically differently when grouping a great variety of databases, were tested and compared separately. Since one of our aims was to configure an optimized correlation model (ie, synthetic ANFIS), a model selectivity option was developed to choose automatically the best ANFIS built by a specific clustering algorithm. Therefore, our 7-fold synthetic ANFIS model includes all 7 available models plus a model selectivity subroutine to choose the best model adaptively on a case by case basis. Moreover, within a treatment session of a given patient, our synthetic ANFIS is highly flexible in adapting itself to the patient's respiratory patterns at each updating process, when the model is rebuilt.
The evaluated clustering algorithms that are highly applicable and available are as follows:
(1) Kmeans, (2) Kmedoid, (3) subtractive, (4) fuzzy C-means (FCM), (5) Gustafson-Kessel (GK), (6) Gath-Geva (GG), and (7) expectation maximization (EM). It should be noted that the number of data clusters is variable during model configuration. Xie-Beni index (XB) was used as the validation criterion for finding the optimal cluster number. 45
In clinical application of each correlation model, tumor motion tracking must be done promptly at real-time mode even less than system latency. It is noteworthy that the execution time of the proposed 7-fold synthetic ANFIS model may increase because of computational complexities associated with implementing 7 ANFIS with different clustering algorithms at the same time. To investigate this issue, a quantitative assessment was performed on 2 truncated versions of the synthetic ANFIS model to find a compromise between model performance accuracy and its computational time using an in-room computer system. Two truncated synthetic ANFISs known as double (2-fold) and triple (3-fold) ANFIS models that use 2 and 3 predominant data clustering algorithms were used for comparison.
The results suggest that the proposed synthetic ANFIS is able to predict tumor motion remarkably better than conventional ANFIS models and addresses the remaining concerns. The properties of the 7 modified ANFIS models developed separately were compared with the 3 synthetic ANFIS models (7-fold, triple, and double) and also with the Cyberknife modeler over 20 real patients treated with the Cyberknife Synchrony system. Since each patient has a unique breathing cycle motion with possible variability during treatment, the best correlation model can be selected adaptively by the 3 proposed synthetic ANFIS models in the pretreatment step and also during model update, in order to trace tumor motion in real time on a case by case basis. Numerical analysis of the computational time of the 3 synthetic ANFIS models demonstrated that the 7-fold synthetic ANFIS is able to extract tumor position information in real time with the highest accuracy. The double and triple synthetic ANFIS models showed a trade-off between model accuracy and run time, which is challenging for clinical application.
Methods
CyberKnife System and Patient Database
The Cyberknife system (Accuray Inc, Sunnyvale, California) is composed of a radiation delivery device, called a linear accelerator (or Linac), which is mounted on a robotic arm. 47 The flexibility of the robotic arm enables the Cyberknife system to deliver the prescribed dose onto a tumor volume anywhere in the patient body, including brain, head and neck, spine, lung, prostate, liver, pancreas, breast, and other soft tissues. The Cyberknife system also utilizes sophisticated software and advanced X-ray imaging to track tumor and patient movement and adjust the therapeutic beam to ensure the planned dose is delivered with a high degree of accuracy. In radiotherapy with the Cyberknife system, the tumor position is identified by means of a correlation model generated from the synchronized internal–external data set extracted, respectively, from internal implanted fiducial markers and external infrared markers. Internal motion signals representing tumor position are gathered by detecting fiducial markers, implanted in close proximity to or inside the tumor in a minimally invasive procedure, using an orthogonal X-ray imaging system. The external motion signals are gathered by locating infrared markers positioned on the patient’s chest and/or abdomen region and detected by the Optical Tracking System. 47
Within the available database (86 patients treated with 339 treatment fractions), 2 groups, control and worst patients, were randomly extracted without any consideration of tumor type or site. The control group represents the patients whose tumor motion was tracked easily during treatment. In contrast, the worst cases have the largest average tracking error of moving tumors. The internal–external database was stored in the log file. In our data set, treatment process time ranged from 27 to 118.8 minutes for the 20 patients utilized in this study. The range of external marker motion, computed as peak-to-peak in 3D, varied from 1.4 to 6 mm in the control group and 8.6 to 96.9 mm in the worst group. Moreover, tumor motion range was from 2.55 to 18.31 mm in the control group and 10.91 to 62.49 mm in the worst group. The number of stereoscopic X-ray images used as verification points varied from 68 to 145 points, taken in total at pretreatment and during treatment for model update. The average tumor motion in x, y, and z was 28.5, 14.7, and 14.1 mm, respectively, over all patients.
The correlation model was implemented in MATLAB (The MathWorks Inc, Natick, Massachusetts) using the fuzzy logic, fuzzy clustering, and ANFIS toolboxes.
Development of ANFIS
The theory of fuzzy logic provides a mathematical strength to capture the nonstatistical uncertainties associated with human cognitive processes, such as thinking and reasoning. 48 –50
Fuzzy inference (reasoning) is the actual process of mapping from a given input to an output using fuzzy logic. 51 –53 The process includes membership functions generation, fuzzy logic AND/OR operators implementation, and if-then rules extraction. There are 2 common types of fuzzy inference systems (FISs) that can be implemented in the Fuzzy Logic Toolbox: Mamdani-type and Sugeno-type. These 2 types of inference systems vary somewhat in the way outputs are determined. 54
A fuzzy neural network or neuro-fuzzy system is a learning machine that tunes the parameters of a fuzzy system (ie, fuzzy sets and fuzzy rules) by exploiting approximation techniques from neural networks. 39 –41,55 A neuro-fuzzy approach as a combination of neural networks and fuzzy logic has been introduced to overcome the individual weaknesses and to offer more appealing features.
Neural networks can only be applied if the problem is expressed by an adequate number of observed examples, which are used to train the black box. In contrast, a fuzzy system demands linguistic rules as prior knowledge instead of learning examples. When the acquired knowledge is incomplete, faulty, or inconsistent, the fuzzy system must be adjusted in a heuristic way. A suitable approach to fuzzy system adaptation is an automatic adaptation procedure like that of neural networks.
In the present study, a class of adaptive networks that acts as a fundamental framework for ANFIS was employed. A FIS developed using the framework of adaptive neural networks is called an ANFIS. 39,40
ANFIS architecture
ANFIS is a kind of neural network based on the Takagi–Sugeno FIS. Consider 2 FIS rules with x and y as inputs and z as output:
The “if” part is the antecedent, the “then” part is the consequent, x and y are linguistic variables and as it is depicted in Figure 1, A1, A2, B1, and B2 are corresponding fuzzy sets, and p1, q1, r1 and p2, q2, r2 are linear parameters.
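The two rules referred to here can be written out in the standard first-order Takagi–Sugeno form using the symbols defined in the text. The per-rule outputs z_1 and z_2 are assumed notation introduced only for this sketch; the text itself names only the overall output z.

```latex
\begin{aligned}
\text{Rule 1:}\quad & \text{if } x \text{ is } A_1 \text{ and } y \text{ is } B_1,\ \text{then } z_1 = p_1 x + q_1 y + r_1,\\
\text{Rule 2:}\quad & \text{if } x \text{ is } A_2 \text{ and } y \text{ is } B_2,\ \text{then } z_2 = p_2 x + q_2 y + r_2.
\end{aligned}
```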

Figure 1. Adaptive neuro-fuzzy inference system (ANFIS) architecture. X and Y are linguistic variables; A1, A2, B1, and B2 are the corresponding fuzzy sets; and p1, q1, r1 and p2, q2, r2 are linear parameters.
The circular nodes represent fixed nodes and the square nodes represent adaptive nodes. The ANFIS architecture consists of 5 layers and the functionality of each layer is as follows:
Layer 1:
It is the input layer. Every node i in this layer is an adaptive node with a node function:
where O1,i is the membership degree of a fuzzy set (A1, A2, B1, or B2). In this study, Gaussian membership functions are assumed to represent the fuzzy sets Ai,k(xk):
Where v and σ are antecedent or premise parameters responsible for membership functions generation. These parameters are adjustable by gradient-descent learning algorithms, such as back propagation.
Layer 2:
Every node in this layer is a fixed node. In this layer the AND operator is applied to obtain the firing strength, which represents the degree to which the antecedent part of a given rule is satisfied. The output of this layer is calculated as follows:
where αi represents the prior probability of cluster i:
Layer 3:
Layer 3 includes fixed nodes which calculate the normalized firing strengths of the rules as follows:
Layer 4:
This layer is known as defuzzification layer. Each neuron in this layer is connected to the respective normalisation neuron and also receives initial inputs, x1 and x2. A defuzzification neuron calculates the weighted consequent value of a given rule as:
The parameters of this layer (pi, qi, ri) are referred to as the consequent parameters.
Layer 5:
There is a single node here that computes the overall output of the ANFIS in this layer as follows:
It should be noted that in ANFIS, hybrid learning techniques are used to optimize both the premise parameters associated with the membership functions and the consequent parameters that determine the overall output of the system.
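The five layers described above can be sketched as a single forward pass. This is a minimal illustration rather than the implementation used in this study, which relied on MATLAB toolboxes. It assumes the common Gaussian form exp(-(x - v)^2 / (2σ^2)), a product AND operator, and two inputs, and it leaves out the cluster prior αi mentioned in Layer 2. The function and argument names are hypothetical.

```python
import math


def gaussian_mf(x, center, sigma):
    """Layer 1: Gaussian membership degree with centre v and width sigma (assumed form)."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x, y, premise, consequent):
    """Forward pass of a first-order Takagi-Sugeno ANFIS with two inputs.

    premise:    one ((v_x, sigma_x), (v_y, sigma_y)) pair per rule
    consequent: one (p, q, r) triple per rule
    Returns the overall output and the normalised firing strengths.
    """
    # Layers 1 and 2: membership degrees combined by product (AND) into firing strengths.
    w = [gaussian_mf(x, vx, sx) * gaussian_mf(y, vy, sy)
         for (vx, sx), (vy, sy) in premise]
    # Layer 3: normalise the firing strengths so that they sum to one.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted consequent value of each rule.
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: the overall output is the sum of the weighted rule outputs.
    return sum(rule_out), w_bar
```

For two rules centred at 0 and 2 with unit widths, an input of x = y = 1 fires both rules equally, so the output is the plain average of the two rule consequents.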
Hybrid learning techniques
The hybrid learning algorithm proposed by Jang, Sun, and Mizutani 41 uses a fusion of Steepest Descent and least squares estimation.
Consequent Parameter Estimation
The ANFIS models, especially the first-order one, for most identification systems tend toward overparameterization issues and this causes overfitting and deviation from the correct learning course.
In this study, the basic least squares criterion was mixed with constrained estimation method to resolve this issue. It may be fixed by incorporating prior knowledge into the correlation model.
Constrained estimation
Prior knowledge about the dynamic system process, such as its stability or its minimal or maximal static gain, can be translated into equality and inequality constraints on the consequent parameters of the correlation model. The fuzzy model is then obtained from the input–output data set by quadratic programming instead of least squares. To illustrate this, consider the Takagi-Sugeno (TS) model as follows:
which in quasi-linear form is stated as:
Where a(x) and b(x) are input-dependent parameters and convex linear combinations of the single consequent parameters ai and bi, that is:
By this characteristic, global convex constraints can be defined for the whole model.
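One standard way of writing the TS model, its quasi-linear form, and the convex combinations described above is sketched below. The normalized firing strength \bar{w}_i(x) and the number of rules c are assumed notation; the text itself defines only a(x), b(x), a_i, and b_i.

```latex
\begin{aligned}
y &= \sum_{i=1}^{c} \bar{w}_i(x)\,\bigl(a_i^{T} x + b_i\bigr),
  \qquad \sum_{i=1}^{c} \bar{w}_i(x) = 1,\quad \bar{w}_i(x) \ge 0,\\
y &= a^{T}(x)\,x + b(x),\\
a(x) &= \sum_{i=1}^{c} \bar{w}_i(x)\,a_i,
  \qquad b(x) = \sum_{i=1}^{c} \bar{w}_i(x)\,b_i .
\end{aligned}
```

Because the weights are non-negative and sum to one, a(x) and b(x) always lie in the convex hull of the individual a_i and b_i, so any convex constraint imposed on every rule's parameters also holds for the model as a whole.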
Construction and initialization of membership functions
Proper initialization of the membership functions is crucial for reaching appropriate performance when gradient-descent learning is applied. There are several approaches for this purpose, such as template-based membership functions, discrete search methods, and fuzzy clustering algorithms. In this study, the fuzzy clustering approach was used.
Model construction based on fuzzy clustering originates from data analysis and pattern recognition, where the idea of membership is explained by the degree of similarity of a given data point to some prototypical objects; this degree of similarity can be estimated using an appropriate distance measure. Based on the degree of similarity calculated by the different clustering methods, data sets are clustered such that the data points within a cluster are as similar as possible and the data points from different clusters are as dissimilar as possible.
The antecedent membership functions are obtained by projecting the clusters onto the individual variables. Cluster validity measures and merging techniques were applied to define the number of clusters in the ANFIS structure design.
Modified Synthetic ANFIS
As demonstrated, data clustering has an important role in membership function generation and hence in model performance accuracy. Since the motion database of each patient's respiration pattern is highly variable, and imperfect in some cases, different data clustering algorithms may result in different model performance accuracy due to the inherent properties of each clustering strategy. For instance, at the pretreatment step, while the amount of data is small, hard clustering methods such as subtractive and Kmedoid may perform better than fuzzy logic-based techniques such as EM and GG. Conversely, during the treatment process, as the amount of data increases, the fuzzy clustering methods generate more reliable membership functions for designing a robust neuro-FIS. Given these considerations, a synthetic ANFIS model that includes neuro-FISs with different data clustering algorithms might be an optimal correlation model for tracing the tumor adaptively over the treatment course of a given patient with irregular motions, and even for patients with various respiratory patterns. The aim is to exploit the strengths of the implemented clustering methods as much as possible for each patient on a case by case basis by means of the synthetic ANFIS model.
Figure 2 depicts the proposed correlation model illustrating model configuration at pretreatment and update steps and model performance during treatment.

Figure 2. Seven-fold synthetic adaptive neuro-fuzzy inference system (ANFIS) flowchart.
As seen, synthetic ANFIS includes 7 data clustering algorithms and the best model with most adapted clustering algorithm dedicated for a patient is chosen to track tumor motion as a function of time.
The model is updated over the course of treatment, relying on periodic imaging to refresh the correlation parameters as needed. In our patient database, model update is limited by the time interval between subsequent X-ray imaging acquisitions, ranging from 41.0 to 97.6 seconds. As the total treatment time increases for a patient, significant intrasession variability is to be expected.
It should be noted that the paired external–internal data points gathered at each update step are added to the training data set taken at the initial part of the treatment course in order to rebuild the correlation model. This process continues until the number of training data points plus the accumulated update data points reaches m = n*(number of training data points at the pretreatment step), where n is a constant equal to 4, found to be the optimal ratio by trial and error, and m is the number of training data points used for model update. During an update, while the model is being rebuilt, tumor motion tracking would otherwise be interrupted for a few seconds even though data continue to arrive at a high sampling rate. To address this issue, tumor position is predicted using the former model built at the previous step of the updating process. Once the model has been updated using the new data set, it replaces the former model.
As seen in Figure 2, for model construction the data set is divided into 2 subsets: a new training data set (75%) and a test data set (25%). The latter is utilized for checking model performance and for selecting the best model. The test data set serves both to guard ANFIS against overtraining and to select an appropriate model configured by a specific clustering algorithm. Although the training and test data sets are chosen randomly so as to cover a comprehensive range of respiratory motion patterns, the last synchronized data points must belong to the test data set, as they determine model performance for the most recent respiratory pattern.
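The split-and-select step described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the subroutine used in this study: samples are (external, internal) scalar pairs, the number of most recent points forced into the test set (`n_recent`) is a hypothetical parameter, and the two toy builders (a constant model and a straight-line fit) merely stand in for ANFIS models built with different clustering algorithms.

```python
import math
import random


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


def split_train_test(samples, test_fraction=0.25, n_recent=1, seed=0):
    """Random 75/25 split that always keeps the n_recent latest samples in the test set."""
    n = len(samples)
    n_test = max(n_recent, round(test_fraction * n))
    recent = list(range(n - n_recent, n))
    rng = random.Random(seed)
    extra = rng.sample(range(n - n_recent), n_test - n_recent)
    test_idx = set(recent + extra)
    train = [samples[i] for i in range(n) if i not in test_idx]
    test = [samples[i] for i in sorted(test_idx)]
    return train, test


def select_best_model(builders, samples):
    """Build every candidate on the training part and keep the one with the lowest test RMSE."""
    train, test = split_train_test(samples)
    observed = [internal for _, internal in test]
    scores = {}
    for name, build in builders.items():
        predict = build(train)
        scores[name] = rmse([predict(external) for external, _ in test], observed)
    return min(scores, key=scores.get), scores


def build_mean(train):
    """Toy stand-in model: always predicts the mean internal position."""
    mean_t = sum(t for _, t in train) / len(train)
    return lambda e: mean_t


def build_linear(train):
    """Toy stand-in model: least-squares straight line from external to internal position."""
    n = len(train)
    me = sum(e for e, _ in train) / n
    mt = sum(t for _, t in train) / n
    slope = (sum((e - me) * (t - mt) for e, t in train)
             / sum((e - me) ** 2 for e, _ in train))
    return lambda e: mt + slope * (e - me)
```

Calling `select_best_model({"mean": build_mean, "linear": build_linear}, samples)` returns the name of the winning candidate together with the test RMSE of each candidate, so the same routine could be rerun at every update step.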
In the synthetic ANFIS strategy, selecting and configuring the best candidate correlation model raises some concerns about computational complexity on in-room computing systems. In clinical application of these models, there should be a compromise between the accuracy of the model output and its computational complexity. In this work, a quantitative investigation was done on model execution time and performance accuracy during treatment sessions. Numerical assessments of the run time of the 7-fold synthetic ANFIS show that this model is able to track tumor motion in real time with the most accurate performance during treatment. However, in order to investigate whether a faster model could reach performance accuracy at the same level as the 7-fold synthetic ANFIS, double and triple synthetic models were proposed.
To do this, the applied clustering algorithms implemented in 7-fold synthetic ANFIS were reduced to 2 and 3 predominant clustering algorithms. The alternative truncated models were represented as double and triple synthetic ANFIS, respectively. When these models were designed, their high performance accuracy was kept while their computational complexities were remarkably reduced in comparison with 7-fold synthetic ANFIS.
In conclusion, the 3 proposed synthetic ANFIS models (7-fold, double, and triple synthetic models) used a combination of several clustering algorithms to establish optimized fuzzy rules and membership functions. These models were suggested because of the varied range of our semi-irregular database, with high degrees of uncertainty particularly in the worst patients. A nonsynthetic or conventional ANFIS model using the same database may not be highly accurate.
Data Clustering Algorithms
Data clustering algorithms in the construction of FIS are crucial as they determine the antecedent parameters of ANFIS model. Data clustering algorithms performance changes while the distribution and deviation of data differ. Hence, in this work, a combination of clustering approaches was implemented to construct a modified and flexible adaptive neuro-fuzzy system with high efficiency. The properties and the implementation of the model and clustering algorithms are detailed as follows.
Clustering validity
Cluster validity refers to assessing how well a given fuzzy partition fits the data. A clustering method attempts to find the best partition of the data for a fixed number of clusters. Even if the clustering method works appropriately, an incorrect number of clusters undermines the validity of the result. The proper cluster number is therefore determined by computing a validity measure that appraises the quality of the partitions obtained for different values of the cluster number c. The approach applied in this work for defining cluster validity is XB.
The XB index quantifies the ratio of the total variation within clusters to the separation between clusters.
The optimum number of clusters belongs to a partition with the minimum value of XB index.
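The Xie-Beni criterion can be sketched in a few lines. The version below follows the commonly used definition, in which the membership-weighted sum of squared distances to the cluster centers is divided by the number of points times the smallest squared distance between two centers. The fuzzifier m = 2 and the function names are assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def xie_beni(data, centers, u, m=2.0):
    """Xie-Beni index of a partition; u[i][k] is the membership of data[k] in cluster i."""
    n_clusters = len(centers)
    # Compactness: membership-weighted within-cluster variation.
    compactness = sum((u[i][k] ** m) * sq_dist(data[k], centers[i])
                      for i in range(n_clusters) for k in range(len(data)))
    # Separation: smallest squared distance between any two cluster centres.
    separation = min(sq_dist(centers[i], centers[j])
                     for i in range(n_clusters) for j in range(i + 1, n_clusters))
    return compactness / (len(data) * separation)
```

In a model-building loop, the index would be evaluated for each candidate cluster number and the partition with the smallest value retained.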
As mentioned, 7 data clustering algorithms based on hard and soft partitioning were utilized in this work to investigate the impact of each clustering algorithm on the model performance accuracy in tumor motion tracking. These algorithms are briefly reviewed in this section.
Kmeans Clustering
The Kmeans clustering, or hard C-means clustering, is an algorithm based on defining clusters in a data set such that the cost function of dissimilarity criterion is minimized. The Euclidean distance is mostly used as the dissimilarity measure. 56
Kmedoid Clustering
In some applications, the cluster centers need to be chosen from among the data points. For this purpose, Kmedoid follows an approach similar to Kmeans, except that the centers v1, v2, … , vc are restricted to the data points themselves. Kmedoid is more robust than Kmeans in the presence of noise and outliers; however, it is more costly in terms of processing time. 57
Subtractive Clustering
The subtractive algorithm is based on a density measure of the data points, and its goal is to detect areas that have high densities of points. The density criterion represents the concentration of data in a specific region. The cluster centers are assigned to the points with the highest number of neighbors according to the density measurement. 58
Fuzzy C-Means Clustering
FCM is a method of clustering in which each data point belongs to several clusters with the degree of belongingness designated by membership grades. 61 The values of membership matrix U elements are between 0 and 1. However, the summation of membership degrees for a data point is always equal to unity taking into account all clusters.
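The membership property described above can be checked with the standard FCM membership update for fixed cluster centers. This is a minimal sketch: the fuzzifier m = 2, the Euclidean distance, and the handling of a point that coincides with a center are assumptions, and the function name is hypothetical.

```python
import math


def fcm_memberships(data, centers, m=2.0):
    """Return u[i][k], the membership of data[k] in cluster i, for fixed centres."""
    exponent = 2.0 / (m - 1.0)
    u = [[0.0] * len(data) for _ in centers]
    for k, x in enumerate(data):
        d = [math.dist(x, v) for v in centers]
        if 0.0 in d:
            # The point coincides with one or more centres: share full membership among them.
            hits = [i for i, di in enumerate(d) if di == 0.0]
            for i in hits:
                u[i][k] = 1.0 / len(hits)
            continue
        for i in range(len(centers)):
            # Standard FCM update: inverse of summed distance ratios.
            u[i][k] = 1.0 / sum((d[i] / dj) ** exponent for dj in d)
    return u
```

Each column of the returned matrix sums to one, and a point lying closer to a center receives a larger membership grade in that cluster.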
Gustafson-Kessel Clustering
The GK algorithm is a powerful fuzzy clustering technique based on the Mahalanobis distance that is applicable in various fields such as data analysis, pattern recognition, image processing, and control. Its main feature is the local adaptation of the distance metric to the shape of each cluster, achieved by estimating the cluster covariance matrix and adapting the distance-inducing matrix accordingly. 59
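For reference, the local adaptation described above is usually written as follows in the standard GK formulation. The notation is assumed here rather than taken from this article: N data points x_k of dimension n, cluster centers v_i, membership grades u_ik, fuzzifier m, and a cluster volume constant ρ_i (commonly set to 1).

```latex
% Fuzzy covariance matrix of cluster i
F_i = \frac{\sum_{k=1}^{N} u_{ik}^{m}\,(x_k - v_i)(x_k - v_i)^{T}}{\sum_{k=1}^{N} u_{ik}^{m}}

% Distance-inducing matrix adapted to the cluster shape
A_i = \left[\rho_i \det(F_i)\right]^{1/n} F_i^{-1}

% Squared distance of point x_k to the center of cluster i
D_{ik}^{2} = (x_k - v_i)^{T} A_i\,(x_k - v_i)
```

Fixing the determinant of A_i through ρ_i lets each cluster change its shape while its volume stays constant.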
Gath-Geva Clustering
The GG clustering algorithm, also known as Gaussian mixture decomposition, is similar to the FCM method; the difference lies in how the distance is calculated. A Gaussian distance is used instead of the Euclidean distance, so clusters are not restricted to a fixed shape and can vary in size. 60
Expectation Maximization Clustering
EM is one of the most common algorithms used for density estimation of data points in an unsupervised setting. The algorithm relies on finding the maximum likelihood estimates of parameters when the data model depends on certain latent variables. 61,62
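To make the idea of latent variables concrete, the sketch below runs EM on a one-dimensional mixture of Gaussians, where the unknown component that generated each point plays the role of the latent variable. This is a generic textbook example under assumed names, initial values, and toy data; it is not the configuration used in this study.

```python
import math


def normal_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, mus, variances, weights, n_iter=50):
    """Fit a 1D Gaussian mixture by alternating E and M steps."""
    mus, variances, weights = list(mus), list(variances), list(weights)
    n, c = len(data), len(mus)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, mus[j], variances[j]) for j in range(c)]
            total = sum(joint)
            resp.append([g / total for g in joint])
        # M-step: maximum likelihood re-estimation given the responsibilities.
        for j in range(c):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsing component
    return mus, variances, weights


# Toy data (hypothetical): two groups centered near 1 and 5.
sample = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
mus, variances, weights = em_gmm_1d(sample, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
print(mus, variances, weights)
```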
Results
To determine the appropriate model for predicting tumor motion, the constructed models are assessed using the root mean square error (RMSE) as the error metric. The RMSE (also called the root mean square deviation) is a frequently used measure of the difference between the values predicted by a model and the values actually observed in the system being modelled. The RMSE of a model prediction with respect to the estimated variable Xmodel is defined as the square root of the mean squared error:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_{\mathrm{obs},i} - X_{\mathrm{model},i}\right)^{2}}$$
where Xmodel,i is the output of the proposed model at time i, Xobs,i is the real tumor position observed at time i by the stereoscopic X-ray imaging system, and n is the number of observed points, which ranges from 68 to 154.
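The RMSE definition can be sketched in a few lines. The function name and the sample values are assumptions for illustration; the sample positions are not patient data.

```python
import math


def rmse(predicted, observed):
    """Square root of the mean squared difference between two equal-length series."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(observed))


# Hypothetical positions in mm along one axis.
x_model = [1.0, 2.0, 3.0]
x_obs = [1.0, 2.0, 5.0]
print(rmse(x_model, x_obs))
```

For a three-dimensional error, the same function can be applied to the Euclidean distances between predicted and observed positions, compared against a series of zeros.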
Figure 3 shows the calculated RMSE of the 7 conventional ANFIS models, each working with a single data clustering algorithm (nonsynthetic mode), and of the modified synthetic ANFIS, against the Cyberknife modeler over 20 patients in the control group (Figure 3A) and the worst group (Figure 3B).

Figure 3. A, Root mean square error (RMSE) versus patient number in the control group. B, RMSE versus patient number in the worst group.
As seen in Figure 3, although the nonsynthetic (conventional) ANFIS models, except ANFIS-Kmeans in the control group, performed more accurately than Cyberknife, their error reductions were not remarkable. In contrast, with the synthetic ANFIS the error was reduced to 1.1 mm in control cases and 4.1 mm in worst cases. For the Cyberknife modeler, the corresponding values are 2.1 mm for the control group and 10.7 mm for the worst group.
As mentioned before, to reduce processing complexity and increase model speed, 2 reduced variants, namely the triple and double synthetic ANFIS models, were also examined. These variants were built on the predominant data clustering algorithms, which were selected according to how often each one was chosen over the whole data set during operation of the 7-fold synthetic ANFIS. Figure 4 shows the average participation percentage of each clustering strategy used in our synthetic ANFIS over the 10 control and 10 worst cases, respectively. These results were used to determine the most effective clustering methods for the synthetic ANFIS, which is one of the questions addressed in this study. As seen, the predominant algorithms among the 7 applied methods are mainly EM, GG, and subtractive clustering. Moreover, the predominant clustering method implemented in the proposed models is listed in Table 1 for each patient. For the double synthetic ANFIS, the EM and subtractive algorithms were selected because subtractive clustering performs better than the GG algorithm on data sets with a small range.

Figure 4. A, Average participation percentage over all 10 patients of each clustering algorithm for the synthetic adaptive neuro-fuzzy inference system (ANFIS) configuration in the control group. B, Average participation percentage over all 10 patients of each clustering algorithm for the synthetic ANFIS configuration in the worst group.
Table 1. The Predominant Clustering Model for Each Patient in Both Control and Worst Groups.
Abbreviations: EM, expectation maximization; GG, Gath Geva; FCM, fuzzy C-means; GK, Gustafson-Kessel.
The RMSE resulting from the triple and double synthetic ANFIS models is reported in Table 2 for the 10 patients of the control and worst groups, respectively. The results are compared with the RMSE of the 7-fold synthetic ANFIS and of the Cyberknife modeler.
Table 2. Three-Dimensional RMSE of Cyberknife Modeler, 7-Fold Synthetic ANFIS, Triple Synthetic ANFIS, and Double Synthetic ANFIS for Each Patient and Average RMSE for All Models in Both Control and Worst Groups.
Abbreviations: ANFIS, adaptive neuro-fuzzy inference system; RMSE, root mean square error.
As a straightforward comparison with our previous study, 39 the improvement percentage of the currently available models is presented in Table 3. As seen in this table, in our previous study the error of the proposed model was 26.4% higher than that of Cyberknife in the control group, whereas in the worst group it outperformed Cyberknife with a 19.7% error reduction. In the current study, the 7-fold synthetic ANFIS model operated more accurately than the benchmark, with 49.3% and 61.2% error reduction in the control and worst groups, respectively.
Table 3. Improvement Percentage of Each Nonsynthetic ANFIS Model, 7-Fold Synthetic ANFIS, Triple Synthetic ANFIS, and also Previous Study Double Synthetic ANFIS in Comparison With Cyberknife Modeler for Both Control and Worst Groups.
Abbreviations: ANFIS, adaptive neuro-fuzzy inference system; EM, expectation maximization; GG, Gath Geva; GK, Gustafson-Kessel; FCM, fuzzy C-means.
Moreover, as a statistical analysis, the paired Student t test was used for comparison; in this test, the P value indicates whether the difference between models is statistically significant, and the effect size shows the magnitude of the difference in model performance. For interpreting effect sizes, Jacob Cohen 63 outlined a number of criteria for gauging small, medium, and large effect sizes in different metrics as follows:
r effects: small ≥ .1, medium ≥ .3, large ≥ .5
According to the Student t test, the mean values and standard deviations (SDs) of our proposed synthetic ANFIS model versus Cyberknife are 2.3 mm (SD = 0.9) and 6.1 mm (SD = 2.1), respectively, for worst cases, with P value <.0002. The effect size was 0.8, which indicates a substantially large effect. For control cases, the corresponding values for synthetic ANFIS versus Cyberknife are 0.6 mm (SD = 0.3) and 1.2 mm (SD = 0.7), respectively, with P value <.028; in this group the effect size is 0.6. The statistical analysis thus indicates that the 7-fold synthetic ANFIS model was more robust and accurate at tumor motion tracking.
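The article does not state how the effect size was derived from the test. One common conversion, assumed in the sketch below, obtains r from the paired t statistic and its degrees of freedom as r = sqrt(t² / (t² + df)). The function name and the error values are hypothetical and are not the patient results reported here.

```python
import math
from statistics import mean, stdev


def paired_t_and_r(errors_a, errors_b):
    """Paired t statistic, degrees of freedom, and effect size r for two matched series."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    df = n - 1
    r = math.sqrt(t * t / (t * t + df))  # assumed conversion from t to r
    return t, df, r


# Hypothetical per-patient errors in mm for two models on the same patients.
benchmark = [6.0, 5.0, 7.0, 8.0]
proposed = [2.0, 3.0, 2.0, 3.0]
t, df, r = paired_t_and_r(benchmark, proposed)
print(t, df, r)
```

The resulting r can then be read against the thresholds listed above (small ≥ .1, medium ≥ .3, large ≥ .5).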
As a visual assessment of model performance, a short fragment of the 3D tumor traces during breathing for a typical patient of the control group is shown in Figure 5. The black star symbols show the actual location of the tumor, while the red and blue lines represent the superior–inferior, left–right, and anterior–posterior positions of the tumor (mm) estimated by the 7-fold synthetic ANFIS and the Cyberknife modeler, respectively.

Figure 5. A fragment of tumor motion (mm) versus treatment time (seconds); black star symbols, red lines, and blue lines show the actual tumor position and the tumor position (mm) estimated by the 7-fold synthetic adaptive neuro-fuzzy inference system (ANFIS) and by the Cyberknife model, respectively. A, Superior–inferior (SI) position. B, Left–right (LR) position. C, Anterior–posterior (AP) position.
As seen in Figure 5, the tumor motion predicted by the Cyberknife modeler and by the 7-fold synthetic ANFIS are similar to each other. However, the criterion for the precision of a model is the gap between its prediction and the X-ray imaging data, which are taken as the real position of the tumor; according to this figure, the proposed model is more accurate than the Cyberknife modeler.
Discussion
In this study, an adaptive neuro-fuzzy–based correlation model was proposed to infer tumor position information as a function of time using the Cyberknife Synchrony system database. We investigated some drawbacks of our previous study 39 in order to introduce an optimized ANFIS model with more precise performance. As the first step, the consequent part of our ANFIS was reconstructed by combining constraint estimation with the least square method to avoid overparameterization; this combination was lacking in the previous version of our conventional ANFIS. 39 Seven available and applicable data clustering algorithms were assessed at partitioning the internal–external motion database for membership function generation, which is one of the main parts of the ANFIS structure. Since each clustering algorithm has its inherent properties at data grouping, the impact of each algorithm on the ANFIS model performance was considered in a comparative manner. In the next step, a modified synthetic ANFIS constructed from all 7 clustering algorithms was proposed. The aim was to exploit the robustness of each clustering strategy automatically and adaptively, since our database is highly variable due to the breathing motion of different patients. Therefore, an automatic model selectivity option was added to the final modified synthetic ANFIS model to adaptively choose, for each breathing motion pattern, the best neuro-fuzzy inference system configured by a proper clustering algorithm. With this strategy, the model performance in tumor motion tracking could be highly improved compared with conventional correlation models. The 7-fold synthetic ANFIS model is first configured using the training data set at the pretreatment step, where 7 ANFIS models based on the 7 data clustering methods are built in parallel. Then, the model selectivity subroutine appoints the best ANFIS model, built with a specific clustering method, for the given breathing motion pattern of the patient. 
The selection is made competitively among the 7 models based on their accuracy as measured by the RMSE. The synthetic model also automatically selects the best ANFIS at each update step during the treatment of each patient, providing flexibility that was also lacking in our previous work.
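The competitive selection can be illustrated with a minimal sketch in which each candidate model is scored by its RMSE on the available data and the one with the lowest error is retained. The candidate models below are simple stand-in functions with hypothetical names; they are not the ANFIS models of this study, and the data are invented for the example.

```python
import math


def rmse(predicted, observed):
    """Square root of the mean squared difference between two equal-length series."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


def select_best_model(models, external, internal):
    """Score every candidate on the same data and return the one with minimum RMSE."""
    scores = {
        name: rmse([predict(x) for x in external], internal)
        for name, predict in models.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Stand-in candidates (hypothetical) mapping external marker motion to tumor position.
candidates = {
    "model_A": lambda x: 2.0 * x,
    "model_B": lambda x: 2.0 * x + 0.5,
    "model_C": lambda x: 1.5 * x,
}
external_motion = [1.0, 2.0, 3.0]
tumor_position = [2.0, 4.0, 6.0]
best, scores = select_best_model(candidates, external_motion, tumor_position)
print(best, scores)
```

Calling the same function again whenever the training data are refreshed gives the adaptive behavior described above, since a different candidate may win after the breathing pattern changes.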
As seen in Figure 3 and Table 2, among the nonsynthetic conventional ANFIS models in the control group, ANFIS-EM and ANFIS-GG outperformed the others in tumor tracking, the improvements of ANFIS-Kmedoid, ANFIS-subtractive, and ANFIS-GK were trivial, and ANFIS-Kmeans was inferior to the Cyberknife modeler. In contrast, the synthetic ANFIS reduced the targeting error significantly, by 49.3% versus Cyberknife, while the same value in the previous study was 26.4%. In the worst group, which includes the patients with the largest targeting errors, all nonsynthetic ANFIS models performed more accurately than the Cyberknife modeler. In this group, ANFIS models based on fuzzy clustering gave better estimates than those based on hard clustering. For the 7-fold synthetic ANFIS, the error reduction improved to around 61.2% in this study, while this value was 19.7% in the previous study.
Among the hard partitioning-based ANFIS models, ANFIS-subtractive, and among the fuzzy ones, ANFIS-EM and ANFIS-GG, were recognized as the superior models for both groups, with ANFIS-EM proving to be the predominant one, having the maximum usage and the least overfitting. In general, hard partitioning algorithms seem to be more accurate for data sets with a small range and a high distribution density. Conversely, for data sets with a broad range, fuzzy partitioning algorithms performed better. Although the synthetic ANFIS based on all 7 algorithms tracks more accurately, it is computationally demanding for in-room computing systems. As a solution, this study suggested simplified versions of the synthetic ANFIS model: one based on the 3 predominant algorithms (ANFIS-EM, ANFIS-GG, and ANFIS-subtractive) and one based on the 2 predominant algorithms (ANFIS-EM and ANFIS-subtractive).
At the pretreatment step, where each correlation model must be built by determining the required parameters, time may not be considered a challenging factor because no real treatment is taking place. For a typical case from the control group with 9 training data points, the 7-fold, triple, and double synthetic ANFIS models were configured in 4.3, 2.2, and 1.7 seconds, respectively. Once the model is built, it tracks motion during treatment in real time, using only the motion of the external markers as input to estimate the tumor position as output. For a typical case, the optical tracking system samples the external marker motion every 0.0385 seconds. By comparison, the model execution time with the different clustering algorithms chosen by the model selectivity option ranges from 0.0031 to 0.0067 seconds during treatment, which is less than the time required for external data gathering. As numerically investigated in comparison with the double and triple synthetic models, the 7-fold synthetic ANFIS model is able to track tumor motion in real time during treatment. Concerning the model update process, it should be noted that during the treatment course, once the total number of training data reaches 4 × (number of training data at the pretreatment step), the oldest data pair is automatically removed whenever a new data pair arrives. In this way, the correlation model is built using a fresh data set that includes recent information on the synchronized external–internal motion. For our typical case, each update step took 3.9, 1.7, and 1.3 seconds for the 7-fold, triple, and double synthetic ANFIS models, respectively. The reduced models thus offer a remarkable improvement in model configuration time during the update. It should be taken into account that the run times were measured on a 64-bit Windows machine with a 1.73 GHz Intel(R) Core(TM) processor and 6 GB RAM. In the control group, the error reduction reached 34.1% with the triple synthetic model and 32.7% with the double synthetic model. 
Considering worst cases, this value was 51.3% for the triple model and 47.0% for the double model under the same conditions. Based on the quantitative run time assessment, a proper compromise can be struck between model output accuracy and run time. However, the computational time issue can also be addressed by means of powerful dedicated hardware.
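The update rule and the real-time condition discussed above can be sketched as follows. The window is capped at 4 times the pretreatment training size, so that each new data pair displaces the oldest one once the cap is reached, and the execution time is compared with the 0.0385 second sampling period. The function names and the data pairs are hypothetical.

```python
from collections import deque


def make_training_window(pretreatment_pairs, growth_factor=4):
    """Window of (external, internal) pairs capped at growth_factor times its initial size."""
    pairs = list(pretreatment_pairs)
    return deque(pairs, maxlen=growth_factor * len(pairs))


def meets_real_time(execution_time_s, sampling_period_s):
    """True when one model evaluation finishes before the next external sample arrives."""
    return execution_time_s < sampling_period_s


# Hypothetical pretreatment set of 9 pairs, followed by 40 pairs acquired during treatment.
window = make_training_window([(float(i), 0.5 * i) for i in range(9)])
for i in range(9, 49):
    window.append((float(i), 0.5 * i))  # the oldest pair drops out once the cap is reached

print(len(window), window[0], window[-1])
print(meets_real_time(0.0067, 0.0385))  # slowest reported execution time vs sampling period
```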
Conclusion
In this study, we compared 7 ANFIS-based correlation models, each using a different data clustering algorithm, to infer tumor motion on the same database. This allowed a quantitative comparison of their performance in tumor motion tracking. Models were selected at an increasing level of complexity based on their performance on the most recent respiratory pattern. Finally, we proposed an optimum synthetic ANFIS model that includes all 7 clustering methods at the same time in order to use their strengths and compensate for their weaknesses. As the analyzed results illustrate, a remarkable improvement was achieved by the modified synthetic ANFIS, which adapts individually to each breathing pattern. Moreover, the structure of the ANFIS model was improved to keep the model away from overparameterization and overtraining, which provides an effective improvement in error reduction with respect to the current state-of-the-art models and also our previous model. According to the quantitative assessment of the models’ run time, the proposed 7-fold synthetic ANFIS can predict tumor motion in real time; in addition, 2 simpler versions of the synthetic ANFIS model were proposed to track tumor motion faster while largely preserving accuracy.
Footnotes
Acknowledgments
The authors acknowledge Sonja Dieterich for providing access to the clinical database and Hadi Hosseini, whose comments improved the quality of this article.
Authors’ Note
All authors specify that this manuscript has not been published in whole or in part nor is it being considered for publication elsewhere.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
