Abstract
Objective
To determine if preoperative velopharyngeal closure percentage (VCP) is predictive of successful Furlow double opposing Z-plasty (DOZP) and subsequently determine the optimal velopharyngeal closure cutoff for successful DOZP.
Design
Retrospective study
Setting
Tertiary academic center
Patients
110 patients with repaired cleft lip and palate having hypernasality treated with DOZP
Interventions
Speech videofluoroscopy images were used to obtain the preoperative VCP and other measurements.
Main Outcome Measures
Changes in hypernasality scores using the Cleft Audit Protocol for Speech-Augmented-Americleft Modification (CAPS-A-AM) rating system were used as the primary outcome measure. A successful DOZP was defined as a postoperative hypernasality score of ≤ 1 or an improvement of 2 or more points from baseline. A receiver operating characteristic (ROC) curve was calculated to determine the preoperative VCP cutoff.
Results
There were 110 patients who underwent DOZP for treatment of velopharyngeal insufficiency. Of these patients, 94 (85%) had successful surgery as determined by their postoperative CAPS-A-AM hypernasality score. Preoperative VCP was a statistically significant predictor of successful DOZP (P < .0001). The ROC curve with Youden index (J) determined a cutoff (c*) of 55% preoperative VCP or greater to optimize surgical success rate. Grouping by preoperative VCP showed that surgical success increases directly with preoperative VCP, and that even patients with low VCP had a success rate above 50% in reducing hypernasality scores.
Conclusions
Preoperative VCP was significantly associated with improved hypernasality ratings postoperatively. A preoperative VCP of ≥55% may be used to help predict success of Furlow palatoplasty treatment. Patients with lower VCP can still benefit from secondary DOZP.
Introduction
Following primary palatoplasty, up to 30% of patients with cleft palate may have velopharyngeal insufficiency (VPI) requiring secondary surgical correction.1–4 VPI occurs when the velum fails to adequately close against the posterior nasopharynx, thus allowing airflow and sound energy to escape into the nose during speech. Dysfunction of this valve system results in hypernasality, audible nasal air emission, or both. Hypernasality is defined as excessive sound energy within the nasal cavity during the production of vowels and voiced oral consonants.5 Audible nasal air emission is defined as a perceptible escape of air through the nose during the production of high-pressure consonants.5 Furlow double opposing Z-plasty (DOZP)6 is a surgical technique commonly used to treat secondary VPI.7,8 The DOZP aims to reorient the levator veli palatini muscles transversely in the posterior palate and to lengthen the soft palate, achieving velopharyngeal competence with a lower risk of obstructive complications compared to pharyngeal flap or sphincter pharyngoplasty.3,9,10
DOZP has been shown to be effective at correcting hypernasality in many patients,7,11,12 but its efficacy is dependent on several factors. Previous studies have identified velopharyngeal gap size, palatal length, and velar length to be predictors of Furlow palatoplasty outcome.7,11 Other studies investigated dynamic speech factors such as velopharyngeal closure ratio during speech, velar movement, and lateral pharyngeal wall movement as indicators of DOZP outcomes.4,13,14 The observation that small preoperative velopharyngeal gap size and large velopharyngeal closure ratios tend to result in high rates of velopharyngeal competence following Furlow palatoplasty has long been supported.4,7,13–16
Further investigation is needed to identify the threshold of what constitutes a small gap or large closure ratio, suggestive of a successful outcome, and how often secondary DOZP is successful at varying preoperative gap sizes and closure ratios. In this study, we aim to investigate the threshold of preoperative velopharyngeal closure that allows the strongest prediction of successful DOZP and to describe DOZP outcomes at different preoperative velopharyngeal closure thresholds. In addition, we plan to assess other patient variables that may influence outcome, including levator knee position, posterior pharyngeal wall shape, soft palate length, and age at surgery.
Materials and Methods
Patients
This retrospective study was approved by our University's Institutional Review Board, IRB STU-2021-0854. Patients who were born with a cleft palate, either isolated or as part of cleft lip and palate (CLP), and who had VPI following primary palatoplasty treated with revision Furlow DOZP between 2012 and 2021 were included. Four surgeons performed the DOZP surgery at a single center. Patients with syndromic diagnoses known to affect velopharyngeal function or non-cleft related VPI were excluded. Patients with multiple procedures at the time of secondary Furlow, including fistula repairs or a combination of DOZP with bilateral buccal flaps, were also excluded, as these other maneuvers could have an independent or confounding effect on speech outcome. Patients who did not have preoperative speech videofluoroscopy imaging or a postoperative speech assessment between three and 24 months postoperatively were also excluded from the study. Patients’ medical records were retrospectively reviewed to identify demographic data, speech assessments, surgical outcome and complications, and speech imaging data.
Surgical Methods
DOZP revision palatoplasty was the preferred technique for surgery for VPI when surgeons considered that there was a potential for this surgery to improve or resolve the patient's VPI. Exceptions to this were if the palate on speech videofluoroscopy had truly no movement on pressure sounds, or if the palate was so short that even with the Z-plasty lengthening it could not be expected to reach the posterior pharyngeal wall. The variants of the DOZP performed by surgeons in this study were very similar. The Z-plasty was designed from the hard-soft palate junction to the base of the uvula. The lateral extents of the Z-plasty were marked at the pterygoid hamulus and these points were connected with almost straight lines, incorporating a slight curve to avoid an overly narrow flap tip.
Dissection was undertaken according to surgeon preference, with one surgeon favoring the Bovie needle tip and Freer elevator, two favoring scissor dissection, and one favoring dissection with the 15 blade using the operative microscope. On the left side, an oral myomucosal flap and a nasal mucosal flap were raised; on the right side, an oral mucosal flap (together with the mucous glands) and a myomucosal flap were raised. No lateral relaxing incisions were used. The muscle repair was also undertaken according to surgeon preference, with three surgeons overlapping muscle without a suture-based repair and one surgeon partially mobilizing the muscle from the adjacent mucosa and undertaking an end-to-end repair of levator veli palatini. If the uvula demonstrated any dehiscence, it was split and a uvuloplasty performed. If the uvula was found to be intact, one of the surgeons would leave it intact, and the others would routinely split the uvula and undertake uvuloplasty.
Perceptual Speech Assessment
As part of the regular standard of care, participants’ speech was assessed perceptually by one of our three experienced craniofacial speech-language pathologists (SLPs) using the Cleft Audit Protocol for Speech-Augmented-Americleft Modification (CAPS-A-AM) speech rating system.17 For older children, the American English Sentence Sample, along with a brief conversational sample, was obtained.18 For younger or less verbal children, a conversational sample along with word or phrase imitation tasks was used. Ratings were conducted by three SLPs who participate in regular consensus listening sessions using the CAPS-A-AM. Hypernasality was used as the primary outcome measure to assess the success of the surgery.19 For the CAPS-A-AM, a hypernasality rating of zero indicates nasality that is normal for the region, while a rating of one (borderline/minimal) suggests a minimal or inconsistent increase in nasal resonance. A rating of two (mild) implies hypernasality that is evident on vowels with a high tongue posture, while a three (moderate) indicates hypernasality that is perceived across all vowels. A rating of four (severe) signifies that hypernasality is evident in voiced consonants as well as all vowels.17 Preoperative ratings were compared to postoperative ratings acquired between three and 24 months after surgery. For patients with multiple postoperative ratings between three and 24 months after surgery, the latest speech sample rating was used. The rationale for this time interval and protocol was to capture patients who demonstrate speech improvement at the typical time point of 6 months postoperatively, as well as those who continue to improve beyond this time point and who were reassessed at 12 or even 18 months postoperatively. Surgical success was defined as either a postoperative hypernasality rating of zero or one, or an improvement from the preoperative rating by a value of two or more. Patients with speech ratings that did not meet these criteria were considered to have had an unsuccessful outcome. Of note, some speech samples were obtained before the CAPS-A-AM speech protocol was adopted. In those cases, speech recordings were reviewed and rated according to the CAPS-A-AM ratings.19
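The success criterion described above can be expressed as a short sketch. The function name and the example ratings are hypothetical and are used only for illustration; the rule itself (postoperative rating of zero or one, or an improvement of two or more) is taken from the definition in the text.

```python
def is_successful(pre_rating: int, post_rating: int) -> bool:
    """Apply the success rule to CAPS-A-AM hypernasality ratings (0 = normal, 4 = severe)."""
    for rating in (pre_rating, post_rating):
        if rating not in range(5):
            raise ValueError("CAPS-A-AM hypernasality ratings range from 0 to 4")
    improvement = pre_rating - post_rating
    # Success: postoperative rating of 0 or 1, OR an improvement of 2 or more.
    return post_rating <= 1 or improvement >= 2


# Hypothetical (pre, post) rating pairs, not patient data.
examples = [(3, 1), (4, 2), (3, 2), (2, 2)]
results = [is_successful(pre, post) for pre, post in examples]
```

Under this rule, a patient who moves from 4 to 2 counts as a success despite residual mild hypernasality, whereas a patient who moves from 3 to 2 does not.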
Speech Videofluoroscopy Measurements
Speech videofluoroscopy, which allows for dynamic visualization of velopharyngeal movements from the lateral and anteroposterior views, is obtained as part of the preoperative evaluation process for patients with VPI at our institution. Briefly, successive x-ray images of the patient producing a carefully curated speech sample that has been screened for compensatory speech errors are captured, similar to methods previously described.20,21 The lateral fluoroscopy videos were analyzed frame by frame to obtain two representative frames: the palate at rest and the palate at maximum elevation during phonation of an oral pressure consonant without interference from the tongue or during a swallow (Figure 1). Due to magnification and non-linear X-ray sources, measurements on videofluoroscopy frames do not represent real-scale length measurements; however, the relative measurements of structures are accurate and reliable.21 The time between speech videofluoroscopy acquisition and VPI surgery was less than 1 year for 92 (84%) patients. The maximum time was 3.9 years.

Figure 1. Palatal measurements using videofluoroscopy in the sagittal view: (a) distance from levator knee to posterior pharyngeal wall at rest, (a’) distance from levator knee to posterior pharyngeal wall at maximum closure, (b) length of soft palate with (b’) position of levator knee, (c) length from the back molar to the porion, and (c’) length of the posterior pharyngeal wall to porion.
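Three features were recorded from these two frames: i) velopharyngeal closure percentage (VCP), ii) levator knee position, and iii) posterior pharyngeal wall shape. The VCP was calculated as the difference between the activated and resting velar position as a percentage of the resting gap size using the formula

$$\mathrm{VCP} = \frac{a - a'}{a} \times 100\%$$

where a is the distance from the levator knee to the posterior pharyngeal wall at rest and a’ is the same distance at maximum closure (Figure 1).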
ImageJ was used to calculate measurements for VCP, levator knee position, and posterior pharyngeal wall shape. A random number generator was used to select 10 patients from the cohort to calculate inter-rater reliability. The three features were measured by two raters for these 10 patients, and the intraclass correlation coefficient (ICC) between the raters’ measurements was calculated. The remaining measurements were obtained by a single user.
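As a minimal sketch of the VCP definition given above (closure expressed as a percentage of the resting gap), the calculation can be written as follows. The function name and the pixel values are hypothetical; the inputs correspond to the resting and maximum-closure distances from the levator knee to the posterior pharyngeal wall.

```python
def closure_percentage(rest_gap: float, closure_gap: float) -> float:
    """Return VCP (%) from the resting gap and the gap at maximum closure.

    Both distances must be in the same (arbitrary) unit, e.g. image pixels.
    """
    if rest_gap <= 0:
        raise ValueError("resting gap must be positive")
    if closure_gap < 0:
        raise ValueError("gap at maximum closure cannot be negative")
    return (rest_gap - closure_gap) / rest_gap * 100.0


# Hypothetical pixel measurements, not patient data.
vcp_pixels = closure_percentage(rest_gap=40.0, closure_gap=10.0)
# The same frame measured at a different magnification (all lengths x 2.5).
vcp_scaled = closure_percentage(rest_gap=100.0, closure_gap=25.0)
```

Because VCP is a ratio of two lengths from the same image, it is unchanged by uniform magnification, which is why relative measurements remain usable even though videofluoroscopy frames are not to real scale.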
Statistical Analysis
A paired t-test was used to compare preoperative and postoperative hypernasality ratings on the CAPS-A-AM. Student t-tests assuming unequal variances were performed on continuous variables: age at surgery (years), time between primary palatoplasty and Furlow (years), Levator Knee Position (%), Posterior Pharyngeal Wall Shape (%), length of soft palate (mm), and VCP (%) against Furlow outcome (successful vs. unsuccessful). Patients missing the date of primary palatoplasty in their medical records were excluded from the analysis of time of primary surgery to revision DOZP. Chi-square statistics were performed on categorical variables: type of cleft palate (bilateral, unilateral, isolated), patient sex (female, male), postoperative complications (presence, absence), and Passavant's Ridge (presence, absence) against Furlow outcome.
Stepwise logistic regression analyses were performed to analyze the predictors of Furlow DOZP outcome. Continuous and categorical variables listed above were analyzed as possible predictors. A Pearson correlation coefficient was used to evaluate the relationship between VCP and age at Furlow, time from primary palatoplasty to Furlow, length of the soft palate, Levator Knee Position, and Posterior Pharyngeal Wall Shape. Interactions among these variables that met the significance level of ≤ 0.05 were also entered as predictors in the logistic regression. An odds ratio with a 95% confidence interval was reported, and the Hosmer and Lemeshow test was used to assess the goodness of fit of the model.
A receiver operating characteristic (ROC) curve, which shows the performance of a binary classifier as its threshold is varied, was generated. In this study, the ROC curve plots the true positive rate against the false positive rate for successful vs. unsuccessful outcomes as the preoperative VCP cutoff is varied from 0% to 100%. From the ROC curve, the Youden Index (J), which summarizes sensitivity and specificity in a single value, was calculated at each cutoff using the formula

$$J = \text{sensitivity} + \text{specificity} - 1$$

and the cutoff (c*) was taken as the VCP value at which J is maximal.
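The cutoff search can be illustrated with a short sketch. It assumes the standard definition of the Youden Index (sensitivity + specificity − 1) and treats a VCP at or above the cutoff as predicting success; the function names and the toy data are hypothetical and are not the study data or the SAS procedure used by the authors.

```python
def sensitivity_specificity(vcps, successes, cutoff):
    """Treat VCP >= cutoff as predicting success; return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for vcp, success in zip(vcps, successes):
        predicted_success = vcp >= cutoff
        if success and predicted_success:
            tp += 1
        elif success:
            fn += 1
        elif predicted_success:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one successful and one unsuccessful outcome")
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(vcps, successes, candidates):
    """Return (cutoff, J) for the candidate cutoff with the largest Youden Index."""
    best_cutoff, best_j = None, float("-inf")
    for cutoff in candidates:
        sens, spec = sensitivity_specificity(vcps, successes, cutoff)
        j = sens + spec - 1.0
        if j > best_j:  # keep the first cutoff that reaches the maximum
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical toy data: 8 successful and 4 unsuccessful outcomes.
toy_vcps = [95, 90, 85, 80, 70, 60, 50, 40, 65, 45, 30, 75]
toy_success = [True] * 8 + [False] * 4
best_cutoff, best_j = youden_cutoff(toy_vcps, toy_success, range(0, 101, 5))
```

Scanning cutoffs in 5-point steps is an arbitrary choice for the sketch; a full ROC analysis would typically evaluate every observed VCP value as a candidate.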
For all statistics, a P-value of ≤ .05 was considered statistically significant. All statistical analyses were performed using SAS software version 9.4 (SAS Institute Inc).
Results
Patients
There were 154 patients with repaired cleft palate and subsequent VPI who underwent revision Furlow palatoplasty. After exclusion as described in the Methods section, a total of 110 patients were included in this study. The ratio of females to males was 53:57. The occurrence of bilateral CLP, unilateral CLP, and isolated cleft palate was 25%, 45%, and 30%, respectively, which is likely related to increased risk of VPI in patients with bilateral cleft lip/palate.1,22,23 The median age at Furlow operation was 7.4 (range 3-19.5) years, with a median time between primary palatoplasty and Furlow palatoplasty of 5.8 (range 1.8–17.5) years. The Furlow palatoplasty operations performed resulted in 94 successful and 16 unsuccessful outcomes, resulting in an 85% overall success rate among our surgeons by the VPI resolution criteria defined in this study. Complications were minimal: minor wound complications including dehiscence and delayed healing occurred in 14 (12.7%) patients. All these healed spontaneously except for two (1.8%) patients who developed a fistula, of which one required re-operation.
Perceptual Speech Assessment
Hypernasality ratings significantly improved after Furlow palatoplasty from an average preoperative CAPS-A-AM score of 3.15 ± 0.77 (mean ± SD), which falls between the moderate and severe hypernasality rating, to a postoperative score of 0.96 ± 1.14, which falls between the normal and mild hypernasality rating (P < .0001).24,25 (Figure 2)

Figure 2. Summary of preoperative and postoperative CAPS-A-AM hypernasality scores for patients who underwent Furlow palatoplasty using the traffic light system. Each row represents a single patient.
Speech Videofluoroscopy Measurements
The ICCs for the 10 patients between the two reviewers were 0.99 for VCP, 0.93 for levator knee position, and 0.93 for posterior pharyngeal wall shape. Kappa or ICC inter-rater agreement above 0.90 is considered excellent, and therefore remaining measurements were taken by a single rater.
Predictors of Furlow DOZP Outcome
Within the successful outcome group, the mean ± standard deviation VCP was 82 ± 18%, compared with 60 ± 25.5% in the unsuccessful group. Student's t-test showed that the VCP had a statistically significant relationship with DOZP outcome (P < .0001) (Table 1). Age at surgery, length of the soft palate, levator knee position, and posterior pharyngeal wall shape were not significantly related to Furlow DOZP outcome (Table 1).
Table 1. Student's t-Test Comparing Predictor Variables with Furlow DOZP Outcome.
Abbreviation: DOZP, double opposing Z-plasty.
*Represents statistical significance.
Chi-square statistics showed no significant relationship between Furlow DOZP outcome and cleft type, patient sex, presence of Passavant's ridge, or presence of postoperative complications (Table 2).
Table 2. Chi-Square Statistics to Compare DOZP Outcome.
Abbreviation: DOZP, double opposing Z-plasty.
Percent Closure as a Predictor of Furlow Outcome
The candidate predictor variables, including all continuous and categorical variables (Table 1 and Table 2), were entered into a stepwise regression as possible predictor variables. VCP was the only variable to meet statistical significance criteria and was therefore entered into the model. It was found that VCP is a strong predictor of DOZP outcome (P < .0001; OR 0.958; CI 0.935-0.982). The Hosmer and Lemeshow goodness-of-fit test resulted in a chi-square value of 6.49 (Pr > chi-square 0.59), indicating a good fit for the model.
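To make the size of the reported odds ratio easier to read, it can be rescaled from a one-unit change to a larger change in VCP. The sketch below assumes that the reported OR of 0.958 (CI 0.935-0.982) is expressed per one percentage point of VCP, which the text does not state explicitly; the function name is hypothetical. Which outcome the model treats as the event is also not stated, so the direction of the rescaled value carries the same caveat.

```python
def rescale_odds_ratio(or_per_unit: float, units: float) -> float:
    """Convert an odds ratio per one unit of a predictor to an odds ratio per `units` units.

    In logistic regression the log-odds are linear in the predictor,
    so the odds ratio for a change of `units` is the per-unit OR raised to that power.
    """
    if or_per_unit <= 0:
        raise ValueError("an odds ratio must be positive")
    return or_per_unit ** units


# Reported values from the text; the per-percentage-point unit is an assumption.
or_10 = rescale_odds_ratio(0.958, 10)
ci_low_10 = rescale_odds_ratio(0.935, 10)
ci_high_10 = rescale_odds_ratio(0.982, 10)
```

Under that assumption, a 10-point difference in preoperative VCP corresponds to an odds ratio of about 0.65 (approximately 0.51 to 0.83).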
Preoperative Velopharyngeal Closure Cutoff Value
The ROC curve had an area under the curve (AUC) of 0.76. The maximum Youden index (J) was 0.48, which determined a cutoff (c*) of 55% preoperative VCP or greater. At this cutoff value, sensitivity is 0.91 and specificity is 0.56 (Figure 3).
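As a quick arithmetic check, assuming the standard definition of the Youden Index, the rounded sensitivity and specificity at this cutoff give:

```latex
J = \text{sensitivity} + \text{specificity} - 1 = 0.91 + 0.56 - 1 = 0.47
```

This agrees with the reported maximum of 0.48 to within the rounding of sensitivity and specificity to two decimal places.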

Figure 3. (A) Sensitivity and specificity vs. velopharyngeal closure percentage (VCP) cutoff values. (B) Receiver operating characteristic curve to determine the Youden index-based preoperative VCP cutoff value of 55%.
Discussion
In this study, we have corroborated previous findings indicating that preoperative VCP is a strong predictor of successful Furlow palatoplasty in patients with cleft for correction of hypernasality,4,7,13,14 with our data showing a statistically significant relationship between VCP and successful vs. unsuccessful outcomes. Grouping by VCP ranges demonstrates that surgical success with Furlow palatoplasty increases directly with preoperative VCP, with an average success rate of 93.8% for patients with preoperative VCP of > 90% (n = 45).
While our study supports the traditional view that DOZP is a good approach for patients with a high VCP, interestingly, patients with below 50% closure still had above a 50% success rate following surgery with DOZP (Table 3). In general, the current literature suggests that DOZP has a lower complication rate and less morbidity compared to the more obstructive speech surgeries (pharyngeal flap and sphincter pharyngoplasty); however, its effectiveness wanes as preoperative VCP lessens.7,26–28 This study suggests that offering DOZP for correction of VPI even with VCP under 50% could be a reasonable first-line treatment, as there is a greater than 50% chance of resolving VPI with the potential benefit of avoiding the negative long-term consequences of obstructive sleep apnea. A technique that has increased in popularity over recent years is the use of buccinator myomucosal flaps. While this technique can help with VPI and also has a low risk for obstructive sleep apnea, it has a higher rate of moderate and severe complications than DOZP.29
Table 3. Success Rate of Furlow Palatoplasty Based on Preoperative VCP.
Abbreviation: VCP, velopharyngeal closure percentage.
Successful outcomes occurred in 85% of our patients overall. This value is in line with previously recorded values of postoperative speech outcomes in similar patient cohorts, which range from 56-89% of patients having adequate speech after Furlow palatoplasty.15,16,30,31 This wide range of reported success could be due to differences in defining and quantifying the speech parameters used for reporting cleft speech outcomes. 32 An analysis of the technique of primary palatoplasty prior to the DOZP revealed that, of patients who underwent primary Bardach 2-flap palatoplasty, 20 of 25 patients (80%) had successful DOZP and, of patients who underwent a primary Sommerlad-style palatoplasty, 16 of 18 patients (89%) had successful DOZP. The other 67 patients had primary surgery by surgeons at other institutions and information regarding operative technique was limited or the technique was neither of the two previously presented techniques. In this group, 58 of 67 (87%) patients had successful DOZP. Additional variables such as surgeon experience, variation in surgical technique, and other patient-specific factors that cannot be controlled may also contribute to the rate of successful DOZP outcomes.
In contrast to other studies that investigated VCP with nasopharyngoscopy,4,13,14 this study used speech videofluoroscopy images from the sagittal plane (Figure 1) to measure preoperative VCP. The VCP is calculated based on the difference from resting phase to maximum pharyngeal wall closure; nasopharyngoscopy, however, uses different planes to obtain this ratio. There may not be a fully equivalent or even a linear relationship between VCP calculated from the sagittal plane of a videofluoroscopy and a similar measurement calculated from nasopharyngoscopy. The sagittal videofluoroscopy viewpoint limited our ability to determine the velopharyngeal closure pattern; however, previous studies that have investigated this variable have not determined any significant finding related to outcome.4,13–15 Speech videofluoroscopy is part of the standard evaluation process for patients with VPI at our institution, and our study's findings demonstrate that VCP calculated by this method can be useful in the selection of patients who may benefit from secondary DOZP.
VCP was the only statistically significant predictor of successful DOZP outcome in this study (Table 1), as in other studies,4,14 with higher average VCPs correlating to increased success rates, as shown in Table 3. Our analysis did not reveal any statistically significant correlation between successful outcomes and patient variables such as cleft type, sex, presence of Passavant's ridge, or postoperative complications, which is consistent with several other studies.4,7,14 As we and several other studies have determined the VCP to be the only reliable predictor of outcome, it can be inferred that patients with high preoperative VCP may be good candidates for Furlow palatoplasty and that patients who do not meet this criterion should be considered for other methods of VPI surgery.4,14
Previous studies have used a static measure of velopharyngeal gap size during maximal closure on phonation to represent the dynamic process of speech.7,15 VCP represents a dynamic measurement by comparing the rest phase to phonation. Prior studies have based their prediction of successful VPI correction into broad categories such as “small” and “large” gaps. This study further defines surgical outcome in terms of preoperative VCP groups (Table 3). Although the VCP may provide more precise and valuable information for the dynamic speech process, the same trend of large closure ratio (ie, small gap) leading to higher success rates in terms of VPI correction holds throughout all studies.
Prior work by Zhang et al. reported a cutoff value of > 52.5% to predict positive Furlow palatoplasty outcome in non-syndromic patients with submucous cleft palate,14 while work by Wu et al. determined the cutoff value of 85% in a similar patient population.4 The present study found a cutoff (c*) of ≥ 55% VCP. Similar to that of Wu et al., our model had only moderate diagnostic ability, and the dataset had a small sample size at lower preoperative VCP values, which limits any definitive conclusions that can be made. Further prospective trials with a larger sample size would be needed to confirm and further optimize this model; however, it may be used as a starting point to stratify patients and determine the best surgical plan on a per-patient basis. The studies by Zhang et al. and Wu et al. were completed using nasopharyngoscopy for assessment rather than speech videofluoroscopy as in the current study. This speaks to the reproducibility of the VCP measurement using alternate VPI exams.
Our model determined a cutoff (c*) of ≥ 55% preoperative VCP using the maximum Youden Index, which occurred where sensitivity is 0.91 and specificity is 0.56 on the ROC curve. The model had an AUC of 0.76, where an AUC of 1 would be a test with perfect sensitivity and specificity; this represents a ‘moderate’ diagnostic ability to predict which patients will have successful Furlow palatoplasty outcomes. The cutoff determined using the Youden Index provides the best tradeoff between sensitivity and specificity when the two are weighted equally. However, this may not be the optimal threshold within a clinical context. It may be worth considering what sensitivity and specificity mean in terms of criteria for surgical outcome. These terms are more commonly applied to a test that screens people for a disease. Here, the ‘disease’ is surgical success following DOZP, and a positive ‘screening test’ is a VCP at or above the cutoff value:
A ‘True Positive’ is a child with VCP above cutoff value and a successful surgery. A ‘False Positive’ is a child with VCP above cutoff value and an unsuccessful surgery. A ‘True Negative’ is a child with VCP below cutoff value and an unsuccessful surgery. A ‘False Negative’ is a child with VCP below cutoff value and a successful surgery.
Using these 4 options, sensitivity and specificity can be evaluated as for any test. Within the clinical context of secondary DOZP for correction of VPI, one may consider weighting for specificity to minimize unsuccessful surgery, and better select patients who could benefit more from a different type of speech surgery. This tradeoff between sensitivity and specificity of the model can be visualized in Figure 3A.
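The four categories can be made concrete with a short sketch. It uses the standard screening-test convention, stated here explicitly as an assumption: a "positive test" is a VCP at or above the cutoff, and the "condition" is a successful surgery. The function name and the five example patients are hypothetical.

```python
from collections import Counter


def classify(vcp: float, success: bool, cutoff: float) -> str:
    """Label one patient as TP, FP, FN, or TN for a given VCP cutoff."""
    test_positive = vcp >= cutoff  # VCP at or above cutoff predicts success
    if test_positive:
        return "TP" if success else "FP"
    return "FN" if success else "TN"


def rates(patients, cutoff):
    """Return (sensitivity, specificity) for a list of (vcp, success) pairs."""
    counts = Counter(classify(vcp, success, cutoff) for vcp, success in patients)
    sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])
    specificity = counts["TN"] / (counts["TN"] + counts["FP"])
    return sensitivity, specificity


# Hypothetical (VCP %, surgery successful) pairs, not patient data.
patients = [(90, True), (60, True), (40, True), (70, False), (30, False)]
labels_at_55 = [classify(vcp, success, 55) for vcp, success in patients]
sens_55, spec_55 = rates(patients, 55)
sens_75, spec_75 = rates(patients, 75)
```

In this toy example, raising the cutoff from 55% to 75% increases specificity from 0.5 to 1.0 while sensitivity falls from about 0.67 to about 0.33, which is the kind of tradeoff involved in weighting a cutoff toward specificity.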
Limitations of this study include its retrospective nature and single-center design. Additionally, a small sample size, especially at lower preoperative VCP, limited the success of our model. In the future, prospective studies can be used to further analyze Furlow palatoplasty success rates in correction of VPI using the preoperative VCP cutoff of ≥ 55%. Many surgical methods to improve speech function are available, and this work will be useful in preoperatively selecting those patients best suited for this method to improve surgical outcomes and overall patient-specific care.
Conclusion
This study utilized a retrospective review of patients with cleft palate who underwent Furlow palatoplasty to confirm that increased VCP correlates with the success of Furlow palatoplasty in the correction of VPI. Furthermore, this study elucidates more specific information about what defines “small” vs. “large” gaps and preoperative closure ratios and suggests a cutoff point of preoperative VCP. Patients with a preoperative VCP of ≥ 55% may be considered good candidates for Furlow palatoplasty, while patients who fall below this cutoff point may risk poor outcomes in terms of VPI resolution and may benefit more from different surgical methods. Patients with low preoperative VCP values, however, still had a greater than 50% success rate. This study found that preoperative VCP directly correlated with DOZP success.
Footnotes
Acknowledgements
The authors would like to acknowledge Dr. Joan Reisch for statistical analysis.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This retrospective study was approved by our University’s Institutional Review Board, IRB STU-2021-0854.
