Abstract
Study Design:
Description and evaluation of a novel surgical training platform.
Objectives:
The purpose of this study was to investigate the face, content, and construct validity of 5 novel surgical training models that simulate freehand and percutaneous (minimally invasive surgery [MIS]) pedicle screw placement.
Methods:
Five spine models were developed by residents: 3 for freehand pedicle screw training (models A-C) and 2 for MIS pedicle screw training (models D and E). Attending spine surgeons evaluated each model and, using a 20-point Likert-type scale, answered survey questions on model face, content, and construct validity. Scores were statistically evaluated and compared using means, standard deviations, and analysis of variance between models and between surgeons.
Results:
Among the freehand models, model C demonstrated the highest overall validity, with mean face (15.67 ± 5.49), content (19.17 ± 0.59), and construct (18.83 ± 0.24) validity scores all higher than those of the other freehand models. For the MIS models, model D had the highest validity scores (face, content, and construct validity of 11.67 ± 3.77, 18.17 ± 2.04, and 17.00 ± 3.46, respectively). The 3 freehand models differed significantly in content validity scores (P = .002) as did the 2 MIS models (P < .001). The testing surgeons’ overall validity scores were significantly different for models A (P = .005) and E (P < .001).
Conclusions:
A 3-dimensional-printed spine model with incorporated bone bleeding and silicone rubber soft tissue was scored as having very high content and construct validity for simulating freehand pedicle screw insertion. These data have informed the further development of several surgical training models that hold great potential as educational adjuncts in surgical training programs.
Introduction
The past decade has seen a paradigm shift in the way neurosurgical residents are trained. Since the implementation of duty-hour restrictions by the Accreditation Council for Graduate Medical Education in 2003, many pressures have arisen to challenge the traditional apprenticeship model of neurosurgical residency. Decreasing case volumes, increasing oversight by attending physicians, and continued advancement in the complexity and diversity of procedures have all coalesced to create an environment that makes it increasingly difficult for residents to acquire all of the skills necessary for independent practice within 7 years. 1,2 In other surgical subspecialties, such as interventional cardiology, some reports have suggested that graduating residents may need additional training before being able to function independently as faculty, and these fields are actively exploring the use of procedural simulators. 3 In the face of these mounting obstacles, there has been a new surge of interest in developing and incorporating surgical simulation into the training of neurosurgical residents. 4 -7 One method of creating surgical simulations that has gained a great deal of interest in recent years is 3-dimensional (3D) printing. 8 -10
Three-dimensional printing—a form of additive manufacturing that was created in the 1980s—uses layer-by-layer deposition of liquid thermoplastic material (fused deposition modeling) or solidification of photosensitive liquid polymer (stereolithography) to create 3D models of structures that have traditionally been confined to 2-dimensional projections on computer screens or in textbooks. It has found widespread and rapid adoption among several surgical fields, ranging from otorhinolaryngology to orthopedic surgery. 8 -12 As a method of creating specimens for surgical simulation, it has many advantages over other simulation modalities. Unlike traditional cadaveric simulations, 3D-printed models are less expensive, easier to maintain, and capable of reproducing a range of pathological conditions. Moreover, 3D-printed models have the advantage of inherent haptic feedback, unlike many computer-based surgical simulators. In neurosurgery specifically, 3D printing has been used for purposes ranging from hollow elastic replicas of aneurysms for aneurysm clipping simulation to custom, patient-specific intracranial electrode arrays. 13,14
Although 3D printing has been received enthusiastically by several neurosurgical subspecialties, the field of spine surgery has experienced a relative deficit of research into the use of this novel modality for simulation and training. 15 -17 The Barrow Biomimetic Spine—a 3D-printed spine model that was developed by residents and shown to mimic many of the biomechanical and fluoroscopic qualities of cadaveric spines—was created to help address this deficiency by providing trainees with an inexpensive, realistic spine model that can be adapted and modified to replicate a range of normal and pathologic anatomy. 18,19 This model has been demonstrated in previous studies to provide high-fidelity gross anatomy, fluoroscopic anatomy, biomechanical performance of pedicle screws, and range of motion for short spinal segments. 18,19 These properties are prerequisites for any high-fidelity training model of pedicle screw insertion, one of the key surgical skills required of any spine surgeon.
In this article, we investigate the face, content, and construct validity of several lumbar versions of the Barrow Biomimetic Spine created to simulate freehand and fluoroscopically guided percutaneous (minimally invasive surgery [MIS]) pedicle screw placement.
Methods
Construction of the Pedicle Screw Training Models
A 0.8-mm-resolution computed tomogram of a normal lumbar spine was converted into a 3D rendering using Materialise Mimics software (Materialise, NV, Leuven, Belgium). The complete L3-L5 vertebral segments were extracted from this rendering and converted into stereolithography (STL) file format (Figure 1). The STL file was imported into the Simplify3D software package (Simplify3D, LLC, Blue Ash, OH, USA).

Figure 1. Three-dimensional rendering of the L3-L5 vertebral bodies taken from a computed tomogram of a patient with normal lumbar spine anatomy. Used with permission from Barrow Neurological Institute, Phoenix, Arizona.
Five different variations of an L3-L5 spinal model were designed for practicing pedicle screw placement: Models A to C were designed for practicing freehand screw placement, and models D and E were designed for fluoroscopically guided percutaneous screw placement. Models A (freehand technique) and D (percutaneous technique) included a 3D-printed synthetic soft tissue material in addition to the 3D-printed L3-L5 vertebral segments. For these models, the STL file of the L3-L5 vertebral bodies was imported into the Autodesk Meshmixer software package (Autodesk, Inc, San Rafael, CA, USA). The L3-L5 vertebral rendering was then digitally placed inside a square block of separate material using the Meshmixer software. For model A, this block of material rose to the level of the transverse processes to simulate a typical surgical exposure during an open posterior approach (Figure 2). For model D, this block of material completely encased the L3-L5 segment to provide a typical surgical exposure for percutaneous screw placement (Video 1). Each of these new components was then imported back into the Simplify3D software package, where printer settings including extruder temperature, build-plate temperature, print speed, and print resolution were optimally set for a dual extrusion print using multiple materials. The vertebral bodies were printed using a thermoplastic material called acrylonitrile butadiene styrene. For models A and D, the synthetic soft tissue was printed using Gel-Lay (MatterHackers, Inc, Foothill Ranch, CA, USA), a 3D-printable filament that is hard when dry, and very soft and pliable (like soft tissue) after being soaked in tap water for 24 hours. Models A and D were printed using a FlashForge Creator Pro 3D printer (FlashForge Corp., Zhejiang, China).

Figure 2. (A) Screenshot from the 3D-printing plan for model A. The L3-L5 vertebral bodies are arranged in the correct anatomical orientation and placed inside a block of material that will simulate the soft tissue exposure of an open posterior approach. (B) Photograph of the 3D-printed model that resulted from the printing plan in Figure 2A. Used with permission from Barrow Neurological Institute, Phoenix, Arizona.
Models B (freehand technique), C (freehand technique), and E (percutaneous technique) were composed of a 3D-printed L3-L5 segment and a synthetic soft tissue material composed of a silicone rubber called Oomoo (Smooth-On, Inc, Macungie, PA, USA). The vertebral bodies for these models were imported into Simplify3D and then printed using acrylonitrile butadiene styrene on a FlashForge Creator Pro. Models B and C then had Oomoo silicone rubber poured around them up to the level of the transverse processes to provide a typical open posterior view of the vertebral anatomy (Figure 3). Model E was completely encased in Oomoo silicone rubber to simulate a percutaneous pedicle screw technique. Model C was further modified by having an intravenous bag of red-colored saline plugged into the vertebral bodies to simulate bony bleeding during freehand pedicle screw insertion (Video 2).

Figure 3. Photograph of model B, which contains the 3D-printed L3-L5 vertebral bodies (green) set into a silicone rubber (pink) up to the level of the transverse processes to show the anatomical landmarks typically visible during an open posterior approach. Used with permission from Barrow Neurological Institute, Phoenix, Arizona.
Overall material costs for the models were not prohibitive, as 3D printers and thermoplastic filaments are becoming increasingly affordable. The total manufacturing cost of each model, in US dollars, was as follows (each figure includes the cost of the 3D printers, the surgical resident and engineering student time spent designing and manufacturing the models, and the materials): model A, $48.00; model B, $62.00; model C, $70.00; model D, $50.00; and model E, $65.00.
Study Participants
Participants in this study were 4 staff neurosurgeons at Barrow Neurological Institute (Phoenix, Arizona, USA). Three surgeons each performed the freehand insertion of pedicle screws in all 3 freehand models (authors SWC, JDT, and UKK). Two surgeons each performed the MIS insertion of pedicle screws in both MIS models (authors JDT and JSU). One surgeon participated in both the freehand and MIS study components (JDT). Institutional review board approval was not sought because no patients were involved in this study.
Procedures
Models were tested for freehand and fluoroscopically guided percutaneous (MIS) pedicle screw placement. The testing surgeons worked on each of the models in random order. For the models simulating freehand pedicle screw insertion, the surgeons were not allowed to use fluoroscopy during screw placement but were provided all other surgical instruments typically used during freehand screw placement, including high-speed surgical drills and suction tubing for the bleeding model (model C). The surgeons were also asked to purposefully breach at least 1 pedicle during screw placement to identify what a breached pedicle felt like with the ball-tipped probe and when placing a breached pedicle screw. Fluoroscopy was used after testing was complete to confirm accurate screw placement in all the freehand models. For the MIS models, surgeons were provided a C-arm fluoroscopy unit and all other surgical instruments needed to perform a percutaneous screw placement. Fluoroscopy was again used to check for screw accuracy after testing was complete. Immediately after testing of each model, participants responded to a survey on their experience, which was used to assess face, content, and construct validities of the tested models. Participants were asked to fill out each survey before moving on to the next model. The surveys were standardized for the freehand and MIS models.
Validity Assessments
Face validity refers to the ability of the model to superficially recreate the scenario meant to be tested. In other words, how well do the models replicate the environment and feel (both visual and tactile) of a surgical procedure in which pedicle screws are being placed with a freehand or MIS technique? Content validity refers to the ability of the models to test the surgeons’ technique performing those pedicle screw insertion techniques. In other words, are the models actually testing freehand and MIS pedicle screw placement techniques, or are they somehow testing some other skill, such as ability to obtain accurate fluoroscopic views of the spine? Construct validity is a measure of how well the models test what they are purportedly testing. In other words, does successfully performing a freehand or MIS pedicle screw placement on the model translate to one’s ability to successfully perform the same technique on a living patient? These forms of validity have been previously established for other surgical simulations. 20
Participants in the freehand insertion study completed a 12-question survey on objective measures. Classes of 3 questions, 6 questions, and 3 questions formed the face, content, and construct validity assessments, respectively (Supplementary Material A). Questions 1 to 11 provided objective measures of the tested models, with responses scored on a 20-point Likert-type scale on which 1 indicated the most negative response to the question, 10 indicated a neutral response, and 20 indicated the most positive response. For example, question 1 asks, “How do you think the model replicates freehand/percutaneous pedicle screw insertion?” A response of 1 to this question would mean the surgeon thought the model did the worst possible job of replicating pedicle screw insertion, and a response of 20 would mean the surgeon thought the model did the best possible job. Question 12 asked whether the surgeon thought the model tested anything other than freehand pedicle screw insertion technique and, if so, what he or she thought it was testing.
Participants in the MIS insertion study completed a 13-question survey on objective measures. Classes of 3 questions, 6 questions, and 4 questions formed the face, content, and construct validity assessments, respectively (Supplementary Material B). Questions 1 to 12 provided objective measures of the tested models, with responses scored on the same 20-point Likert-type scale as noted above. Question 13 asked whether the surgeon thought the model tested anything other than percutaneous pedicle screw insertion technique and, if so, what he or she thought it was testing.
Class mean (SD) scores for face, content, and construct validity questions were determined for each model. Mean Likert scores for each class of validity were then categorized into 1 of 3 groups: poor validity for mean scores of <10, moderate validity for mean scores of 10 to 15, and high validity for mean scores of >15. Analysis of variance (ANOVA) was performed for comparison of the validity scores for each class between models. ANOVA was also performed for overall scores for each of the surgeons to determine the level of agreement between participants on the validity of each model as a training tool for pedicle screw placement. All statistical analyses were performed using SPSS (Version 22, IBM Corp, Armonk, NY, USA). P values <.05 were considered statistically significant.
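The scoring pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' SPSS procedure: the scores below are hypothetical placeholders rather than study data, the use of the sample SD is an assumption, and the sketch assumes a simple one-way ANOVA that treats every response as independent (the article does not state which ANOVA variant was used, and the same surgeons rated every model). Only the F statistic is computed; obtaining a P value would additionally require the F distribution, as provided by statistical packages.

```python
from statistics import mean, stdev

# HYPOTHETICAL 20-point Likert responses (NOT study data). For each model:
# one row per surgeon, one column per question within a validity class.
hypothetical_scores = {
    "model_X": [[14, 15, 13], [16, 12, 14], [15, 13, 14]],
    "model_Y": [[17, 18, 16], [18, 17, 16], [16, 18, 17]],
}


def pooled(rows):
    """Flatten surgeon-by-question responses into a single list of scores."""
    return [score for row in rows for score in row]


def class_mean_sd(rows):
    """Class mean and sample SD over all surgeons and all questions."""
    scores = pooled(rows)
    return mean(scores), stdev(scores)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Assumes independent observations, which is a simplification here.
    """
    all_scores = [s for g in groups for s in g]
    grand_mean = mean(all_scores)
    group_means = [mean(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (s - m) ** 2 for g, m in zip(groups, group_means) for s in g
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    for name, rows in hypothetical_scores.items():
        m, sd = class_mean_sd(rows)
        print(f"{name}: {m:.2f} ± {sd:.2f}")
    groups = [pooled(rows) for rows in hypothetical_scores.values()]
    f_stat, dfb, dfw = one_way_anova_f(groups)
    print(f"F({dfb}, {dfw}) = {f_stat:.2f}")
```

Pooling all surgeon-by-question responses before computing the mean and SD follows the description of how the class scores were combined; a repeated-measures design would be a reasonable alternative given that the raters were shared across models.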
Results
Participant Demographics
Four staff neurosurgeons from Barrow Neurological Institute performed the freehand (n = 3) and MIS (n = 2) pedicle screw insertions on the biomimetic models. The mean (±SD) length of neurosurgical experience for the freehand group was 13.3 ± 4.7 years. The mean length of neurosurgical experience for the MIS group was 14.0 ± 8.5 years.
Face Validity
In the freehand study, the class mean (±SD) scores for face validity were 14.11 ± 2.50, 14.11 ± 3.98, and 15.67 ± 5.49 for models A, B, and C, respectively. ANOVA for comparison of all scores for each freehand model demonstrated no significant difference between models in face validity scores (P = .69).
In the MIS models, the class mean scores were 11.67 ± 3.77 and 11.00 ± 2.78 for models D and E, respectively. ANOVA for these models demonstrated no statistically significant difference between models for face validity scores (P = .69). See Tables 1 and 2 for a summary of validity scores and ANOVA analyses for freehand and MIS models, respectively.
Table 1. Freehand Model Validity Testing Results.a
a Data are mean ± SD unless otherwise indicated. Face, content, and construct validity scores were all calculated by taking the mean of all questions asked within each validity category. For example, questions 1 to 3 asked about the model face validity. The scores reported above are the mean and SD of the combined scores for all the testing surgeons, for all 3 questions asking about face validity.
Table 2. Minimally Invasive Surgery Model Validity Testing Results.a
a Data are mean ± SD unless otherwise indicated. Face, content, and construct validity scores were all calculated by taking the mean of all questions asked within each validity category. For example, questions 1 to 3 asked about the model face validity. The scores reported above are the mean and SD of the combined scores for all the testing surgeons, for all 3 questions asking about face validity.
Content Validity
In the freehand study, the class mean (±SD) scores for content validity were 16.11 ± 1.11, 18.33 ± 0.97, and 19.17 ± 0.59 for models A, B, and C, respectively. ANOVA for comparison of each freehand model demonstrated a statistically significant difference in overall scores for content validity (P = .002; Table 1).
In the MIS study, the class mean scores were 18.17 ± 2.04 and 13.75 ± 0.27 for models D and E, respectively. ANOVA for comparison of each MIS model demonstrated a significant difference in overall scores for content validity (P < .001; Table 2).
Construct Validity
In the freehand study, the class mean (±SD) scores for construct validity were 16.83 ± 1.18, 17.83 ± 0.24, and 18.83 ± 0.24 for models A, B, and C, respectively. ANOVA for comparison of each freehand model demonstrated no statistically significant difference in overall scores for construct validity (P = .18; Table 1).
In the MIS study, the class mean scores were 17.00 ± 3.46 and 13.00 ± 3.77 for models D and E, respectively. ANOVA for comparison of each MIS model demonstrated a trend but no significant difference in overall scores for construct validity (P = .06; Table 2).
Surgeon Agreement on Overall Model Validity
ANOVA was performed to compare overall scores of all attending surgeons for each model. In the evaluation of freehand model A, the ANOVA demonstrated a statistically significant difference between rater scores (P = .005), indicating significant disagreement between attending surgeons on the overall validity of this model. ANOVA for models B and C did not demonstrate a statistically significant difference between surgeon scores of overall validity (P = .07 and P = .49, respectively). In the MIS models, the participating surgeons showed no statistically significant disagreement on their validity scoring of model D (P = .36) but did disagree on the overall validity of model E (P < .001).
Validity Assessment for Each Model
Using the validity classifications of poor, moderate, and high validity (based on validity class means of <10, 10-15, and >15, respectively), the validity of each model can be described and compared as follows. Among the freehand models, model A demonstrated moderate face validity and high content and construct validity. Model B demonstrated the same moderate face validity as model A but higher content and construct validity. Model C, the model with vascularized bone and simulated blood set in silicone rubber, had the highest face, content, and construct validity of all 3 freehand models. The face validity of this model barely reached the threshold for “high validity” with a class mean (±SD) score of 15.67 ± 5.49, whereas its content validity was nearly perfect with a score of 19.17 ± 0.59. Its construct validity was also rated very high at a mean of 18.83 ± 0.24.
The MIS models were rated as follows (class mean ± SD): model D (vertebral bodies printed into Gel-Lay) had moderate face validity at 11.67 ± 3.77, high content validity at 18.17 ± 2.04, and high construct validity at 17.00 ± 3.46. Model E (vertebral bodies encased in silicone rubber) was the least valid of all training models, with a moderate mean face validity of 11.00 ± 2.78, a moderate content validity of 13.75 ± 0.27, and a moderate construct validity of 13.00 ± 3.77.
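The categorization applied above can be reproduced with a short Python sketch. The thresholds and the class means are taken from the Methods and Results text of this article; the function and variable names are illustrative assumptions, not part of the study's analysis.

```python
def classify_validity(class_mean):
    """Map a class mean on the 20-point scale to a validity category.

    Thresholds follow the Methods: <10 poor, 10 to 15 moderate, >15 high.
    """
    if class_mean < 10:
        return "poor"
    if class_mean <= 15:
        return "moderate"
    return "high"


# Class means as reported in the Results section of this article.
reported_means = {
    "A": {"face": 14.11, "content": 16.11, "construct": 16.83},
    "B": {"face": 14.11, "content": 18.33, "construct": 17.83},
    "C": {"face": 15.67, "content": 19.17, "construct": 18.83},
    "D": {"face": 11.67, "content": 18.17, "construct": 17.00},
    "E": {"face": 11.00, "content": 13.75, "construct": 13.00},
}

# Category for every model and validity class.
categories = {
    model: {cls: classify_validity(score) for cls, score in scores.items()}
    for model, scores in reported_means.items()
}

if __name__ == "__main__":
    for model, result in categories.items():
        print(model, result)
```

Under these thresholds, model C is the only model rated high in all 3 classes and model E is the only one rated moderate in all 3, which matches the descriptions given in the text.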
Discussion
In the past 2 decades, surgical simulation has become an increasing focus of neurosurgical training, and recent technological advances have enhanced the effectiveness of these models. 4,20 -23 In particular, 3D printing has become popular for the creation of realistic neurosurgical models. 13 However, there has been a relative void in the assessment of printed spinal segments for simulated surgery.
In this study, several prototypes of the Barrow Biomimetic Spine were evaluated for efficacy in the simulation of pedicle screw placement. Previously established validation measures (face, content, and construct validity) were used for assessment. 20 The evaluating spine surgeons reported that model C had the highest validity of all 5 models in each class (face, content, and construct), with content and construct validity achieving nearly perfect scores. The testing surgeons felt that this model offered the best surgical feel of the bony structures during pedicle cannulation and pedicle screw placement, the best mechanical feel when palpating both successfully cannulated and breached pedicles, and the best simulation of surgical workflow, because the bleeding bone required the surgeon to work continually with a sucker in one hand to control the bleeding and expose the pertinent anatomy for freehand screw placement.
Between the 2 MIS models, model D (vertebral bodies printed into a block of Gel-Lay) achieved the higher scores for face, content, and construct validity, though these scores were all lower than those of the highest-scoring model in the freehand group. The primary complaint from each surgeon working on the MIS models was that the feel and handling of the soft tissue material were very unrealistic. The Gel-Lay had an appropriate softness but was too fibrous, whereas the silicone rubber was too hard and difficult to dissect with the percutaneous screw placement instruments. The silicone rubber used in model E was also found on fluoroscopy to have a much higher density than the 3D-printed vertebral bodies, which created a negative image of the bony spinal anatomy (Figure 4). This issue adversely affected the training experience of the surgeons performing fluoroscopically guided percutaneous screw placement because it made the bony anatomy more difficult to see and much less realistic. The Gel-Lay in model D, on the other hand, was much less dense than the 3D-printed vertebral bodies, thereby providing a much more realistic view on fluoroscopy (Figure 5). This difference likely accounted for model D’s higher overall scores compared with model E.

Figure 4. Radiographs of model E that demonstrate the negative projection of the vertebral bodies that results from using silicone rubber as a synthetic soft tissue. (A) Anteroposterior view of model E with a Jamshidi needle (Becton, Dickinson and Co, Franklin Lakes, NJ, USA) placed in the right L3 pedicle. (B) Lateral view of the same model with a pedicle screw in the right L3 pedicle. Used with permission from Barrow Neurological Institute, Phoenix, Arizona.

Figure 5. Radiographs of model D that demonstrate the higher fidelity radiographic views of the vertebral bodies when using the Gel-Lay as a synthetic soft tissue. (A) Lateral view of a percutaneous pedicle screw placement in the left L4 pedicle. (B) Anteroposterior view of the same model with a pedicle screw being placed in the left L4 pedicle. Used with permission from Barrow Neurological Institute, Phoenix, Arizona.
Interestingly, for each of the 5 models, face validity was scored the lowest, and for every model except model A (for which construct validity scored slightly higher), content validity was scored the highest. This finding reflects a potential area of improvement for the models in terms of replicating a superficially realistic surgical environment and testing platform. Since the writing of this article, new work has begun on creating a more realistic soft tissue and operating room environment in hopes of increasing the face validity of these models. We have also expanded the models to include synthetic thecal sacs and electrically conductive nerve roots. Models of the cervical spine are also under development and include synthetic vertebral arteries in addition to the bony and ligamentous tissue of the spinal column and synthetic thecal sac. Using the data generated in this study as a baseline, we are better able to move forward with the development of much more valid surgical training models for use in surgical education. These models have also gained increasing use among the residents at the authors’ institution, and the difficulty and surgical skill content of the models can be modified to suit senior or junior-level residents. For example, models of scoliotic curves that comprise the bony and ligamentous structures of the spinal column are increasingly used by more senior residents with interests in spinal deformity correction techniques. The junior residents have found greater utility in shorter segment models modified with bleeding bone, synthetic thecal sacs, and conductive nerve roots, as these models are much better suited to developing basic surgical techniques such as blood loss management, pedicle screw placement, laminectomy, and dural repair. Our findings have enabled the further development of numerous surgical spine models that are increasingly used by neurosurgery residents interested in improving their surgical skills outside of the operating room.
Models like these are likely to become more important to residency and fellowship training programs in the new era of work-hour restrictions.
There are some key limitations to this study. The first is that the face, content, and construct validity were all measured based on the subjective impressions of expert spine surgeons. More objective measures of construct validity have previously been reported in similar studies evaluating surgical skills models. 20 Given that the intended goal of this study was to establish baseline validity scores for 5 very different models in order to inform further development, we felt subjective construct validity scoring was appropriate. Furthermore, more rigorous validity testing is currently planned for the next generation of these models. Another limitation is the small sample size. Three surgeons for the freehand models and 2 surgeons for the MIS models do not provide a wide breadth of opinion. We felt the small sample size would suffice given that we intended to broadly define the validity of these models, both overall and compared with each other, to guide our future development efforts.
Conclusion
With the introduction of the 80-hour resident work week, improved off-site tools for resident education are in increasing demand. In neurosurgical practice, 3D-printed models for simulation of spinal procedures hold great promise. The Barrow Biomimetic Spine, with incorporated vascularization, improves on existing synthetic spinal models in face, content, and construct validation measures. With the current state of 3D-printing technology, models such as this should become an increasingly valuable standard in resident training.
Supplemental Material
Supplementary_Material for The Barrow Biomimetic Spine: Face, Content, and Construct Validity of a 3D-Printed Spine Model for Freehand and Minimally Invasive Pedicle Screw Insertion by Michael A. Bohl, Rohit Mauria, James J. Zhou, Michael A. Mooney, Joseph D. DiDomenico, Sarah McBryan, Claudio Cavallo, Peter Nakaji, Steve W. Chang, Juan S. Uribe, Jay D. Turner and U. Kumar Kakarla in Global Spine Journal
Footnotes
Acknowledgments
The authors thank the staff of Neuroscience Publications at Barrow Neurological Institute for assistance with manuscript preparation.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
The supplemental material is available in the online version of the article.
