Abstract
Introduction
In laparoscopic surgery, surgeons need to use psychomotor abilities that substantially differ from those used in conventional surgery (eg, different eye-to-hand coordination, conversion of three-dimensional to two-dimensional images, altered tactile feedback, and the fulcrum effect). These psychomotor abilities need to be trained before a surgeon can perform laparoscopic surgery on patients.1,2 However, the traditional training of residents, based on an apprenticeship model of teaching in the operating room (OR), 3 can be time consuming, 4 costly, 5 and potentially harmful to patients. 6 Moreover, in laparoscopy, residents often may only manipulate 1 or more fixed instruments, resulting in little opportunity to practice actual laparoscopic maneuvers during the operation.7,8
Therefore, alternative methods for training laparoscopy have been developed, such as box trainers, practicing on live animals or cadavers, and virtual reality (VR) simulation. VR offers great potential to improve and maintain technical and nontechnical skills outside the OR. It allows for a more flexible, controlled environment without supervision, free of the pressure of operating on patients, and without exposing patients to unnecessary risks.9,10 In the last decade, training using VR simulation has become widespread in surgical training curricula. Successful completion of VR simulator training is nowadays often required before residents may perform laparoscopic surgery in real practice.12,13 The use of simulators for improving and maintaining laparoscopic skills is well supported by evidence.9,11 Skills acquired on a VR simulator are transferable to actual medical practice.9,10,14-17
The Simendo® VR Simulator (Simendo B.V., Rotterdam, The Netherlands) is a laparoscopic VR simulator aimed at improving laparoscopic skills, for example, orientation, eye-to-hand coordination, precision of instrument handling, and (bi)manual movement in nonanatomic models. Previous studies have demonstrated face and construct validity for several basic and advanced exercises.18,19
Although the previous curriculum included advanced exercises, there is a need for an even higher level of complexity. Therefore, a new curriculum was developed, consisting of a variety of new exercises especially focused on ambidextrous skill development. Ambidextrous skill development requires a relatively high skill level. It provides a new challenge for residents who have successfully completed the previous curricula. The purpose of this study was to determine face and construct validity for the newly developed set of exercises for training and assessment of advanced bimanual laparoscopic skills on this VR simulator.
Materials and Methods
Study Design
This prospective, multicenter, cohort study was conducted at the University Medical Center Utrecht, the Academic Medical Center Amsterdam, and the Erasmus University Medical Center Rotterdam at the gynecology, general surgery, and urology departments.
Curriculum
The new “Bimanual Fundamentals” curriculum was developed in the years prior to this study, in cooperation with medical specialists, to achieve an appropriate level of difficulty and realism. It consists of 5 advanced exercises (Figure 1), especially focused on training and maintaining ambidextrous skills and cooperation between the left and right instruments. The exercises are named “Sort the Rings,” “Stretch and Transfer,” “Ring and Rope,” “Balance,” and “Puzzle.”
(A) Exercise “Sort the Rings,” (B) exercise “Stretch and Transfer,” (C) exercise “Ring and Rope,” (D) Simendo VR laparoscopy simulator, (E) exercise “Balance,” and (F) exercise “Puzzle.” Images provided and copyrighted by Simendo®.
Overview of the 5 Exercises of the “
A collision was registered when the instrument hit the edge of the ring-collecting bins.
A collision was registered when a peg touched the elastic cord or another peg.
A collision was registered when the instrument hit a ring with force.
Displacement was registered when putting a weight on the scale, while the scale was not balanced.
Recruitment of Participants
Participants were recruited on a voluntary basis from the participating centers. Participants consisted of medical students (years 4-6), residents (postgraduate years [PGY] 4-6 in gynecology, urology, and general surgery), and medical specialists (gynecologists, surgeons, and urologists). Based on their status and laparoscopic experience, 3 groups were formed: novices (medical students), intermediates (residents), and experts (minimally invasive surgery specialists). To be considered an expert, the medical specialist must have had extended experience with selected laparoscopic procedures (at least 50 times for 2 of the procedures or more than 100 times for 1 of the procedures). For gynecologists, the selected procedures consisted of laparoscopic hysterectomy, laparoscopic oophorectomy, and laparoscopic pelvic lymphadenectomy. For general surgeons, the selected procedures consisted of laparoscopic cholecystectomy, Nissen fundoplication, laparoscopic colectomy, and laparoscopic bariatric procedures. For urologists, laparoscopic prostatectomy and laparoscopic nephrectomy were the selected procedures. All participants were asked about their prior experience with laparoscopic skills training. Experience with box trainers, VR trainers, and live animal training was estimated in hours. Participants’ demographics and laparoscopic theater experience (in hours) were also evaluated.
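The expert criterion described above can be written as a simple rule. The following is a minimal sketch, not the authors' procedure: it assumes that "at least 50 times for 2 of the procedures" means at least 50 performances of each of two selected procedures, and the function name `is_expert` and the dictionary layout are hypothetical.

```python
def is_expert(procedure_counts):
    """Return True if a specialist meets the extended-experience criterion.

    procedure_counts: dict mapping a selected procedure name to the number of
    times the specialist has performed it (hypothetical data layout).
    Assumed reading of the criterion: at least 50 performances for each of
    2 selected procedures, OR more than 100 performances of 1 procedure.
    """
    counts = list(procedure_counts.values())
    procedures_with_50 = sum(1 for c in counts if c >= 50)
    return procedures_with_50 >= 2 or any(c > 100 for c in counts)


# Illustrative (made-up) counts for a gynecologist
example = {
    "laparoscopic hysterectomy": 60,
    "laparoscopic oophorectomy": 50,
    "laparoscopic pelvic lymphadenectomy": 10,
}
print(is_expert(example))
```

Under this reading, a specialist with exactly 100 performances of a single procedure and no other procedure at 50 or more would not qualify, because the second branch requires more than 100.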
Equipment
The Simendo® VR simulator for laparoscopic skills training was used. This simulator consists of a software interface and 2 hardware instruments (Figure 1D) connected with 2 USB plugs to a laptop computer (Asus ROG Strix GL553VW, AsusTek Computer Inc., Taipei, Taiwan). The laptop contained an Intel® Core™ i5-6300HQ CPU (2.30 GHz, 8 GB RAM; Intel Corporation, Santa Clara, USA), with an NVIDIA GeForce GTX 960M graphics card, 15.4″ full HD LCD display, and Microsoft Windows 10 software (Microsoft Corporation, Redmond, USA).
Face Validity
Face validity was defined as the degree of resemblance between the simulator and the laparoscopic procedure in real practice.20,21 To determine face validity, participants with experience in real practice were asked to complete a questionnaire immediately after completing the simulation procedure. The questionnaire contained 15 questions about the realism of the simulator, training capacities in general, and the suitability for training residents or surgeons. Additionally, questions were asked about the 5 exercises separately (6 questions per exercise). Questions were presented on a 5-point Likert scale. 22 The last 2 questions regarded statements comparing the “Bimanual Fundamentals Curriculum” with the existing “Intermediate Curriculum” and concerning the implementation of the VR simulator in the current residency training programs. These could be answered with “agree,” “disagree,” or “no opinion.”
Construct Validity
Construct validity was defined as the simulator’s ability to differentiate subjects with different levels of skills. 18 Since this simulation addresses technical skills, it is expected that it differentiates between experienced and nonexperienced performers. In simulation validation studies, construct validity usually refers to the ability of the simulator to differentiate performance between surgical experts and novices. 23 Construct validity is considered to be necessary before using a simulator in surgical training curricula and is preferable before using the simulator as a training tool.
To determine construct validity, participants completed 3 consecutive repetitions of each of the 5 exercises of the new curriculum. Before starting a new exercise, an instruction video was shown and the test supervisor gave a brief standardized explanation. The first run was meant to familiarize participants with the simulator. Verbal instructions were given by the test supervisor when necessary. The second and third runs were used for analysis. No verbal or other instructions were given during the second and third runs. For each exercise, the following parameters were documented and compared between the different groups: task time, collisions/displacements, and path length right and left. Task time was measured in seconds; it was determined as the time between retracting the instruments to start the run and completing the exercise. The number of collisions was obtained in the exercises “Sort the Rings,” “Stretch and Transfer,” and “Ring and Rope.” Displacements were measured in the exercise “Balance” (Table 1). Collisions/displacements and path length (the distance covered by each instrument) were registered in arbitrary units.
To evaluate construct validity and get an objective score of all measured parameters combined, an overall performance score was calculated for each participant. For each parameter, in each exercise, quartile scores of the whole sample size were determined and used as cutoff points. The best 25% performers received 3 points and the worst 25% 1 point. Everyone in between (the middle 50%) received 2 points. Based on the number of measured parameters and exercises, the overall performance score ranged from 19 to 57 (Table 1).
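The quartile-based scoring described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: it assumes that a lower value is better for every parameter (task time, collisions/displacements, path length), that values on a quartile boundary fall into the outer group, and that quartiles are computed with the "inclusive" method of `statistics.quantiles`. The function names and the data layout are hypothetical. The range of 19 to 57 is consistent with 19 scored parameter-exercise combinations (task time and two path lengths in 5 exercises, plus collisions/displacements in 4 exercises), each worth 1 to 3 points.

```python
from statistics import quantiles


def quartile_points(values):
    """Score one parameter of one exercise across the whole sample.

    Lower values are assumed to be better. The best 25% get 3 points,
    the worst 25% get 1 point, and everyone in between gets 2 points.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    points = []
    for v in values:
        if v <= q1:        # best quarter (tie handling is an assumption)
            points.append(3)
        elif v >= q3:      # worst quarter
            points.append(1)
        else:              # middle 50%
            points.append(2)
    return points


def overall_scores(measurements):
    """Sum the quartile points per participant over all scored parameters.

    measurements: dict mapping (exercise, parameter) to a list of values,
    one value per participant, in the same participant order everywhere.
    """
    n_participants = len(next(iter(measurements.values())))
    totals = [0] * n_participants
    for values in measurements.values():
        for i, p in enumerate(quartile_points(values)):
            totals[i] += p
    return totals


# Made-up task times (seconds) for 8 hypothetical participants
times = [10, 20, 30, 40, 50, 60, 70, 80]
print(quartile_points(times))
```

With 19 entries in `measurements`, each total lies between 19 and 57, matching the range reported in the text.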
Use of Statistics
Data were analyzed with the Statistical Package for the Social Sciences version 22 (SPSS, Chicago, USA). We used descriptive statistics to describe characteristics of participants and groups. We tested for differences among the 3 groups with the nonparametric Kruskal–Wallis test. Pairwise comparisons between the performances of groups were made with the Mann–Whitney U test. To determine the minimum sample size, a power analysis was performed. A total sample of 45 participants (3 groups of 15 participants) achieved a power of .80 with the one-way independent ANOVA calculation and an estimated effect size of .5. A level of
Ethics
The study was reviewed and approved by the Dutch Society for Medical Education (NVMO) Ethical Review Board (number 1033 and date April 23, 2018).
Results
Characteristics of the Participants Divided Into Groups.
Abbreviation: PGY = postgraduate years.
In novices: current year of study. In intermediates: current year of residency. In experts: medical specialist since … years.
Extended laparoscopic experience was determined as at least 50 performances for 2 of the selected procedures or more than 100 times for 1 of the selected procedures. Selected laparoscopic procedures are hysterectomy, oophorectomy, and pelvic lymphadenectomy in laparoscopic gynecology; cholecystectomy, Nissen fundoplication, bariatric procedures, and colectomy in laparoscopic surgery; and prostatectomy and nephrectomy in laparoscopic urology.
Prior Experience
None of the novices had prior experience on box trainers, other VR simulators, or live animal training. Only 1 of them had used the Simendo VR simulator before (during a one-hour session). Almost all intermediates had prior experience on box trainers (n = 14) and the Simendo VR simulator (n = 14). About half of the intermediates had experience on other VR simulators (n = 6) or live animal training (n = 7). All experts had experience on box trainers and live animal training, and the majority had experience on the Simendo VR simulator (n = 13) and other VR simulators (n = 13). In general, experts were most experienced, except for the Simendo VR simulator, where intermediates were most experienced. Prior experience for each group is demonstrated in Figure 2.
Prior experience on box trainers (A), Simendo VR simulator (B), other VR simulators (C), and live animal training (D) in hours.
Face Validity
Face Validity of the Bimanual Fundamentals Curriculum as a Whole.
Median scores (interquartile range) on a 5-point Likert scale. PGY = postgraduate years.
Face Validity of the Individual Exercises of the Advanced Curriculum (n = 49).
Median scores (interquartile range) on a 5-point Likert scale.
Construct Validity
Construct Validity of the New Advanced Curriculum (n = 49).
Median scores (interquartile range). Abbreviations: R = right, L = left.
Task time in seconds; path length, collisions, and displacement in arbitrary units.
Construct Validity: Significance Levels Between Groups (n = 49).
The parameter collisions/displacements was measured in 4 of 5 exercises. It was not a relevant parameter in the “
Path length was obtained for both right and left instrument separately. Experts and intermediates significantly outperformed novices in all 5 exercises (Tables 5 and 6). When comparing experts with intermediates, a statistically significant difference was found for both path length left and path length right in the exercise “
Total Performance Score with Significance Levels (n = 49).
Median (interquartile range).
Discussion
Main Findings
The purpose of this study was to determine face and construct validity for the new “
We were able to establish decent construct validity by demonstrating statistically significant differences between performers with different levels of experience. In all 5 exercises, experts outperformed novices on the great majority of the measured parameters. In 2 exercises, the experts also outperformed the intermediates on various parameters. This was mainly the case in the more difficult exercises (ie, “
Explanation of Main Findings
Due to their relatively low scores, depth perception, the lack of haptic feedback, and interaction of the instruments with other objects were indicated as limitations of this particular simulator. Depth perception and interaction with objects are harder to achieve on a 2D interface and are a limitation in both VR simulating interfaces and real laparoscopic surgery. 25 During the development of software for this simulator, clear shadows and the use of different colors for different objects were added in order to improve both depth perception and interaction with objects. Nonetheless, participants rated both as mediocre. This might have been partly caused by the relatively small display that was used. A bigger display may further enhance depth perception.
The lack of haptic feedback is often stated as a major disadvantage of VR simulators in comparison with box trainers or training on live animals or cadavers. In this study, the participants experienced the lack of haptic feedback as moderately disturbing. Opinions and literature about haptic feedback in VR simulators are ambiguous. Some surgeons believe that haptic feedback is an important part of a VR simulator, 26 while others indicate that simulation outcome for exercises augmented with haptic feedback is likely to be inaccurate, resulting in no better, or even a negative, training effect.27,28 Recent studies found no or only minor improvements in training effect using haptic feedback.10,29,30 Adding realistic haptic feedback is difficult, and it is usually an expensive add-on to VR simulators. 31 The evidence for transfer of skills using VR simulators without haptic feedback is well established.9,10,14-17 Therefore, in the current technical state, haptic feedback does not seem to be a cost-effective feature in VR simulators for minimally invasive surgery.
Regarding construct validity, we found that the parameters task time, collisions, and path length right and left are valuable parameters for performance assessment on this particular simulator because of the significant differences between groups. These metrics have also been validated in previous studies.32-34 The parameter displacements was only measured in the exercise “
It has been a point of discussion in previous studies that individual parameters (especially task time) do not on their own provide enough information to distinguish different levels of expertise.9,33 Therefore, we calculated an objective total performance score to assess participants’ performance with all measured parameters combined. As expected, both experts and intermediates achieved significantly higher scores than novices. Moreover, experts scored higher than intermediates, but the difference was too small to be statistically significant.
Overall, the discriminative abilities of the curriculum between experts and intermediates seemed small. This can be interpreted in several ways. First, the intermediate group consisted of senior residents only, advanced in their residency program (PGY 4-6). This indicates a moderate to high level of laparoscopic experience. Second, the curriculum was developed for advanced laparoscopic skills training, whereas the current state of assessment systems is not able to adequately distinguish between levels of higher skill. 35 This finding is therefore not unexpected and is consistent with other studies aiming to validate simulators as training tools.36-38 Third, it can be argued that participants with previous experience on this specific simulator would perform better than others. To investigate whether prior experience influenced a participant’s total performance score, a correlation analysis was performed. We indeed found a correlation of 17.6% (
Strengths and Limitations
A strength that should be mentioned is that we calculated a total performance score for each participant. This gave us the chance to compare participants’ total performance, instead of only comparing a single parameter. In real practice, technical expertise is also dependent on a combination of performing time efficiently, with economy of movement and without making mistakes.
Besides the unequal distribution of prior experience among groups, some other limitations should be noted. First, the relatively small sample size could be indicated as a limitation, but we performed a power analysis and reached a sufficient sample size. Moreover, our study was conducted at multiple medical centers. Therefore, we believe the relatively small sample size did not influence the generalizability of our results substantially.
Second, participants needed between 1 and 1.5 hours to complete the 5 consecutive exercises without a pause. Considering the increasing level of difficulty with each following exercise, this requires sustained concentration and perseverance. Participants may have lost their focus or interest and not performed at their best, especially in the last exercises. Since only a small number of participants complained about this, we believe this did not influence the results significantly.
Third, the setup of the simulator could have been better. As stated before, the laptop that was used had a 15.4-inch display, which is relatively small. Also, it was not possible to adjust the height of the simulator instruments. For some participants, this may have worked to their disadvantage. It could also be seen as a strength, since we maintained a controlled and identical height for every participant.
Fourth, a significant difference between the intermediate and expert groups was found in only 2 exercises. An explanation could be that the intermediate group had more prior experience on this simulator (Figure 2).
Future Research
Future research can determine whether prior experience on this simulator moderates the relationship between surgical expertise and performance on this VR simulator. Also, it would be interesting to add a PGY 1-3 group to the sample to further determine the simulator’s ability to differentiate between moderate and high levels of skill.
Conclusion
We determined face and construct validity for the new “
Footnotes
Acknowledgments
We would like to thank all the medical students, residents, and medical specialists who voluntarily participated in this study. We thank Maarten Debets from the Amsterdam University Medical Center for his help with the statistical analysis.
Author Contributions
Study conception and design: Henk W. R. Schreuder
Acquisition of data: Martijn P. H. van Ginkel and Wilhelmina M. U. van Grevenstein
Analysis and interpretation of data: Martijn P. H. van Ginkel, Henk W. R. Schreuder, and Marlies P. Schijven
Study supervision: Henk W. R. Schreuder and Marlies P. Schijven
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Supplemental Material
Supplemental material for this article is available online.
References
