Abstract
Digital slides created by whole-slide imaging scanners can be evaluated by pathologists at remote sites, but the process must be validated before this technology can be applied to routine cytological diagnosis. The aim of this study was to validate a whole-slide imaging scanner for cytological samples. Sixty cytological samples, whose diagnoses were confirmed by gold-standard examinations (histology or flow cytometry), were digitalized using a whole-slide imaging scanner. Digital slides and glass slides were examined by 3 observers with different levels of cytopathological expertise. No significant differences were noted between digital and glass slides in regard to the number of cases correctly diagnosed, or the sensitivity, specificity, or diagnostic accuracy, irrespective of the observers’ expertise. The agreement between the digital slides and the gold-standard examinations was moderate to substantial, while the agreement between the glass slides and the gold-standard examinations was substantial for all 3 observers. The intraobserver agreement between digital and glass slides was substantial to almost perfect. The interobserver agreement when evaluating digital slides was moderate between observers 1 and 2 and between observers 1 and 3, and substantial between observers 2 and 3. In conclusion, our study demonstrated that the digital slides produced by the whole-slide imaging scanner are adequate to diagnose cytological samples and that the results are similar among clinical pathologists with differing levels of expertise.
Whole-slide imaging (WSI) allows the user to digitalize an entire glass slide, and the digital slide can be evaluated using a web viewer or directly on the scanner site. 2,26 Thanks to the possibility of examining digital slides at various magnifications and with a high-quality image, the WSI system is now considered superior to both static and robotic telepathology. 17
Diagnostic cytopathology is widely used in veterinary medicine as it is a minimally invasive technique, with a rapid turnaround time, and is cost-effective. 3,8 Depending on the type of lesions and the level of expertise of the pathologist reading the slides, cytological diagnosis can reach high diagnostic accuracy. 21 In other cases, cytology allows only a preliminary diagnosis or may indicate further testing. 3 It is also used for prognostic purposes and to monitor a patient’s response to therapy. 1 Because cytology sampling techniques are minimally invasive and diagnostic imaging is widely available, this procedure can allow sampling of internal organs from critically ill patients. 13 The introduction of WSI technology in referral diagnostic laboratories may provide several advantages: (1) it reduces the risk of broken or lost glass slides during the shipping process, if the clinical pathologist works remotely; (2) it facilitates the collaboration between clinical pathologists, when second opinions are needed; and (3) it allows more flexible work, since the clinical pathologist requires only a computer with a high-speed internet connection.
The focusing process of the majority of WSI scanners is based on a limited number of focus points on the glass slide. For this reason, the application of WSI to cytological samples can be problematic: smears are often of uneven thickness and 3-dimensional, since thick clusters of cells may be present. 10 This problem has been overcome by the Z-stack function implemented on the latest WSI scanners. With this technology, the glass slides are scanned at different planes of focus, and the software then merges the acquired images to create a multiplane composite image. 10
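The merging step can be illustrated with a generic focus-stacking sketch. This is not the scanner's algorithm, which the text does not describe; it assumes that each focal plane is a small grayscale grid and that the "sharpest" plane at each pixel is the one with the largest absolute Laplacian. The function names `sharpness` and `merge_z_stack` and the toy planes are hypothetical.

```python
def sharpness(plane, r, c):
    """Absolute 4-neighbour Laplacian at (r, c); border pixels reuse edge values."""
    rows, cols = len(plane), len(plane[0])
    up = plane[max(r - 1, 0)][c]
    down = plane[min(r + 1, rows - 1)][c]
    left = plane[r][max(c - 1, 0)]
    right = plane[r][min(c + 1, cols - 1)]
    return abs(4 * plane[r][c] - up - down - left - right)


def merge_z_stack(planes):
    """For every pixel, keep the value from the focal plane that is sharpest there."""
    rows, cols = len(planes[0]), len(planes[0][0])
    merged = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = max(planes, key=lambda p: sharpness(p, r, c))
            merged[r][c] = best[r][c]
    return merged


# Hypothetical toy stack: plane_a is in focus on the left half, plane_b on the right.
plane_a = [[0, 9, 5, 5], [9, 0, 5, 5], [0, 9, 5, 5]]
plane_b = [[5, 5, 0, 9], [5, 5, 9, 0], [5, 5, 0, 9]]
composite = merge_z_stack([plane_a, plane_b])
print(composite)
```

In this toy case the composite takes the high-contrast left half from the first plane and the high-contrast right half from the second, which is the behavior one would want for a smear containing cell clusters at different depths.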
In human medicine, WSI technology has been validated for histopathological, 4,5,16,20,23 cytological, 9,11,14,22,24,25,27 and hematological 12 samples. In veterinary medicine, a WSI scanner has been validated for the histopathological diagnosis of cutaneous tumors in dogs, 6 and the agreement between WSI and optical microscopy for evaluating cytological samples of canine lymphoma has been evaluated. 7 The aim of the present work is the diagnostic validation of digital cytology in the veterinary cytopathological field, following the guidelines proposed by the Pathology and Laboratory Quality Center of the College of American Pathologists. 18 In particular, the following aspects were investigated: (1) the diagnostic agreement between digital slides and the gold-standard examinations, (2) the intraobserver agreement between digital slides and glass slides, (3) the interobserver agreement between observers with different expertise, and (4) the diagnostic performance of the digital slides in the diagnosis of a neoplastic process.
Materials and Methods
Specimens and Participants
The archive of the Laboratory of Histopathology and Cytopathology of the Department of Comparative Biomedicine and Food Science of the University of Padua was searched to identify the cytological cases whose diagnoses were confirmed by gold-standard examinations (histology or flow cytometry in case of lymphoid tumors).
The glass slides were stained with May-Grünwald-Giemsa stain by an automated stainer (Leica Autostainer XL; Leica Microsystems, Wetzlar, Germany), and for each case, a board-certified clinical pathologist who did not participate in the evaluation process selected the slide with the highest cellularity and the best preservation, with no staining precipitates and a percentage of naked nuclei or disrupted cells considered acceptable for the tissue sampled. Thus, for each case of this study, the same slide was used for the digitalization and for the examination with the light microscope.
The glass slides were digitalized using a WSI scanner (D-sight; A. Menarini Diagnostics S.r.l., Bagno a Ripoli [FI], Italy) to obtain the digital slides. The WSI scanner used for this study was equipped with 4×, 10×, 20×, and 40× objectives, and the digitalization of the glass slides was performed using the 40× objective with automated tissue detection and focus point assignments. To avoid areas being out of focus, the Z-stack modality of acquisition with a 7-line scan of the same area at different depths of focus was used. The digital slides were uploaded to a server and evaluated by the observers at least 1 month after the evaluation of the glass slides to allow for a washout period. The digital slides were also randomly assigned identifiers different from those of the corresponding glass slides. Both the washout period and the different identifiers were intended to minimize recall bias.
Examination of Slides
Three observers with different expertise participated in the study, all blinded to the results of the gold-standard examinations. The lowest level of cytological experience was represented by a postdoctoral researcher (observer 1), the intermediate level of cytological experience was represented by a clinical pathologist board-certified for 5 years (observer 2), and the highest level of cytological experience was represented by a clinical pathologist board-certified for 17 years (observer 3). Each observer evaluated both the glass slide and the digital slide for each case. The observers first evaluated all the glass slides in randomized order within a period of 1 week; then, after the washout period, they evaluated all the digital slides in randomized order, using different identifiers from the glass slides.
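The randomized reading order and the separate identifiers for glass and digital slides could be generated along the following lines. This is only a sketch: the text does not say how the randomization was performed, and the function `assign_blinded_ids`, the label prefixes, and the seeds are hypothetical.

```python
import random


def assign_blinded_ids(case_ids, prefix, seed):
    """Return {case_id: blinded label}, with labels handed out in shuffled order."""
    rng = random.Random(seed)  # seeded so the key can be regenerated for unblinding
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    return {case: f"{prefix}{rank:03d}" for rank, case in enumerate(shuffled, start=1)}


cases = [f"case_{i:02d}" for i in range(1, 61)]  # 60 enrolled cases
glass_ids = assign_blinded_ids(cases, prefix="G", seed=1)
digital_ids = assign_blinded_ids(cases, prefix="D", seed=2)

# Reading order for the digital session follows the blinded labels, not the case numbers.
reading_order_digital = sorted(cases, key=digital_ids.get)
print(reading_order_digital[:3])
```

Keeping the two label sets disjoint means that a label seen during the glass-slide session cannot be matched to a digital slide by the observer.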
Digital slides were evaluated through a web viewer (Telepathology; Visia imaging S.r.l., San Giovanni Valdarno [AR], Italy). The web viewer allowed the evaluation of the digital slides at different magnifications (from 40× to 400×), but it was not possible to focus up and down or to increase the brightness of the digital slides. Each observer used his or her own laptop computer and monitors without any color calibration. Glass slides were evaluated with the observer’s own light microscope at 400× magnification. For each cytological case, the species, sex, age, and sampled organ were provided to the observers. The observers classified the findings from the glass slides and digital slides in a predefined diagnostic list (Table 1). In addition, based on the diagnosis provided by each observer, the digital slides were further classified as “neoplastic,” “nonneoplastic,” or “nondiagnostic.”
Table 1. Classification of Cytological Diagnoses and the Number and Percentage of Cases Correctly Diagnosed by 3 Observers, Based on Digital Slides and Glass Slides Compared to the Gold-Standard Results.a
Abbreviations: DS, digital slide; GS, glass slide; NA, not assessed.
a Observers 1, 2, and 3 were observers with a lower, intermediate, and higher level of cytological experience, respectively.
b Mixed inflammation: percentage of neutrophils lower than 85% and presence of 2 or more inflammatory populations.
Data Analysis
For all the cases, the agreement between the cytological diagnoses (using either the digital slides or the glass slides) and the gold-standard examinations, the intraobserver agreement between digital slides and glass slides, and the interobserver agreement for both digital slides and glass slides were assessed using the Cohen’s κ test. The κ coefficients were interpreted as recommended by Landis and Koch: 15 <0.00, poor; 0.00 to 0.20, slight; 0.21 to 0.40, fair; 0.41 to 0.60, moderate; 0.61 to 0.80, substantial; and >0.80, almost perfect. To assess sensitivity, specificity, and diagnostic accuracy of digital slides for differentiating between neoplastic and nonneoplastic lesions, the “nondiagnostic” cases were excluded, and the digital slide and glass slide results were classified on the basis of agreement with gold-standard examinations into these categories: true positive, true negative, false positive, and false negative. The 95% confidence interval (CI) was calculated for the agreement between cytological diagnoses and the gold-standard examinations, the intraobserver agreement between digital slides and glass slides, the interobserver agreement for both digital slides and glass slides, and the diagnostic performance of the digital slides.
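As an illustration of the agreement statistic, the following sketch computes an unweighted Cohen’s κ from two lists of category labels. It is not the MedCalc implementation used in the study; the function name `cohen_kappa` and the 10-case toy data are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases on which the two ratings coincide.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement expected from each rater's marginal category frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data (NOT from the study): cytology vs. gold standard for 10 cases.
gold = ["neo", "neo", "neo", "neo", "neo", "neo", "non", "non", "non", "non"]
cyto = ["neo", "neo", "neo", "neo", "neo", "non", "non", "non", "non", "neo"]
kappa = cohen_kappa(cyto, gold)
print(round(kappa, 3))
```

In this toy example the observed agreement is 0.80 and the chance agreement is 0.52, so κ = 0.28/0.48 ≈ 0.58, which falls in the moderate band of the Landis and Koch scale.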
Fisher’s exact test was used to assess differences in the percentage of correct diagnoses and sensitivity, specificity, and diagnostic accuracy between digital slides and glass slides, as well as between observers when evaluating digital slides. All values were considered significant when P < .05.
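The comparison of proportions can be sketched with a two-sided Fisher’s exact test written from the hypergeometric distribution. This is an illustrative implementation, not the MedCalc routine used in the study; the function name `fisher_exact_two_sided` is hypothetical. The example counts are the correct diagnoses reported for observer 3 in the Results (44 of 60 on digital slides and 47 of 60 on glass slides).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables that are no more likely than the observed one.
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Rows: digital slides, glass slides; columns: correct, incorrect (observer 3).
p = fisher_exact_two_sided(44, 16, 47, 13)
print(f"P = {p:.3f}")
```

The small tolerance in the comparison guards against floating-point ties between tables that are equally probable.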
For statistical analysis, MedCalc Statistical Software version 15.8 (MedCalc Software bvba, Ostend, Belgium) was used. The raw data are available in Supplemental Table S1.
Results
Over a 9-year period, a total of 93 cytological cases with a definitive diagnosis obtained by histopathology or flow cytometry were identified. Thirty-three of 93 were discarded due to the poor preservation and/or low cellularity and 60 were enrolled in the study. The samples were collected from different organs of 48 dogs and 12 cats ranging in age from 1 to 18 years. Based on the gold-standard examinations, the cases included 14 malignant epithelial neoplasms, 8 malignant mesenchymal spindle cell neoplasms, 6 lymphomas, 8 mast cell tumors, 4 histiocytic disorders, 1 plasmacytoma, 1 malignant melanoma, and 2 germ cell tumors (total: 44 neoplastic cases), as well as 7 epithelial hyperplasias or benign epithelial neoplasias, 1 degeneration (hepatic hydropic degeneration), 2 neutrophilic inflammations, and 6 mixed inflammations (total: 16 nonneoplastic cases) (Suppl. Table S2).
In the evaluation of digital slides, observer 1 correctly diagnosed 39 of 60 (65%) cases, observer 2 correctly diagnosed 43 of 60 (72%) cases, and observer 3 correctly diagnosed 44 of 60 (73%) cases. In the evaluation of the glass slides, observer 1 correctly diagnosed 39 of 60 (65%) cases, observer 2 correctly diagnosed 44 of 60 (73%) cases, and observer 3 correctly diagnosed 47 of 60 (78%) cases (Table 1). No significant differences were found in the percentage of correct diagnoses obtained by the 3 observers evaluating digital slides and glass slides (0.3 ≤ P ≤ 1, Fisher’s exact test) and in the percentage of correct diagnoses obtained by the 3 observers evaluating digital slides (P = 1, Fisher’s exact test).
The agreement between the diagnoses obtained with digital slides and the gold standard was moderate for observer 1 and substantial for observers 2 and 3, while the agreement between the diagnosis obtained with glass slides and the gold standard was substantial for all 3 observers (Table 2).
Table 2. Agreement (κ Values and 95% Confidence Intervals) Between Diagnoses Based on Digital Slides or Glass Slides vs Gold-Standard Examinations and Intraobserver Agreement Between Digital and Glass Slides.a
Abbreviations: DS, digital slide; GS, glass slide.
a The intraobserver agreement was assessed by using linearly weighted Cohen’s κ. The κ coefficients were interpreted as recommended by Landis and Koch. 15 Observers 1, 2, and 3 were observers with a lower, intermediate, and higher level of cytological experience, respectively.
The intraobserver agreement was substantial for observer 1 and almost perfect for observers 2 and 3 (Table 2).
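Linearly weighted κ, as named in the footnote to Table 2, can be sketched as follows. The weighting requires an ordering of the categories; the ordering used by the authors is not stated, so the three ordered labels and the 8-case data below are purely hypothetical, as is the function name `linear_weighted_kappa`.

```python
def linear_weighted_kappa(ratings_a, ratings_b, categories):
    """Linearly weighted Cohen's kappa; `categories` fixes the category order."""
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)
    # Joint proportions: observed[i][j] = share of cases rated i by A and j by B.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Linear disagreement weight |i - j| / (k - 1): 0 on the diagonal, 1 at the extremes.
    weighted_observed = weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]
    if weighted_expected == 0.0:
        return 1.0  # all ratings fall in a single shared category
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical ordered categories and toy ratings (NOT from the study).
order = ["A", "B", "C"]
glass = ["A", "A", "B", "B", "C", "C", "C", "B"]
digital = ["A", "B", "B", "B", "C", "C", "B", "B"]
kappa_w = linear_weighted_kappa(glass, digital, order)
print(round(kappa_w, 3))
```

With these toy ratings the two disagreements are both between adjacent categories, so each counts as half a disagreement and the weighted κ is 2/3.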
The interobserver agreement in the evaluation of the digital slides was moderate between observers 1 and 2 and between observers 1 and 3 (κ = 0.55 and 0.48) and substantial between observers 2 and 3 (the most experienced observers; κ = 0.70). The interobserver agreement in the evaluation of the glass slides was moderate between observers 1 and 3 (κ = 0.54) and substantial between observers 1 and 2 and between observers 2 and 3 (κ = 0.63 and 0.70) (Table 3).
Table 3. Interobserver Agreement (κ Value and 95% Confidence Interval) for Digital and Glass Slides.a
Abbreviations: DS, digital slide; GS, glass slide.
a The intraobserver agreement was assessed by using linearly weighted Cohen’s κ. The κ coefficients were interpreted as recommended by Landis and Koch. 15 Observers 1, 2, and 3 were observers with a lower, intermediate, and higher level of cytological experience, respectively.
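The verbal categories attached to the interobserver κ values follow directly from the Landis and Koch bands given in the Data Analysis section. The sketch below applies those bands to the κ values reported in the text; the function name `landis_koch` is hypothetical, and rounding to 2 decimals before classification is an assumption made to match the way the bands are stated.

```python
def landis_koch(kappa):
    """Map a kappa coefficient to the Landis and Koch agreement category."""
    k = round(kappa, 2)  # assumption: the bands are stated to two decimals
    if k < 0.00:
        return "poor"
    if k <= 0.20:
        return "slight"
    if k <= 0.40:
        return "fair"
    if k <= 0.60:
        return "moderate"
    if k <= 0.80:
        return "substantial"
    return "almost perfect"


# Interobserver kappa values as reported in the text (observer pair -> kappa).
reported = {
    "digital": {(1, 2): 0.55, (1, 3): 0.48, (2, 3): 0.70},
    "glass": {(1, 2): 0.63, (1, 3): 0.54, (2, 3): 0.70},
}
labels = {
    method: {pair: landis_koch(k) for pair, k in pairs.items()}
    for method, pairs in reported.items()
}
for method, pairs in labels.items():
    for pair, label in pairs.items():
        print(method, pair, label)
```

The resulting labels reproduce the wording of the Results: moderate for pairs (1, 2) and (1, 3) and substantial for pair (2, 3) on digital slides, and moderate only for pair (1, 3) on glass slides.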
The sensitivity, specificity, and diagnostic accuracy in the identification of neoplasia for each observer are shown in Table 4. No significant differences were found in sensitivity, specificity, and diagnostic accuracy between digital slides and glass slides for the 3 observers (P = 1, Fisher’s exact test) and among observers in the evaluation of digital slides (P = 1, Fisher’s exact test).
Table 4. Sensitivity, Specificity, and Diagnostic Accuracy (%, With 95% Confidence Intervals) of Digital Slides and Glass Slides in the Identification of Neoplastic Processes.a
Abbreviations: DS, digital slide; GS, glass slide.
a Observers 1, 2, and 3 were observers with a lower, intermediate, and higher level of cytological experience, respectively.
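Sensitivity, specificity, and diagnostic accuracy follow from the 4 cells of the classification described in the Data Analysis section, after exclusion of the nondiagnostic cases. The sketch below uses hypothetical counts, not the study data. The text does not state which method was used for the 95% confidence intervals, so the Wilson score interval shown here is an assumption; the function names are also hypothetical.

```python
from math import sqrt


def diagnostic_performance(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy for neoplastic vs. nonneoplastic calls."""
    return {
        "sensitivity": tp / (tp + fn),  # neoplastic cases correctly called neoplastic
        "specificity": tn / (tn + fp),  # nonneoplastic cases correctly called nonneoplastic
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95% coverage)."""
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


# Hypothetical counts (NOT from the study), nondiagnostic cases already excluded.
tp, fn, tn, fp = 18, 2, 8, 2
perf = diagnostic_performance(tp, tn, fp, fn)
sens_ci = wilson_interval(tp, tp + fn)
print(perf)
print(sens_ci)
```

With these toy counts the sensitivity is 18/20 = 0.90, the specificity is 8/10 = 0.80, and the accuracy is 26/30 ≈ 0.87.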
Discussion
In this study, we validated the use of digital slides for cytological diagnosis by evaluating the diagnostic agreement with the gold standard, the intraobserver agreement, the agreement between observers with different expertise, and the diagnostic performance of digital slides. These results show an excellent intraobserver agreement and a good agreement between the gold-standard examinations and the digital slides. Moreover, the diagnostic accuracy ranged from 83% to 87% depending on the observer’s cytological expertise. In veterinary medicine, the descriptive capability of WSI was previously evaluated by our research group in cytological samples of canine lymphoma. 7 In human medicine, most studies are focused on gynecological cytological samples, 9,11,22,27 and only 1 study included cytological samples from different organs. 14
Despite the increased interest of pathologists in WSI technology, relatively few studies focus on its application in the field of cytology. The digitalization of cytological samples is challenging due to the uneven distribution of cells that are often arranged in 3-dimensional clusters. Moreover, 1 single cytological case is usually composed of multiple slides, which may result in an increased scanning time and storage space for the digital slides. With the advent of the Z-stack function in the latest WSI scanners and the decrease in the price for storage servers, the application of WSI to cytological samples is now more accessible, and its use will likely grow in the coming years. In this study, all the glass slides were scanned using the 40× objective with the Z-stack modality, since a previous article focusing on cervicovaginal cytological samples reached the best accuracy using this modality of acquisition. 27 As a result, none of the digital slides had out-of-focus areas; therefore, it was not necessary to rescan any glass slide.
The scanner used in this study was not equipped with a 100× objective; therefore, the diagnostic performance of the digital slides acquired with this high-power objective could not be assessed. The mean scanning time for single digital slides using the 40× objective is approximately 3 hours, and the size of the file is approximately 2.5 gigabytes, depending on the number of cells on the glass slides. Therefore, it may be that for a 100× objective, both scanning time and the size of the digital slides would increase significantly and may not be suitable for routine diagnostic use with present technology.
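A back-of-the-envelope estimate makes these figures concrete. The sketch below multiplies the approximate per-slide values quoted above by the 60 slides of this study, and then scales them for a 100× objective under the strong assumption that both scanning time and file size grow with the scanned pixel count, that is, with the square of the magnification ratio. That scaling rule is not taken from the study; it ignores compression, optics, and the number of Z-planes, and the function names are hypothetical.

```python
HOURS_PER_SLIDE_40X = 3.0  # approximate mean scanning time quoted in the text
GB_PER_SLIDE_40X = 2.5     # approximate mean file size quoted in the text


def scan_budget(n_slides, hours_per_slide=HOURS_PER_SLIDE_40X, gb_per_slide=GB_PER_SLIDE_40X):
    """Total scanning time (hours) and storage (GB) for n_slides."""
    return n_slides * hours_per_slide, n_slides * gb_per_slide


def area_scale_factor(from_magnification, to_magnification):
    """Assumed growth in pixel count when magnification changes (square of the ratio)."""
    return (to_magnification / from_magnification) ** 2


total_hours, total_gb = scan_budget(60)
factor = area_scale_factor(40, 100)
print(total_hours, total_gb)  # 180.0 hours and 150.0 GB for 60 slides at 40x
print(factor)                 # 6.25
print(HOURS_PER_SLIDE_40X * factor, GB_PER_SLIDE_40X * factor)
```

Under this assumption a single slide at 100× would take on the order of 19 hours to scan and occupy roughly 16 gigabytes, which is consistent with the concern that such scans may not be practical for routine use.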
The overall percentage of correctly diagnosed samples using the digital slides and glass slides was similar, ranging from 65% to 73% and from 65% to 78%, respectively. The more experienced observers (observers 2 and 3) had a higher percentage of correctly diagnosed samples when evaluating glass slides than when evaluating digital slides, while observer 1 had the same percentage with both techniques. These percentages are lower than those obtained by House and colleagues 14 ; however, their classification was limited to “positive for neoplasia,” “negative for neoplasia,” and “other” instead of the 14 different possible diagnoses included in this study. Moreover, the lag time between the digital slide and glass slide evaluation in House and colleagues 14 was only 3 days, while the washout period proposed for the validation of a WSI system is at least 2 weeks. 18 Therefore, it is possible that the high percentage of cases correctly diagnosed in that study was influenced by a recall phenomenon. 14 The percentage of correctly diagnosed samples in this study differed among the categories: mast cell tumors, malignant melanoma, and neutrophilic inflammation were correctly diagnosed by all the observers with digital slides and glass slides. These lesions have well-recognized morphological features, which were easily identified by all observers with both techniques. On the contrary, the percentages of histiocytic disorders correctly diagnosed with both digital slides and glass slides were low for observer 1, reflecting the difficulty for the less experienced observer in differentiating reactive macrophages or dendritic cells from their neoplastic counterparts. The percentage of samples with mixed inflammation correctly diagnosed was above 65% for both digital slides and glass slides only for observer 3, while the percentage was below 35% for the other observers.
Inflammation was considered mixed when the percentage of neutrophils was lower than 85% and when 2 or more inflammatory populations were present. 19 Observers 1 and 2, when evaluating both digital slides and glass slides, frequently misclassified mixed inflammation as neutrophilic inflammation, plasmacytoma, or malignant spindle cell neoplasm. A possible reason is that neutrophils and plasma cells are frequently part of the mixed inflammatory response, 19 and reactive spindle cells due to a fibroblastic response were interpreted as malignant. The low percentage of correctly diagnosed mixed inflammation could be related to the presence of these cells in the samples, which misled the 2 observers with lower cytological expertise.
Infectious agents were not present in any sample, so it is not possible to determine if the resolution of the digital slides is adequate for the identification of these pathogens. Identification of bacteria requires the 100× objective, which was not present in the scanner tested. A 40× objective might be sufficient for identification of a large number of microorganisms, although this was not evaluated in this study. On the contrary, when the microorganisms are scarce or intracellular (eg, Leishmania spp), the 100× objective is recommended.
The agreement between digital slides and the gold-standard examination was moderate for observer 1 and substantial for observers 2 and 3, while the agreement between glass slides and the gold standard was substantial for all observers. A possible explanation is that the web viewer does not allow the observer to focus up and down or to increase the brightness of the digital slides. The lack of these 2 functions could have affected the diagnostic process of the less experienced observer when evaluating the digital slides, while the other 2 observers were less affected, reaching a good agreement using both the digital slides and the glass slides. Nevertheless, the difference in Cohen’s κ value was minimal for observer 1, and the agreement between the gold-standard examination and the digital slides was acceptable, irrespective of the cytological expertise.
The intraobserver agreement is the most important parameter to consider in the validation process of a WSI system. 18 In this study, it was substantial for the less experienced observer, while it was almost perfect for the 2 board-certified clinical pathologists. The intraobserver agreements of this study are higher than those reached in a previous study focused on the evaluation of cellblock preparations from Papanicolaou (Pap) test samples 24 and those reached by our research group in the determination of lymphoma grade. 7 The results obtained by the 2 board-certified clinical pathologists highlight the importance of experience in the evaluation of cytological samples, even if the less experienced observer reached good results.
To evaluate the reproducibility of the cytological results and the contribution of the operator’s experience in the diagnostic process, the interobserver agreement for both the digital slides and the glass slides was evaluated. There was substantial agreement between the 2 experienced observers (observers 2 and 3) for both methodologies, whereas it was lower between the experienced and the less experienced observers, especially when evaluating digital slides. These data support a good reproducibility of the results for operators with the same training, irrespective of the technology used. A limitation is that only a single less experienced observer and 2 more experienced observers were compared, which limits the evaluation of the role of experience.
The overall sensitivity, specificity, and diagnostic accuracy in the recognition of a neoplastic process were higher for glass slides compared to digital slides, but the differences were not significant. The highest sensitivity was achieved by observer 1, while observer 3 had the highest specificity with both technologies.
One possible limitation of the application of WSI to cytological samples is related to the long scanning time at 40× magnification with the Z-stack modality, which is necessary to obtain digital slides. In addition, in contrast to histological samples, 1 cytological case is usually composed of multiple glass slides. Thus, with the present technology, the use of WSI is hampered for routine cytological practice and may be more appropriate for the evaluation of specific challenging cases when the opinion of a skilled cytologist is required.
In conclusion, we demonstrate that digital slides obtained via this WSI scanner can be used for the evaluation of cytological samples. This result is particularly useful for challenging cases, since sharing digital slides is easier and faster than sharing glass slides. Moreover, clinical pathologists can discuss cases remotely. Nevertheless, based on the good results in terms of reproducibility and diagnostic performance obtained by all the observers using digital slides, the WSI technology can be used by clinical pathologists with all levels of expertise.
Supplemental Material
Supplemental Material, DS1_VET_10.1177_0300985818825128 for Diagnostic Validation of a Whole-Slide Imaging Scanner in Cytological Samples: Diagnostic Accuracy and Comparison With Light Microscopy by Federico Bonsembiante, Ugo Bonfanti, Francesco Cian, Laura Cavicchioli, Beatrice Zattoni and Maria Elena Gelain in Veterinary Pathology
Footnotes
Acknowledgements
We thank Patricia O’Reilly for her writing assistance.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The WSI scanner (D-sight; A. Menarini Diagnostics S.r.l.) has been acquired thanks to the scientific instrumentation grant program (2012) of the University of Padua.
Supplemental material for this article is available online.
