Abstract
Objectives
To assess readability and understandability of online materials for vocal cord leukoplakia.
Study Design
Review of online materials.
Setting
Academic medical center.
Methods
A Google search of “vocal cord leukoplakia” was performed, and the first 50 websites were considered for analysis. Readability was measured by the Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), and Simple Measure of Gobbledygook (SMOG). Understandability and actionability were assessed by 2 independent reviewers with the PEMAT-P (Patient Education Materials Assessment Tool for Printable Materials). Unpaired t tests compared scores between sites aimed at physicians and those aimed at patients, and Cohen’s kappa was calculated to measure interrater reliability.
Results
Twenty-two websites (17 patient oriented, 5 physician oriented) met inclusion criteria. For the entire cohort, FRES, FKGL, and SMOG scores (mean ± SD) were 36.90 ± 20.65, 12.96 ± 3.28, and 15.65 ± 3.57, respectively, indicating that materials were difficult to read at a >12th-grade level. PEMAT-P understandability and actionability scores were 73.65% ± 7.05% and 13.63% ± 22.47%. Statistically, patient-oriented sites were more easily read than physician-oriented sites (P < .02 for each of the FRES, FKGL, and SMOG comparisons); there were no differences in understandability or actionability scores between these categories of sites.
Conclusion
Online materials for vocal cord leukoplakia are written at a level more advanced than what is recommended for patient education materials. Awareness of the current ways that these online materials are failing our patients may lead to improved education materials in the future.
The use of search engines is a common first step for patients seeking medical advice. If search results are not understandable, this impairs the patient’s ability to make informed choices and can negatively affect outcomes by potentially delaying care or reducing adherence with care plans. Unfortunately, many resources are written at a level that may not be understandable to most patients. The American Medical Association (AMA) and the National Institutes of Health (NIH) recommend that patient education resources be written at a sixth-grade level. 1 In light of this recommendation, many publications have evaluated the readability and quality of online medical education materials in disciplines including urology, 2 plastic surgery, 3 cardiology, 4 ophthalmology, 5 rheumatology, 6 and otolaryngology. 7
These analyses are increasingly important in certain fields where the speed of novel technological developments overtakes what can typically be found in traditional print resources or when there is confusion concerning evaluation and treatment of a particular condition. Recent readability assessments within otolaryngology have focused on dysphagia, 8 in-office vocal fold injections, 9 and oropharyngeal cancer. 10 Regarding vocal cord leukoplakia, there remain active discussions within the medical community on malignant potential and desired degree of surgical care (biopsy vs complete microflap removal), as well as the role of nonoperative management, use of angiolytic lasers, and potential for office-based treatment.11-14 Correlations between pathologic classification and biologic behavior have been historically poor such that the World Health Organization recently simplified recommended pathologic categorization for vocal fold dysplasia. 15 Given the variations in care that patients may receive depending on the management approach that is recommended by the otolaryngologist, it is important for patients to have access to readable and understandable online materials to learn about available treatment options and participate in informed decision making. To our knowledge, no such readability analysis has been performed on patient education materials for vocal cord leukoplakia. Our aim in this study is to assess the readability and quality of online materials for vocal cord leukoplakia.
Methods
This review of online education materials was not considered human subjects research and was thus exempt from full review by the Johns Hopkins Medicine Institutional Review Board. A Google search was conducted with the search term vocal cord leukoplakia on March 8, 2021. As in other readability studies, Google was selected as it accounts for >70% of all internet searches, making it the most popular internet search engine.16,17
Websites that were advertisements, contained broken links, were not written in English, or were primarily images rather than text were excluded from analysis. Message boards, pages with <30 sentences of text, and image/video-based pages were also excluded. In addition, results that linked to academic research papers were excluded from analysis, as these were not original web content but instead links to published written materials. The top 50 search results identified in this search strategy were considered for review, as the quality of information is thought to decline after the top 50 results. 18 Search results that did not meet the criteria were excluded from analysis, as described in other readability studies.19,20
Once identified, the websites were designated as being oriented toward a patient or professional audience, as described in other readability studies within otolaryngology.8,9 Patient-targeted sources were overtly written to address patient audiences in language without technical medical jargon and/or were from medical clinics or hospital centers advertising services to patients. Professional-targeted sources were overtly written to educate and communicate with health care professionals and were often websites hosted by professional societies or online texts. For instance, professional-targeted sources contained information describing how to diagnose leukoplakia with videostroboscopy and discussed technical aspects of the surgical treatment of leukoplakia. The text of each identified site was archived for analysis, and the website address (uniform resource locator [URL]) and access date were recorded.
The readability of the websites was analyzed with the Flesch Reading Ease Score (FRES), the Flesch-Kincaid Grade Level (FKGL) readability test, and the Simple Measure of Gobbledygook (SMOG) Readability Formula, all of which were calculated with an online calculator. 21 These readability metrics use a combination of word count, sentence number, and syllables to derive readability scores.
A lower FRES corresponds to lower readability. Scores are generally between 0 and 100. The highest score possible (easiest readability) is 121.22 but only if every sentence consists of a single 1-syllable word. It is also possible to generate negative scores by including words with many syllables. 22 For context, the Harvard Law Review has a general readability score in the low 30s. Texts with scores between 90 and 100 are considered “very easy” to understand and are thought to be easily understood by a fifth-grade reader. Lower-FRES texts are progressively less easy to understand. 21
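As a minimal sketch of the arithmetic behind these bounds, the standard Flesch Reading Ease formula can be evaluated directly from raw counts. The function name and the example counts are illustrative assumptions; the online calculator used in this study may count syllables and sentences differently.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch Reading Ease formula computed from raw text counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Upper bound: every sentence is a single 1-syllable word.
best_case = flesch_reading_ease(words=1, sentences=1, syllables=1)

# Hypothetical dense passage: one 10-word sentence averaging 4 syllables per word.
dense_case = flesch_reading_ease(words=10, sentences=1, syllables=40)

print(round(best_case, 2))  # 121.22
print(dense_case < 0)       # True
```

The first call reproduces the 121.22 ceiling, and the second shows how long sentences of many-syllable words can push the score below 0.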
In contrast, with the FKGL and SMOG formulas, a lower score indicates easier readability. The result of the FKGL corresponds with a US grade level; for example, a result of 9.3 indicates a ninth-grade reading level. The SMOG formula calculates the number of polysyllabic words and converts this to a corresponding level of education needed to understand a piece of writing. 21
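The two grade-level measures can be sketched in the same way. The code below uses the commonly published FKGL and SMOG formulas; the function names and the counts in the example are hypothetical and are not taken from any website in this study.

```python
import math


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts (result is a US grade level)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def smog_grade(polysyllables: int, sentences: int) -> float:
    """SMOG grade; polysyllables is the number of words with 3 or more syllables."""
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291


# Hypothetical passage: 600 words, 30 sentences, 1020 syllables,
# 90 of the words having 3 or more syllables.
print(round(flesch_kincaid_grade(600, 30, 1020), 2))  # 12.27
print(round(smog_grade(90, 30), 2))                   # 13.02
```

In this made-up example, both formulas place the passage at roughly a 12th- to 13th-grade level, and a lower result from either function would indicate easier reading.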
Understandability and actionability of each website were evaluated with the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P). The PEMAT-P is a validated 24-point measure developed by Shoemaker et al 23 for the Agency for Healthcare Research and Quality to evaluate understandability and actionability of patient education materials through criteria such as content, word choice, use of numbers, organization, layout/design, and use of visual aids. According to PEMAT-P, education materials are deemed understandable when readers of diverse backgrounds and varying levels of health literacy can process and explain key messages, and materials are deemed actionable when consumers of diverse backgrounds and varying levels of health literacy can identify what they can do based on the information presented. 23 Higher PEMAT-P scores correspond to more easily understood materials. Whereas the FRES, FKGL, and SMOG are calculated objectively, PEMAT-P scoring is subjective, as various features of the website that contribute to understandability are graded as present or absent by reviewers. Because of the inherently subjective nature of PEMAT-P scoring methodology, PEMAT-P scores were independently calculated by 2 blinded reviewers (M.S. and G.E.S.). If reviewer scores differed by >10 points for a particular website, discrepancies were reviewed to resolve any inadvertent errors in scoring. Interrater reliability was assessed with Cohen’s kappa, calculated in Microsoft Excel.
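For readers unfamiliar with the statistic, the following is a minimal sketch of Cohen's kappa for 2 raters. It is not the Excel workbook used in this study; the function name and the item-level ratings (1 = feature present, 0 = absent) are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items with categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical present/absent ratings for 10 PEMAT-P items.
reviewer_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
reviewer_2 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
print(round(cohens_kappa(reviewer_1, reviewer_2), 3))  # 0.783
```

Here the reviewers agree on 9 of 10 items, but kappa is lower than 0.9 because part of that agreement would be expected by chance alone.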
Each website was analyzed for HONcode certification. The HON Foundation (Health on the Net) is a nonprofit organization that strives to identify high-quality online health information through its HONcode certification process. The foundation is supported by the Economic and Social Council of the United Nations. Websites can apply for certification and are then evaluated against the foundation’s 8 key principles to determine eligibility: authority, complementarity, confidentiality, attribution, justifiability, transparency, financial disclosure, and advertising. 24 An HONcode toolbar was installed on the research team’s Google Chrome web browser, which automatically indicated whether each website possessed HONcode certification. Microsoft Excel was used to perform unpaired 2-sample t tests to compare PEMAT-P scores from websites with and without HONcode certification and to evaluate statistical significance in differences in PEMAT-P, FRES, FKGL, and SMOG scores between the patient- and physician-targeted websites. An a priori P value <.05 was set as the threshold for statistical significance. A correlation between understandability and FRES, FKGL, and SMOG was also calculated.
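The comparisons described above can be sketched without a spreadsheet. The example below assumes the unequal-variance (Welch) form of the unpaired t test and a Pearson correlation; the text does not state which variants were used in Excel, so both choices are assumptions, as are the function names and all of the scores shown.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Unpaired 2-sample t statistic and degrees of freedom (Welch, unequal variances)."""
    var_a = variance(group_a) / len(group_a)
    var_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
    )
    return t, df


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical FKGL scores for patient- and physician-oriented sites.
patient_fkgl = [10.2, 11.5, 12.1, 9.8, 13.0]
physician_fkgl = [15.4, 16.1, 14.8]
t_stat, dof = welch_t(patient_fkgl, physician_fkgl)

# Hypothetical paired values: FKGL and PEMAT-P understandability (%) per site.
fkgl = [10.2, 11.5, 12.1, 13.0, 15.4]
understandability = [80.0, 78.0, 74.0, 72.0, 65.0]
r = pearson_r(fkgl, understandability)
print(t_stat < 0, r < 0)
```

The P value would then be read from a t distribution with the computed degrees of freedom, a step that Excel or a statistics library performs; it is omitted here to keep the sketch dependency free.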
Results
Of the 50 websites reviewed, 28 were eliminated because they were online versions of published journal research articles (n = 25) or lacked sufficient text (n = 3). There were 22 sites included in final analysis: 5 physician- and 17 patient-oriented sites. None of the websites analyzed had a reading level of the sixth grade or lower, as recommended by the AMA and the NIH ( Table 1 ). The FRES, FKGL, and SMOG scores (mean ± SD) for the entire cohort of 22 websites were 36.90 ± 20.65, 12.96 ± 3.28, and 15.65 ± 3.57, respectively. These scores fall in the “very confusing” range overall, with an expectation that readers would need a 12th- to 16th-grade education to comprehend the materials. The PEMAT-P understandability score was 73.65% ± 7.05% and the actionability score was 13.63% ± 22.47% ( Table 2 ). Cohen’s kappa, calculated to determine interrater reliability, was 0.89 (95% CI, 0.85-0.94), indicating an almost perfect degree of interrater agreement in assignment of PEMAT-P scores per the standards of Landis and Koch. 25
Table 1. Search Results for Vocal Cord Leukoplakia Performed on March 8, 2021.
Table 2. Comparison of Results for Patient- vs Physician-Targeted Websites.
Abbreviations: FKGL, Flesch-Kincaid Grade Level; FRES, Flesch Reading Ease Score; SMOG, Simple Measure of Gobbledygook.
Comparison of patient- and physician-oriented sites for readability, understandability, and actionability measures is shown in Table 2 . Patient-oriented sites were statistically more readable than the physician-oriented sites, with P < .02 for comparisons across the FRES, FKGL, and SMOG measures, but overall scores for patient-directed websites still indicate readability to be difficult and at a 12th-grade level. PEMAT-P scores were not statistically different between patient- and physician-oriented sites.
Of the 22 websites included for analysis, only 6 (27.3%) were HONcode verified. Of these 6 sites, 5 were patient targeted and 1 was physician targeted. Readability, understandability, and actionability scores for HONcode-verified versus nonverified sites are shown in Table 3 , with no difference in scores when analyzed by HONcode status.
Table 3. Results for HONcode-Verified Sites.
Correlations between readability scores and PEMAT-P understandability were calculated for the FKGL, SMOG, and FRES.

Correlation between PEMAT-P understandability score and FKGL. FKGL, Flesch-Kincaid Grade Level; PEMAT-P, Patient Education Materials Assessment Tool for Printable Materials.

Correlation between PEMAT-P understandability score and SMOG. PEMAT-P, Patient Education Materials Assessment Tool for Printable Materials; SMOG, Simple Measure of Gobbledygook.

Correlation between PEMAT-P understandability score and FRES. FRES, Flesch Reading Ease Score; PEMAT-P, Patient Education Materials Assessment Tool for Printable Materials.
Discussion
Over 70% of adults seek health-related information online.26,27 Unfortunately, most websites containing patient education material are written at a reading level beyond what is easily understood by most patients. In this study, the 22 sites with online education materials related to vocal cord leukoplakia were written at readability levels above those recommended by the AMA/NIH. These results are consistent with many other readability studies that have been published within otolaryngology7,8,20,28-33 and other specialties.34-43
Similarly, PEMAT-P understandability and actionability scores were low. The understandability score of 73.65% ± 7.05% in this study is comparable to, though slightly higher than, that of similar studies on other topics, which reported scores ranging from 62.8% to 66.0%.44-46 The actionability score in the current study of 13.63% ± 22.47% is comparable to scores in the literature.37,44,45 Although a high PEMAT-P score is better than a low score, there are no established guidelines for a PEMAT-P target to help guide writing. The low actionability score indicates that websites related to vocal cord leukoplakia did not adequately outline discrete steps that a patient could take in evaluation or management of one’s condition. To improve understandability and actionability, these websites might benefit from clear organization and discrete lists of actionable items to communicate next steps in care to patients.
Additional analysis compared patient- and physician-oriented websites. Although both categories had reading levels beyond what is recommended for online content, the patient-oriented websites were more readable than the physician-oriented sites based on FRES, FKGL, and SMOG scores. These results are consistent with other studies within otolaryngology.8,9,44 These results demonstrate an awareness that patient-oriented sites should be written in a way that is more easily read and interpreted. That these scores still fall short of readability standards suggests that even more deliberate care needs to be taken in creation of these online materials.
There were no significant differences in PEMAT-P scores of understandability or actionability between physician- and patient-oriented websites. Many other readability analyses within otolaryngology that used the PEMAT-P did not compare scores between patient- and physician-oriented sites. One study 47 did compare PEMAT-P results by authorship type and did not find differences in PEMAT-P scores among the groups (academic institutions, government agencies, websites from private practices, neutral web-based sites, and organizations such as nonprofits). Some studies have compared DISCERN scores (another measure to ascertain quality of websites) between patient and physician sources.8,9 These studies found a difference in DISCERN scores between patient- and physician-oriented websites in materials about in-office vocal fold injection 9 but not about swallowing difficulties. 8 One study 29 focused on nasal septoplasty and found that patient education materials originating from academic institutions had significantly higher scores in some DISCERN criteria than those originating from private clinics. Data from the current study suggest that there is no difference in understandability and actionability between sites about vocal cord leukoplakia directed at patients and those directed at physicians. Nevertheless, differences in PEMAT-P and DISCERN scores for these categories of websites may be attributable to whether the topic is a procedure rather than a disease state. Procedural topics such as vocal fold injection or nasal septoplasty might offer more opportunity to present discrete step-by-step instructions to a physician audience than websites about a condition or complaint. More research will be needed to explore this hypothesis.
This study found that 27.3% of included websites were HONcode verified, which is comparable to rates of 8% to 40% reported in other readability studies in the otolaryngology literature.10,47 Although HONcode has been in existence since the early days of the internet, participation in HONcode is voluntary and participating websites must pay for certifications. This model may limit participation and account for the relatively low rate of HONcode verification.
Interestingly, our study did not find a higher level of readability, understandability, or actionability in websites that were HONcode verified. A study on readability in online materials regarding laryngeal cancer found no difference in readability between HONcode- and non-HONcode–verified sites. 33 HONcode is meant to address the reliability and credibility of information but is not focused on readability or understandability. HONcode also does not rate the quality of information provided on a website, though it does define rules meant to hold website developers to basic ethical standards in the presentation of information and to help ensure that readers always know the source and purpose of the data they are reading.
As mentioned earlier, there was a moderate correlation in this study between easier readability, as measured by FRES, FKGL, and SMOG, and improved understandability, as measured by PEMAT-P. Other studies20,28 have performed similar analysis and found similar correlations, although interestingly a readability study on spasmodic dysphonia 47 found no correlation between FKGL and understandability. It is uncertain why some topics might demonstrate correlation between easier readability and improved understandability and others do not, especially as understandability as based on PEMAT-P scoring has more to do with formatting, structure, and the like than it does with length of words or sentences used. However, that the correlation is modest suggests that any attempt to improve websites cannot focus only on a desire to improve readability—goals should independently encompass efforts at improving understandability and actionability as well.
There are some limitations in the present study, which are inherent to all studies that evaluate readability and understandability of patient education materials. The readability formulas utilized were designed to analyze narrative texts rather than medical literature. Consequently, they were not intended to measure the readability of medical jargon, which can be more complicated in content than other narratives despite similar syllable counts or word length. Along this line of reasoning, the FRES, SMOG, and FKGL do not take into account shorter words that are of a higher reading level or are more difficult to understand. Although a limitation, this actually serves to reinforce that the majority of sources are too complex. Last, cohesion between sentences is an important factor in readability, which is not factored into the formulas.
Conversely, it is possible that the readability formulas could overstate complexity of the websites. For instance, the term leukoplakia contains 5 syllables, so the repeated use of this word on a website could lead to a higher level of complexity as calculated with the readability formulas. The term otolaryngology is similarly polysyllabic and may create an increase in syllable-per-word calculations as compared with ear, nose, and throat.
To test this, all instances of the word leukoplakia were replaced with the word plaque in search result 12; the FRES, FKGL, and SMOG changed from 25.9 to 33.1, 13.5 to 12.5, and 11.4 to 10.8, respectively. This change had a fairly modest impact, but it does suggest that at least a portion of the poor readability scores may relate to the length of medical terminology.
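A substitution test of this kind can be approximated with a short script. The sketch below is not the online calculator used in this study: its syllable counter is a crude vowel-group heuristic (it counts leukoplakia as 4 syllables rather than 5), and the sample text and function names are invented for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of consecutive vowels, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease for raw text, using the heuristic syllable counter."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))


# Hypothetical sample text, not drawn from any website in the study.
sample = (
    "Leukoplakia is a white patch on the vocal cord. "
    "A doctor may biopsy the leukoplakia."
)
swapped = re.sub(r"leukoplakia", "plaque", sample, flags=re.IGNORECASE)

before, after = flesch_reading_ease(sample), flesch_reading_ease(swapped)
print(after > before)  # True
```

Because the word and sentence counts are unchanged by the swap, any gain in FRES comes entirely from the lower syllable count of the shorter term.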
The PEMAT-P tool has some limitations. It was designed to allow a layperson to evaluate the quality of health literature, but it does not assess the scientific accuracy of specialist information. This article did not assess the accuracy of information in the websites, although clearly that is an important issue as well. Additionally, our protocol was limited to websites written in English and did not analyze videos. Websites that are written in different languages and video materials might be of different quality and readability but are beyond the scope of this study.
Conclusion
Websites on vocal cord leukoplakia are written at a level beyond what is recommended by the AMA and the NIH. None of the websites analyzed in this study met the recommendation of a sixth-grade reading level, and in aggregate these websites were above a 12th-grade reading level. Patient-targeted websites were written at a less advanced reading level than professional-targeted sites, but they were not significantly more understandable or actionable. Many patients go online to seek medical knowledge. Written materials are a valuable supplement to verbal communication, and information found online can reinforce topics discussed during face-to-face visits and improve overall understanding of a condition or proposed procedure. It is important that these online education materials be made more readable, understandable, and actionable to help direct appropriate patient-centered care.
