Abstract
Since 2005, the American Board of Vascular Medicine (ABVM) endovascular examination has been used to certify vascular practitioners. Annual rigorous review has confirmed it is psychometrically valid and reliable. However, the evidence basis underlying the examination items has not been studied systematically. The aim of this study was to adjudicate class of recommendation (COR) and level of evidence (LOE) for the 2015 ABVM endovascular examination and establish an additional feedback mechanism for examination improvement based on contemporary evidence-based guidelines. We performed a pooled consensus process to classify each of the 110 items in the 2015 ABVM endovascular examination by COR and LOE as detailed in the current guideline statements. We added categories for items that were not eligible for assignment using traditional evidence-based metrics: ‘COR X’, cannot be determined, not applicable, or simple recognition; and ‘LOE X’, cannot be determined or not applicable. COR classifications were assigned in the following proportion: Class I=15%, Class II=40%, Class III=3%, COR X=42%. LOE classifications were assigned in the following proportion: Level A=12%, Level B=34%, Level C=32%, LOE X=22%. Our analysis showed that more than half of the 2015 ABVM endovascular examination items were supported by strong scientific evidence or fact-based knowledge. COR and LOE analysis yielded notably different results. Alternate classification schemata may be powerful tools for improving certification exams in healthcare.
Introduction
The exponential expansion of research and knowledge in the diagnosis and treatment of patients with arterial, venous, and lymphatic disease created the need for systematic training and credentialing for vascular medicine practitioners. Members of the Society for Vascular Medicine (known at the time as the Society for Vascular Medicine and Biology or SVMB), American College of Cardiology, and the Society for Cardiovascular Angiography and Interventions recognized this unmet need and established the American Board of Vascular Medicine (ABVM) in 2005 to certify vascular practitioners. The ABVM created two pathways to evaluate physicians with either formal training or extensive practice experience in noninvasive vascular medicine (the general examination) or interventional vascular medicine (the endovascular examination). 1 The primary objective of creating the ABVM endovascular certification examination was to provide an accurate assessment of knowledge needed to safely and appropriately perform endovascular intervention. The examination has met or exceeded accepted psychometric standards for validity and is therefore taken to reflect a practitioner’s ability to practice competently. 2 Over 1000 examinees have passed the endovascular exam since it was first offered in 2005.
Generally, physician healthcare certification examination design begins with construction of an examination content outline (ECO), which consists of various tasks and/or knowledge areas that are organized into content domains that can be used to structure a balanced exam. The ECO domains are often weighted by experts based on importance and relevance to independent practice for the competency being evaluated. The number of items required for each content domain and associated constituent task or knowledge is used to ensure that a valid representation of content is tested and assesses the typical practice paradigms the examinee may encounter. 3 In some cases, classification schemes are utilized, including cognitive level (often modeled after Bloom’s Taxonomy), 4 body system (often seen in physician examinations and used to ensure no over- or underrepresentation of specific bodily functions or anatomical structures occurs), disease state or condition (similar to the previous), and task or knowledge if not already addressed within the ECO. Analysis of the quality of board examinations centers around validity (the degree to which the examination content is reflective of job tasks and professional competencies), reliability (the consistency and replicability of examinee response patterns), and decision outcomes (e.g. pass/fail rates according to region, demographics, or institution). 5
Evidence-based medicine has many definitions but for the purpose of this study is considered as assimilating clinical trial results and applying these findings to clinical decision making. 6 A hierarchical system has been devised to categorize evidence by class (strength) of recommendation and level (quality) of evidence. 7 Ideally, providing guideline-based care translates into better clinical care and in theory should be a critical component of certification exams, particularly in the interventional subspecialties. 8 However, there is an extensive foundation of knowledge needed to practice independently in any field and it is expected that many questions needed to assess competency may not fit neatly into an existing classification scheme. For example, some ‘fact-based’ questions focus on clearly defined, established facts such as anatomic definitions, angiographic findings, a biochemical reaction, patency, or complication rates.
In an effort to better understand the composition of the 2015 ABVM endovascular examination, we evaluated the class of recommendation (COR) and level of evidence (LOE) for each exam item with the goals of: (1) better understanding what percentage of the exam questions were addressed by multidisciplinary guidelines or strong evidence-based data and (2) stimulating thought on if and how evidence-based metrics can be used to evaluate competency exams with the proposed model or a modification thereof.
Methods
For the ABVM endovascular examination, the ECO delineates the high-level job tasks (domain), knowledge areas (subdomain), and, when applicable, disease state or condition (content area) in a hierarchical list (Table 1). The 2015 ABVM endovascular examination consisted of 110 scored items. Item writers were asked to prepare questions in one of several ECO areas with answers supported by evidence from the literature when available. Occasionally, the ‘pattern recognition’ needed to answer a question was not used as a gateway to an evidence-based answer, but rather to test the ability to recognize an important finding such as an anatomic variant or angiographic complication. Such recognition items are unsuitable for evidence-based analysis. In addition, other items could involve conditions that are less common and not amenable to evidence-based trials. Both kinds of items can be adjudicated based upon validity and reliability, but not on COR and LOE.
Table 1. Examination content outline (ECO).
Three subject matter experts who serve on the Board of Directors for the ABVM and a psychometrician classified each of the scored items from the 2015 examination. The overall project, as well as authorization for access to the 2015 examination items, was approved by the ABVM Board. The group convened by webinar four times between January 2016 and March 2016. A pooled consensus process was used to classify each item. 9 There was unanimous agreement on each item by the end of the process. Items were categorized by COR and LOE. The overall process for classifying each item is shown in Figure 1.

Figure 1. Approach to classifying examination items. An item was deemed ineligible if either class of recommendation (COR) could not be assigned as I, II, or III, or if level of evidence (LOE) could not be assigned as A, B, or C. For some items, neither class of recommendation nor level of evidence could be assigned (X).
Class of recommendation was defined as COR I (strong recommendation); COR II (moderate to weak recommendation); COR III (no benefit, or harm). We created an additional category called COR X for items where classification could not be determined, is not applicable, or requires recognition of a clinical condition.
Items were further categorized by LOE. LOE A indicates multiple populations evaluated, with data derived from multiple randomized clinical trials (RCTs) or meta-analysis. LOE B indicates limited populations evaluated, with data derived from a single RCT or nonrandomized studies. LOE C indicates very limited populations evaluated, with data coming from the consensus opinion of experts, case studies, or standard of care. We created an additional category to label certain items that did not readily fall within this scheme: level of evidence X. LOE X indicates that the LOE cannot be determined or is not applicable, which most often represents information that is supported by the least scientific evidence or applies to knowledge of anatomic variations, image recognition, or pathognomonic clinical findings.
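The classification scheme described above can be expressed as a small data model. The following is a minimal sketch, not the tooling used by the authors: the names `COR`, `LOE`, `is_eligible`, and `proportions` are hypothetical, and the item labels in the usage note are invented for illustration rather than taken from the examination.

```python
from collections import Counter
from enum import Enum


class COR(Enum):
    I = "I"      # strong recommendation
    II = "II"    # moderate to weak recommendation
    III = "III"  # no benefit, or harm
    X = "X"      # cannot be determined, not applicable, or recognition only


class LOE(Enum):
    A = "A"  # multiple RCTs or meta-analysis
    B = "B"  # single RCT or nonrandomized studies
    C = "C"  # expert consensus, case studies, or standard of care
    X = "X"  # cannot be determined or not applicable


def is_eligible(cor, loe):
    """Eligibility rule from the classification flow: an item is
    ineligible if either COR or LOE had to be assigned as X."""
    return cor is not COR.X and loe is not LOE.X


def proportions(labels):
    """Percentage of items carrying each label, rounded to one decimal."""
    counts = Counter(labels)
    total = len(labels)
    return {label.value: round(100 * n / total, 1) for label, n in counts.items()}
```

For example, `proportions([COR.I, COR.X, COR.X, COR.II])` gives 25% for COR I, 50% for COR X, and 25% for COR II, which is the same kind of tally reported for the 110 scored items.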
When an examination item assessed knowledge supported by published guidelines (e.g. CHEST guidelines for anticoagulation in a patient with isolated popliteal deep venous thrombosis), the authors utilized the COR and LOE endorsed by the society or working group. 10 If new information was available to upgrade the LOE, the higher LOE was assigned to the item.
We provide two example questions that were classified as COR X, LOE X:
A 52-year-old woman has a hypoplastic aorta and associated chronic occlusion extending from just below the renal arteries to the distal external iliac arteries. The patient experiences rest pain in the left leg immediately after undergoing coronary artery bypass surgery. There is no discoloration of the leg, but she has diminished pulses on Doppler ultrasound. What is the most likely cause of the patient’s ischemia?
Heparin-induced thrombocytopenia
Compromised collateral vessels from vein harvesting
Embolic event in the lumbar collateral veins
Use of the internal mammary artery
A 61-year-old patient underwent catheter-directed thrombolysis for an occluded prosthetic femoropopliteal bypass of the left leg from the contralateral femoral artery. Successful lysis was achieved after 16 hours. Lytic therapy was stopped and anticoagulation was continued. Twenty-four hours later, the patient reported having leg pain for approximately 1 hour and the inability to lift the left leg off the bed. What is the most likely cause of the patient’s symptoms?
Psoas muscle hematoma
Spinal cord compression from a hematoma
Intracranial hemorrhage
Rethrombosis of the graft
Results
There were four sections on the examination, each with different weighting: pathophysiology, risk factors, and screening (8.2%); diagnosis and testing (16.4%); treatment and interventions (63.6%); and prognosis, surveillance, and re-intervention (11.8%) (Table 1). The specific content for the examination reflects the scope of practice for a well-trained endovascular specialist and was divided into arterial diseases (64%), venous diseases (23%), atypical vascular disorders (10%), and lymphatic diseases (3%).
A total of 110 items were evaluated for COR and LOE (Supplemental Tables 1 and 2). The distribution of COR is shown in Figure 2. COR I, II, or III was assigned to 58.2% of items and COR X to 41.8%. Across the four content areas, the proportion of items classified as COR I, II, or III was 33.0% in ‘pathophysiology, risk factors and screening’, 21.1% in ‘diagnosis and testing’, 53.8% in ‘prognosis, surveillance, and re-intervention’ and 72.8% in ‘treatment and interventions’.

Figure 2. Class of recommendation (COR). Stacked column bar graph shows the percentage of items classified as COR I, COR II, COR III, and COR X for each of the four major topic areas on the 2015 ABVM endovascular examination: pathophysiology, risk factors and screening; diagnosis and testing; treatment and interventions; and prognosis, surveillance and re-intervention.
The distribution of LOE is shown in Figure 3. LOE was categorized as A, B, or C in 78.2% of items, with the remaining 21.8% not eligible for categorization (i.e. LOE X). Across the four content areas, the proportion of items adjudicated as level A, B, or C was 77.8% in ‘pathophysiology, risk factors and screening’, 50.0% in ‘diagnosis and testing’, 92.3% in ‘prognosis, surveillance, and re-intervention’ and 82.9% in ‘treatment and interventions’.

Figure 3. Level of evidence (LOE). Stacked column bar graph shows the percentage of items classified as LOE A, LOE B, LOE C, and LOE X for each of the four major topic areas on the 2015 ABVM endovascular examination: pathophysiology, risk factors and screening; diagnosis and testing; treatment and interventions; and prognosis, surveillance and re-intervention.
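The per-section LOE percentages can be reconciled with the overall figure by simple arithmetic. The sketch below assumes that each section's item count is the section weight applied to the 110 scored items and rounded to a whole item. These counts are inferred here, not reported in the text, and the names `SECTIONS` and `implied_counts` are hypothetical.

```python
TOTAL_ITEMS = 110

# (section weight as % of exam, % of section items with LOE A, B, or C),
# both taken from the reported results.
SECTIONS = {
    "pathophysiology, risk factors, and screening": (8.2, 77.8),
    "diagnosis and testing": (16.4, 50.0),
    "treatment and interventions": (63.6, 82.9),
    "prognosis, surveillance, and re-intervention": (11.8, 92.3),
}


def implied_counts(sections, total):
    """Infer whole-item counts per section (an assumption, see lead-in)."""
    out = {}
    for name, (weight_pct, loe_abc_pct) in sections.items():
        n_items = round(total * weight_pct / 100)      # items in the section
        n_abc = round(n_items * loe_abc_pct / 100)     # items with LOE A/B/C
        out[name] = (n_items, n_abc)
    return out


counts = implied_counts(SECTIONS, TOTAL_ITEMS)
n_items_total = sum(n for n, _ in counts.values())
n_abc_total = sum(k for _, k in counts.values())
overall_pct = round(100 * n_abc_total / n_items_total, 1)

print(counts)
print(n_items_total, n_abc_total, overall_pct)  # 110 86 78.2
```

Under these assumptions the inferred section sizes are 9, 18, 70, and 13 items, and the 86 items with LOE A, B, or C reproduce the overall 78.2% reported above.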
Discussion
The desire for physician certification (primary and specialty) is seen at all levels: physicians, hospitals, payers, and patients. Hospitals and third-party payers want practitioners who meet the highest standards and provide value; patients find reassurance in going to a physician who is certified; 11,12 and physicians want to demonstrate that they have achieved competence by passing a rigorous examination. 13,14
The endovascular examination item bank has been populated by experts in the field with hundreds of items that have undergone adjudication by a committee representing the diversity within the practitioner population. In 2013, the ABVM endovascular examination underwent rigorous evaluation and revision that enabled it to accurately reflect the scope of endovascular practice, and allow for the development of a fair, accurate, and realistic assessment of each candidate’s readiness for certification. 2 Each examination item exhibited acceptable classical-test-theory performance characteristics (p-value, point-biserial correlation) and the overall exam exhibited acceptable scale-level performance characteristics (Cronbach’s alpha reliability coefficient, Livingston decision consistency index), all of which are common and generally accepted performance indicators in the field of professional credentialing.
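For readers unfamiliar with the classical-test-theory indicators named above, the following is an illustrative sketch of how item difficulty (p-value), the point-biserial correlation, and Cronbach's alpha are commonly computed for dichotomously scored items. It is not the ABVM's or its vendor's procedure: the function names and the four-examinee response matrix in the usage note are hypothetical, the point-biserial shown is the uncorrected form (the item is included in the total score), and the Livingston index is not covered.

```python
from statistics import mean, pstdev, pvariance


def item_difficulty(item_scores):
    """Classical p-value: proportion of examinees answering correctly."""
    return mean(item_scores)


def point_biserial(item_scores, total_scores):
    """Correlation between a 0/1 item and the total score.
    Assumes the item has both correct and incorrect responses."""
    p = mean(item_scores)
    q = 1 - p
    sd_total = pstdev(total_scores)
    mean_correct = mean([t for i, t in zip(item_scores, total_scores) if i == 1])
    mean_incorrect = mean([t for i, t in zip(item_scores, total_scores) if i == 0])
    return (mean_correct - mean_incorrect) / sd_total * (p * q) ** 0.5


def cronbach_alpha(responses):
    """responses: one row per examinee, each a list of 0/1 item scores."""
    k = len(responses[0])
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

As a usage example with invented data, take `responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]`. The first item has a difficulty of 0.75 and a point-biserial correlation of about 0.77 against the total scores `[3, 2, 1, 0]`, and `cronbach_alpha(responses)` returns 0.75.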
Although these statistical measures assure that the examination meets psychometric standards, they say little about the evidence basis underlying each item. In theory, the ‘best’ examination would consist entirely of items supported by COR I, LOE A data. However, vascular intervention involves such a heterogeneous group of patients with arterial, venous, and lymphatic disease that many subgroups (e.g. subclavian stenting, mesenteric stenting) do not occur with sufficient frequency to warrant randomized trials. In addition, there are important domains in each field that must be tested and are not conducive to rigorous clinical trials, even though the answers are scientifically grounded.
In the present study, we found that 58% of items were eligible for categorization as COR I, II, or III, and 78% of items were eligible for categorization as LOE A, B, or C. Thus, the majority of items pertaining to clinical strategies, interventions, treatments, or diagnostic testing were supported by strong evidence in the literature.
Since this is the first article to examine the evidence basis underlying board examination items, there are no benchmarks for a comparison of these findings. The results of the COR and LOE assignments yielded different distribution shapes, with COR showing no discernible pattern and LOE resembling a ‘normal curve’. The authors theorize that the difference in distribution may reflect a difference in each measure’s level of fidelity with regard to practice. In other words, COR is less easily quantified due to potential variability and myriad nuances involved with clinical decision-making and expert judgment, while LOE may be more suited to the classification of items on physician certification exams because it characterizes the evidence required to perform a job task. Further exploration and research are needed to support or refute this explanation.
The heterogeneity of examination item types makes evidence-based analysis challenging. The hierarchy of evidence from case reports to randomized trials is most relevant when evaluating different therapeutic options, quality improvement, cost effectiveness, and preventive measures. However, items pertaining to prognosis and etiology/harm may be best answered by cohort or case–control studies. In addition, there are items that require fact-based knowledge including anatomic recognition, pathognomonic clinical findings, and image recognition that meet the same academic value of ‘correctness’ as COR I evidence. Finally, items regarding diagnostic testing are often based on sensitivity and specificity data, while items on devices may require knowledge of instructions for use, and selecting appropriate imaging studies may be based on imaging guidelines, all of which may be driven by different LOE models.
Limitations
We acknowledge certain limitations with our analysis. We categorized questions using a pooled consensus approach, which carries potential shortcomings of influence from a dominant individual, noise, and group pressure for conformity. 15 We believed that a pooled approach was reasonable given the small group size and its history of working together in a collegial fashion on the ABVM Board. To our knowledge, this represents the first article to categorize the evidence underlying a board certification examination. We cannot compare the structure of the ABVM examination with exams from other certification bodies (e.g. American Board of Internal Medicine). Thus, we cannot state that the percentage of questions eligible for categorization is good or bad. However, this article provides a benchmark against which future exams may be compared. We believe the ABVM endovascular exam is a fair measure of an examinee’s competence, and we were pleased to see the high percentage of questions that had a strong evidence-based foundation.
With the evolution and growth within the discipline of vascular and endovascular medicine, there has been increased emphasis on developing quality care metrics through comparative effectiveness, cost effectiveness, and evidence-based decision-making. 16 Accordingly, this study provides a unique look at the intersection between the rapidly changing fields of vascular intervention, medical knowledge competency testing, and evidence-based medicine. These measures may provide insight into how the body of knowledge is expanding and allow for development of exam quality–improvement metrics while at the same time provide confidence to interested parties who require physician certification. We believe this study sets a benchmark against which subsequent ABVM endovascular examinations can be measured, and may serve as a catalyst for other groups to evaluate other classification schemes, such as COR and LOE, on their certification examinations.
Conclusion
In conclusion, our analysis showed that more than half of the 2015 ABVM endovascular examination items were supported by strong scientific evidence or fact-based knowledge. COR and LOE yielded notably different results. Alternate classification schemata, such as the ones explored herein, may be powerful tools for improving future certification exams in healthcare, but additional studies and model refinement are needed.
Footnotes
Declaration of conflicting interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The authors report that Amin Saiar works for PSI Services LLC, which provides testing services to the American Board of Vascular Medicine.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
