Introduction
The renal system is important for filtering blood, regulating arterial blood pressure, and producing urine. Narrowing of the arteries that supply the kidney, or renal artery stenosis (RAS), can lead to blood pressure dysregulation (ie, hypertension) or renal failure. It has been documented that approximately 24% of adult patients with resistant hypertension also exhibit RAS. 1 Digital subtraction angiography (DSA) using fluoroscopy imaging is the recognized gold standard, but is also the most invasive testing method, requiring sedation, radiation, and radiographic contrast. 2 Other techniques, such as computed tomography angiography (CTA) or magnetic resonance angiography (MRA), provide accurate alternatives to DSA, 3 but CTA uses considerably more contrast 4 and MRA depends on the patient’s ability to be still for a prolonged examination period. Although an accurate method for identifying RAS, angiography may not be tolerated by all, is relatively costly, and requires specialized personnel and equipment. Therefore, noninvasive and less costly options to identify RAS are warranted.
Duplex ultrasound (DUS) may be an alternative imaging modality for identifying narrowing of the renal arteries, with DUS parameters such as peak systolic velocity being characteristically high in patients with RAS. 5 Duplex ultrasound could be used as a relatively affordable, accessible, noninvasive, and safer (ie, absence of ionizing radiation and contrast material) method to assess RAS severity. In some cases, DUS serves as a screening tool for RAS. 6 However, the evidence base for the efficacy of DUS to detect RAS is inconclusive. A review by Williams et al 7 summarizing all studies up to 2005 (most articles: 1980-1995) reported inconclusive, mixed evidence for the validity of DUS to detect RAS. Like any technology, DUS imaging has improved decade-to-decade, producing sharper images and increased portability, but our understanding of modern ultrasound in diagnosing RAS is limited. Accordingly, the mixed results from studies primarily conducted 20 to 40 years ago are based on inferior DUS technology, and existing machines provide superior images and thus a potentially greater ability to detect RAS. In addition, study quality was not assessed in the previous review, 7 making it difficult to discern whether results are based on low-quality or high-quality evidence. An up-to-date systematic review that evaluates the evidence for modern DUS to predict RAS and provides quality assessments of the included research is needed to better inform imaging practitioners on the utility of DUS in assessing RAS.
The purpose of this systematic review was to determine whether DUS accurately detects RAS in comparison with angiography. Given the improvements in DUS technology and prior review of the evidence pre-2005, 7 this review included studies post-2005 only. This review summarized results across studies varying in the definition of RAS (ie, % stenosis), ultrasound parameters used, and comparator criteria that may complement future work aimed at identifying the ideal and most appropriate diagnostic parameters and stenosis thresholds. The outcome measures of interest to determine the agreement between DUS and the criterion measures were diagnostic sensitivity, specificity, and accuracy.
Methods
Search Strategy
The search strategy and systematic review procedures were preregistered in Open Science Framework (DOI: 10.17605/OSF.IO/SE9VN) prior to conducting the study. Literature searches were conducted using Scopus, EMBASE, MEDLINE, CINAHL, and Academic Search Premier databases up to November 10, 2022. Our search strategy is presented in Supplemental Table 1, including an example. Search strategies were developed using ultrasound and stenosis as primary search terms by researchers with prior experience conducting review studies.8,9 Following recommendations, 10 no restrictions were placed on the study type (eg, crossover study, randomized controlled trial, etc), language, year, or population (eg, human studies) at the search stage; these restrictions were instead applied at the screening stage.
This review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement. 11 Article citations were downloaded to an online research management system (Mendeley, Elsevier, Amsterdam, The Netherlands) and duplicates were removed. Remaining references were exported to systematic review software for screening (Covidence, Melbourne, Australia).
Inclusion and Exclusion Criteria
Studies not published in a peer-reviewed journal, or published as an editorial, opinion, review, or conference abstract were excluded. Gray literature would not provide sufficient information to be fully included in the study, and therefore, these studies were excluded. No language or timeline restrictions were implemented into the search strategy, but only articles published post-2005 and available in English were extracted. All studies included human participants, with studies in both children (if available) and adults included.
To be included in the review, studies must compare the detection of RAS by DUS with DSA (gold-standard criterion) or any of the other reference criterion standards, such as MRA and CTA. The most definitive test for RAS is DSA, but MRA and CTA have been demonstrated to exhibit high diagnostic accuracy relative to DSA 3 and serve as acceptable reference standards. Results are presented separately for studies using DSA as a reference versus these other surrogate imaging techniques, with the most emphasis placed on studies employing DSA. All studies must conduct these measures in each participant and indicate the diagnostic accuracy of the ultrasound measurements (eg, accuracy, sensitivity, specificity, and/or kappa).
The accuracy of DUS relative to the criterion measure is characterized as the proportion of arteries that was correctly diagnosed with or without stenosis ([true positive + true negative]/all arteries assessed). Sensitivity is the proportion of arteries with RAS on the criterion measure that DUS also identified as stenosed (true positive/[true positive + false negative]). Specificity is the proportion of arteries without RAS on the criterion measure that DUS also identified as not stenosed (true negative/[true negative + false positive]). Negative predictive value (NPV) is the likelihood that a person who has a negative test (ie, no stenosis identified) does not have RAS (true negative/[true negative + false negative]). Conversely, positive predictive value (PPV) is the likelihood that a person who has a positive test (ie, stenosis identified) does have RAS (true positive/[true positive + false positive]). For each statistic, heuristic thresholds of > 80%, 60% to 80%, and < 60% were considered high, moderate, and low agreement, respectively. 12 As outlined above, only studies post-2005 (≥ 2006) were included.
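The statistics defined above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used by any included study: the class and function names are hypothetical, the counts in the example are invented, and the 80% and 60% boundary values are assigned to the moderate category based on our reading of the thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Arteries cross-classified by DUS (index test) and angiography (criterion)."""
    tp: int  # DUS positive, criterion positive
    fp: int  # DUS positive, criterion negative
    fn: int  # DUS negative, criterion positive
    tn: int  # DUS negative, criterion negative


def diagnostic_metrics(t: TwoByTwo) -> dict:
    """Return each diagnostic statistic as a percentage."""
    total = t.tp + t.fp + t.fn + t.tn
    return {
        "sensitivity": 100 * t.tp / (t.tp + t.fn),
        "specificity": 100 * t.tn / (t.tn + t.fp),
        "ppv": 100 * t.tp / (t.tp + t.fp),
        "npv": 100 * t.tn / (t.tn + t.fn),
        "accuracy": 100 * (t.tp + t.tn) / total,
    }


def agreement_level(percent: float) -> str:
    """Heuristic categories: > 80% high, 60% to 80% moderate, < 60% low."""
    if percent > 80:
        return "high"
    if percent >= 60:
        return "moderate"
    return "low"


# Hypothetical counts for illustration; not taken from any included study.
example = TwoByTwo(tp=45, fp=10, fn=5, tn=40)
metrics = diagnostic_metrics(example)
levels = {name: agreement_level(value) for name, value in metrics.items()}

for name, value in metrics.items():
    print(f"{name}: {value:.1f}% ({levels[name]})")
```

The sketch does not guard against empty cells (eg, no criterion-positive arteries), which would cause a division by zero.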
Screening Process, Data Extraction, and Bias Assessment
The titles and abstracts of citations were screened independently by 2 reviewers who identified potential articles for inclusion. The full text of apparently relevant articles was obtained and screened by the same 2 reviewers. If a consensus could not be reached, a third reviewer acted as the arbiter. The reference list of included articles was back searched for other potentially relevant articles. The information extracted included the location of study, number of participants and arteries, age group and participant characteristics (eg, sex), definition of stenosis used, criterion measure, and sonographer background (if presented).
Assessment of bias and applicability concerns was guided by the Quality Assessment of Diagnostic Studies 2 (QUADAS-2) tool that is recommended for use in systematic reviews to evaluate the risk of bias and applicability of primary diagnostic accuracy studies. 13 Articles were assessed at the study level, evaluating patient selection, index testing, reference testing, flow, and timing. A detailed description of how the QUADAS-2 is scored is presented elsewhere, 13 but in general, there are signaling questions for each risk of bias and applicability category for authors to consider. For example, risk of bias for the index test asks authors to consider whether results of the comparator test are interpreted without knowledge of the reference standard and whether a prespecified threshold was used for sensitivity/specificity. For flow and timing, a risk of bias may be elevated if there was a long interval between the comparator and reference standard measures, as well as the implementation of multiple reference standards. While decisions for study quality are systematic, they are inherently dependent upon reviewer decisions. Similar to the article screening process, quality assessment for each article was independently completed by 2 reviewers. A third reviewer was consulted to make a final decision in each instance of disagreement between the 2 reviewers.
Results
Study Characteristics
Our search included Scopus (n = 415), MEDLINE (n = 711), EMBASE (n = 370), CINAHL (n = 135), and Academic Search Premier (n = 118). After duplicates (n = 635) were removed, 1114 articles were screened. As presented in Figure 1, 34 articles met our inclusion criteria after full-text screening.

Figure 1. Flow chart indicating the number of articles included or excluded at each stage of the screening process.
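As a quick arithmetic check, the database counts reported above can be tallied to confirm the number of records screened. The variable names below are illustrative only.

```python
# Records retrieved per database, as reported in the Study Characteristics text.
records = {
    "Scopus": 415,
    "MEDLINE": 711,
    "EMBASE": 370,
    "CINAHL": 135,
    "Academic Search Premier": 118,
}
duplicates = 635
included = 34

identified = sum(records.values())   # total records before de-duplication
screened = identified - duplicates   # records screened by title and abstract

print(f"Identified: {identified}, screened: {screened}, included: {included}")
```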
The included studies encompassed 2968 unique participants and reported a total of at least 1281 unique females (Table 1). The age of participants ranged from 9 to 95 years. Only 1 study specifically included patients with fibromuscular dysplasia and Takayasu’s arteritis. 14 Studies defined stenosis primarily as ≥ 50% (n = 19) or ≥ 60% (n = 11) artery narrowing as a diagnostic indicator of severe RAS, with some not reporting a definition or using a graded scale (n = 4). A single reference measure was performed in most studies (n = 31), while multiple reference measures were performed in others (n = 3). The reference measures included DSA (n = 10), CTA (n = 4), contrast-enhanced CTA (n = 3), MRA (n = 5), contrast-enhanced MRA (n = 2), conventional cut-film angiography (n = 7), selective renal angiography (SRA; n = 4), and conventional angiography (CVA) with transstenotic pressure gradient measurement (n = 1). Doppler ultrasound was conducted directly (ie, evaluation of the main renal artery; n = 14), indirectly (ie, evaluation of Doppler waveforms in renal parenchyma; n = 3), or using a combination of both (n = 17).
Table 1. Characteristics of Included Studies Comparing Duplex Ultrasound and Angiography-Determined Renal Artery Stenosis.
Note. DSA = digital subtraction angiography; PSV = peak systolic velocity; Sens = sensitivity; Spec = specificity; EDV = end diastolic velocity; RAR = renal aortic ratio; RRR = renal-renal ratio; PPV = positive predictive value; NPV = negative predictive value; TA = Takayasu arteritis; ACC = accuracy; NR = not reported; RAS = renal artery stenosis; CE-MRA = contrast enhanced magnetic resonance angiography; CVA = conventional angiography; SRA = selective renal angiography; CTA = computed tomography angiography; CE-CTA = contrast enhanced computed tomography angiography; MRA = magnetic resonance angiography; RI = resistance index; RSR = renal-segmental ratio; RIR = renal-interlobar ratio; FMD = fibromuscular dysplasia; PGM = transstenotic pressure gradient measurement.
Comparison With DSA
Ten of 34 studies compared RAS diagnoses using ultrasonography with DSA (DSA only: n = 9; DSA, MRA, and CTA: n = 1). The results from studies comparing DUS with DSA only are presented in this section. Two studies reported accuracy as their main outcome.15,16 While 1 study that defined stenosis as > 50% reported high agreement for accuracy (86.7 ± 3.8%), 15 another that graded stenosis severity (ie, Grade I-Grade IV) reported moderate agreement (75%). 16 Six of 9 studies reported both NPV and PPV.15-20 Five of 6 studies reported high agreement for NPV (range: 83%-100%).15,17-20 However, Cui et al 16 using a graded definition of stenosis reported moderate NPV agreement (60%). Three of 6 studies reported high PPV (range: 87%-99%)16,18,19 and 1 study reported moderate PPV. 17 However, Tola et al 15 reported high accuracy during B-flow imaging and a renal aortic ratio of > 3.5, but a moderate PPV during a peak systolic velocity of > 200 cm/s. Saeed et al 20 reported moderate and low PPV for agreement on a kidney versus patient basis, respectively. All 9 studies reported sensitivity and specificity as a main outcome.15-23 Five of 9 reported high sensitivity (range: 80%-98%),17-20,22 while 1 study reported moderate sensitivity (74%). 23 The study by Zhu et al 23 was conducted in a sample that was middle-aged on average and included only 32 participants (Table 1). Tola et al 15 reported high sensitivity during B-flow imaging (88%) and peak systolic velocity > 200 cm/s, but moderate sensitivity during renal aortic ratio of > 3.5. In addition, AbuRahma et al 21 reported moderate sensitivity using a peak systolic velocity of 285 cm/s and renal aortic ratio of 3.7, but low sensitivity using an end diastolic velocity of 65 cm/s in a large sample.
Similarly, 4 of 9 studies observed high agreement for specificity (84%-99%).17-19,21 Three of 9 studies reported moderate specificity,16,20,23 and 2 of these had relatively small sample sizes (both < 70).16,23 Finally, 2 studies reported mixed results.15,22 Specifically, Soares et al 22 reported high specificity (91%) under a renal aortic ratio of > 3.0, but moderate specificity under a peak systolic velocity of > 200 cm/s. Tola et al 15 reported high specificity under B-flow imaging and renal aortic ratio of > 3.5, but moderate specificity under a peak systolic velocity of > 200 cm/s. In general, studies using DSA as a criterion demonstrated moderate-high agreement with DUS.
Bias and Applicability Assessment for DSA
The risk of bias assessment and applicability concerns for each study are presented in Table 2 and Figure 2. Of the 9 studies that used DSA only as a criterion, all demonstrated low applicability concerns for patient selection, the index test, and reference standard.15-23 Only 1 study reported unclear risk of bias for patient selection 18 and 2 studies reported unclear 18 or high 17 risk of bias for the reference standard. However, 3 studies demonstrated either unclear16,18 or high 17 risk of bias for the index test. Four of 9 studies reported unclear15-17 or high 22 risk of bias for flow and timing of data collection as the time interval between the DUS and DSA measurements was not reported.
Table 2. Risk of Bias Assessment and Applicability Concerns of Each Included Study Using the QUADAS-2 Tool.
Note. L = low risk; H = high risk; U = unclear risk.

Figure 2. Graphical display of the risk of bias and applicability concerns of included studies using the QUADAS-2 tool. (A) Risk of bias. (B) Applicability concerns.
DUS versus Other Angiography Techniques
Twelve of 25 studies reported accuracy as a main outcome.12,14,24-31,32,33 Six of those 12 studies reported high agreement with CVA,14,30,33 SRA,25,29 and MRA 24 (range: 81%-98%). The remaining 6 of 12 studies reported moderate agreement, 28 low agreement, 31 or mixed agreement that depended on the DUS parameter.12,27,32 These studies were generally conducted using similar criterion measures (eg, MRA, SRA, and CVA). Most studies reported sensitivity (n = 24/25)3,5,12,14,24-29,30-39,40-43 and specificity (n = 24/25).3,5,12,14,24-29,30-39,40,41,43,44 Nine of 24 studies reported exclusively high sensitivity (range: 90%-100%)29,33,34,37,38,40,41,45 and 13 of 25 reported high specificity (range: 81%-100%).3,14,24,29-31,33,34,37,38,40,41,45 Six of 25 reported moderate (range: 67%-75%)3,24,42 or low (range: 20%-57%)28,31,39 sensitivity and 2 of 25 reported moderate (67%) 39 or low (6%) 44 specificity. Of note, Lo and Donaldson 44 reported the lowest specificity (6%) for DUS but only examined 14 participants (16 arteries).
In some cases, studies exhibited mixed results for sensitivity and specificity due to different conditions (eg, peak systolic velocity, renal aortic ratio, etc).5,12,14,27,30,32,35,36 Turgutalp et al 12 demonstrated that DUS exhibited high sensitivity (83%) in patients who are younger than 60 years but only moderate sensitivity (69%) in those older than 60 years. In addition, Zachrisson et al 43 reported decreasing sensitivity (range: 67%-91%) and increasing specificity (range: 42%-91%) as the peak systolic velocity threshold increased. These factors may play a critical role in the agreement between DUS and angiography.
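The trade-off between sensitivity and specificity as a velocity cut-off is raised can be illustrated with a short Python sketch. The peak systolic velocity values, the stenosis labels, and the 180 cm/s cut-off below are entirely hypothetical and are not data from any included study; the 200 cm/s and 285 cm/s cut-offs are used only because they appear elsewhere in this review.

```python
def sens_spec(psv_values, has_stenosis, threshold):
    """Sensitivity and specificity (%) when PSV > threshold is called positive."""
    pairs = list(zip(psv_values, has_stenosis))
    tp = sum(1 for v, s in pairs if s and v > threshold)
    fn = sum(1 for v, s in pairs if s and v <= threshold)
    tn = sum(1 for v, s in pairs if not s and v <= threshold)
    fp = sum(1 for v, s in pairs if not s and v > threshold)
    return 100 * tp / (tp + fn), 100 * tn / (tn + fp)


# Hypothetical PSV readings (cm/s): six arteries without and six with stenosis.
psv = [120, 150, 170, 190, 210, 230, 180, 220, 260, 300, 340, 380]
stenosis = [False] * 6 + [True] * 6

thresholds = [180, 200, 285]
results = [sens_spec(psv, stenosis, t) for t in thresholds]

for t, (sens, spec) in zip(thresholds, results):
    print(f"PSV > {t} cm/s: sensitivity {sens:.0f}%, specificity {spec:.0f}%")
```

In this toy sample, a higher cut-off misses more stenosed arteries (lower sensitivity) while misclassifying fewer normal arteries (higher specificity), which is the general pattern described above.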
Eighteen studies reported PPV and/or NPV.3,5,12,24-30,31-35,39-41 Five of 18 studies reported high PPV (range: 83%-100%)24,29,30,33,45 and 9 of 18 studies reported high NPV (range: 83%-97%).3,24,26,29,30,33-35,41 Six of 18 studies reported moderate (range: 60%-76%)3,28,31,41 or low PPV (range: 55%-57%)34,39 and 2 of 18 reported moderate (67%) 39 or low (50%) NPV. 31 Five of 18 studies reported mixed findings.12,26,28,32,35 Interestingly, Chi et al 26 observed high PPV and NPV in patients with 50%-69% stenosis but low PPV and high NPV in those with ≥ 70% unstented stenosis. Despite this, DUS exhibited high PPV and NPV for differentiating angiographic 50%-60% and ≥ 70% stenosis in patients with renal stents.
Bias and Applicability Assessment for Other Angiography Techniques
Similar to studies that investigated DUS compared with DSA, all studies (n = 25/25) reported low applicability concerns for patient selection, the index test, and the reference standard. Furthermore, 2 of 25 reported unclear5,29 risk of bias for patient selection due to ambiguous recruiting methods. Five of 25 studies reported unclear28,33,34,42,44 and 3 of 25 exhibited high12,26,39 risk of bias for the index test due to inconsistent delivery of DUS to all included participants. For example, Jazi et al, 39 who used renal angiography as their reference standard, conducted DUS in only 16 of 37 participants, while the other 21 underwent MRA. Five of 25 studies demonstrated unclear28,33,41,42,44 and 4 of 25 demonstrated high risk of bias for the implemented reference standard. This was due to either multiple reference standards being implemented (eg, MRA and SRA) 12 or unspecified rationale for the selection of angiography type. 28 Finally, 15 of 25 and 2 of 25 studies exhibited unclear3,5,12,14,27-29,31,33,34,36,37,39,41,44,45 and high26,43 flow and timing risk, respectively, as the time interval between the DUS and criterion measurements was inappropriate or not reported.
Discussion
Our review amalgamated the most recent available literature investigating the diagnostic accuracy of DUS to predict RAS in comparison with angiography. In comparison with the gold-standard assessment of fluoroscopy DSA, most studies exhibited moderate-high agreement with DUS based on literature with a low risk of bias and low applicability concerns. Compared with angiographic methods of detecting RAS, DUS generally exhibited moderate-high agreement. However, it should be recognized that the ultrasound outcomes (eg, peak systolic velocity) and stenosis thresholds used to denote RAS (eg, > 50%) were heterogeneously implemented across studies. Altogether, our review of more modern DUS studies supports the utility of using ultrasound parameters in diagnosing RAS based on high-quality evidence and draws attention to methodological aspects that require further consensus in this field of research to better inform imaging practices.
RAS due to development of atherosclerosis in the arteries that supply oxygenated blood to the kidneys is implicated in the development of renal failure. 46 The studies included in our review indicate that DUS might be a reasonable alternative to DSA, with most studies demonstrating moderate-high rates of true positives and true negatives. Therefore, DUS may be useful as a first-line option, and if inconclusive DUS results are observed, then the more expensive and invasive angiography techniques could be used. Our reviewed evidence is somewhat inconsistent with the 88 articles included in the review by Williams et al 7 up to 2005 that documented mixed results. The 34 articles included herein post-2005 primarily observed acceptable agreement between DUS and angiography. Accordingly, the more recent studies examining the validity of DUS more consistently document better agreement, which is likely due to improvements in ultrasound technology or possibly due to better training or formal education on ultrasound practice. Most of the studies in the earlier review were conducted in the 1980s and 1990s, and the field of sonography has since made strides in accreditation and training practices. 47 While the ultrasound machine, software used, and training level of the sonographer are not typically reported, improvements in these factors likely explain our more favorable outcomes. Of note, our review documented study quality and observed low concerns of bias and applicability, which aids in the external application of the results. More modern sonographers who are equipped with better ultrasound machines can identify RAS at a level that supports DUS as an acceptable alternative to angiography.
Despite the original mixed supporting evidence base, 7 DUS of the renal arteries has been indicated as a screening procedure for RAS. 48 The results of our review support this practice and position DUS as an additional “tool” for sonographers and radiologists to use as it fits within their clinical practice. Of note, RAS may also be prevalent among children with unique conditions (eg, fibromuscular dysplasia), 49 and it is unclear whether our findings are applicable to children with smaller arteries. Furthermore, there is a need to better understand which DUS parameter exhibits the best accuracy and to adopt standard procedures when conducting diagnostic validity studies as it relates to reporting specific metrics (eg, peak systolic velocity threshold, number of females, direct/indirect, sonographer training, etc) and the time between comparator and criterion outcomes. A well-defined consensus or diagnostic rationale as to what percentage of narrowing defines stenosis is also greatly needed, with studies employing 50%, 60%, or graded categories. Despite this, results were relatively similar between studies, indicating that a stenosis threshold of 50% versus 60%, for example, may not largely impact diagnostic accuracy. Nevertheless, standard procedures would better facilitate the translation of this measurement into practice and allow homogeneous methods for evaluating the literature (eg, meta-analyses). Future studies that quantify the influence of these factors on results and determine the cost-benefit 50 of routinely implementing DUS screens would be important next steps in determining the utility of DUS as a regular diagnostic tool for RAS.
While our study adds to the literature by establishing the diagnostic accuracy of DUS in detecting RAS, review studies are inherently limited by the heterogeneity of the studies included. Specifically, the criterion used to evaluate the diagnostic accuracy of DUS varied between studies, with studies employing DSA, CTA, MRA, or some combination thereof. To address this, we separated outcomes by DSA (gold-standard criterion) versus other angiography criteria but observed relatively consistent results of moderate-high agreement. Our study draws attention to the unique DUS parameters investigated, which prevented the quantification of amalgamated results for a meta-analysis given that the thresholds for peak systolic velocity or renal aortic ratio were variable. Reporting the timing between comparator and criterion measures is encouraged, as this was the largest source of potential bias identified by the QUADAS-2 tool, leading to most studies being classified as “unclear.” However, concerns over other risks of bias and applicability were generally quite low (eg, reference standard, patient selection, etc), as presented in Figure 2.
It cannot be directly confirmed that older ultrasound machines were not used in the included studies, nor is there direct evidence to support or refute a 2005 cut-off for study inclusion. Indirectly, the divergent results between the previous review up to 2005 and our review of 3220 patients support our rationale for using this cut-off. Accordingly, the high number of participants and studies included (n = 34) provides strong evidence to answer our research question. Our inclusion criteria included all ages, but few studies were conducted in children or youth, preventing any strong conclusions in this population who may benefit most from a less invasive, more portable imaging modality. 49 While this study reviews the diagnostic evidence, there may be additional practical considerations prior to implementing DUS as a diagnostic tool for RAS in a clinical setting, such as the availability of staff, equipment, training, and patient factors.
In conclusion, DUS exhibits moderate-high agreement with gold-standard and other well-established angiographic criteria, with the existing literature having a low risk of bias and low applicability concerns. DUS may address some of the limitations of DSA and might be a suitable alternative imaging technique for evaluating RAS that practitioners should consider. There is a clear need to adopt a standardized methodological procedure for evaluating the ability of DUS to diagnose RAS.
Supplemental Material
Supplemental material, sj-docx-1-jvu-10.1177_15443167231223551 for Accuracy of Duplex Ultrasound for Detecting Renal Artery Stenosis: A Systematic Review by Jessica R. MacLeod, Matthew J. Kivell, Madeline E. Shivgulam, Haoxuan Liu and Myles W. O’Brien in Journal for Vascular Ultrasound
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: HL and MES were supported by a Nova Scotia Graduate Scholarship. MES was supported by a Heart & Stroke BrightRed Scholarship. MWO was supported by a CIHR Post-Doctoral Fellowship Award (#181747) and a Dalhousie University Department of Medicine University Internal Medicine Research Foundation Research Fellowship Award.