Abstract
Non-alcoholic fatty liver disease (NAFLD) affects an estimated one-quarter of the global adult population and has become one of the leading causes of end-stage liver disease and hepatocellular carcinoma with increased liver-related and overall morbidity and mortality. The new term, metabolic dysfunction–associated fatty liver disease (MAFLD), has a set of positive diagnostic criteria and has been shown to have better clinical utility, but it has yet to be universally adopted. This review addresses the non-invasive tests for MAFLD and is based mostly on studies on NAFLD patients, as the MAFLD term is relatively new and there are limited studies on non-invasive tests based on this new term, while a large body of research work on non-invasive tests has accumulated in the literature for NAFLD. This review focuses on blood-based biomarkers and scores for the assessment of hepatic steatosis, non-alcoholic steatohepatitis (NASH), and fibrosis, and two of the most widely studied imaging biomarkers, namely vibration-controlled transient elastography and magnetic resonance imaging. Fibrotic NASH has become a diagnostic target of interest and novel serum biomarkers and scores incorporating imaging biomarker for diagnosis of fibrotic NASH are emerging. Nonetheless, the degree of liver fibrosis remains the key predictor of liver-related morbidity and mortality in patients with MAFLD. A multitude of non-invasive biomarkers and scores have been studied for the detection of liver fibrosis, including use of sequential non-invasive tests for risk stratification of advanced liver fibrosis. In addition, this review will explore the utility of the non-invasive tests for prognostication and for monitoring of treatment response.
Introduction
Non-alcoholic fatty liver disease (NAFLD) affects an estimated 25% of the global population and is increasingly contributing to end-stage liver disease and hepatocellular carcinoma.1,2 It encompasses a spectrum of liver conditions characterized by excessive accumulation of fat in the liver and is diagnosed following exclusion of significant alcohol intake and other causes of chronic liver disease. In the majority of cases, it is due to overnutrition and is closely associated with metabolic risk factors. 3 Despite being the most common cause of chronic liver disease, its name tells us what it is not instead of what it is, and its definition is not in line with the reality that more than one cause of chronic liver disease can be present in an individual patient. Because of these fundamental issues, the term metabolic dysfunction–associated fatty liver disease (MAFLD) has been proposed, which is defined as hepatic steatosis in the presence of overweight or obesity, type 2 diabetes mellitus (T2DM), or at least two metabolic risk abnormalities.4,5 The new name and its definition bring better clarity to what the disease is and have been shown to have better clinical utility. However, the new term has not been universally adopted due to concern about its impact on disease awareness, on previous research work based on the term NAFLD, and on ongoing clinical trials for non-alcoholic steatohepatitis (NASH). 6 In this review on non-invasive assessment of MAFLD, the studies included were based on the term NAFLD unless stated otherwise. We believe that studies on non-invasive tests for NAFLD would show similar results if the MAFLD definition were applied, because the majority of patients who fulfill the criteria for NAFLD would also fulfill the criteria for MAFLD. 
This is especially so for studies on non-invasive tests for NAFLD where liver biopsy was used as the reference standard as these studies would usually include NAFLD patients with more severe liver disease who are more likely to also fulfill the criteria for MAFLD. However, this can only be proven when the data from these studies are re-analyzed using the MAFLD definition, an exercise that can and should be carried out when global consensus on the new term has been reached. In addition, MAFLD patients with other causes of chronic liver disease, which would not have been considered in previous studies on non-invasive tests for NAFLD, represent a distinct and heterogeneous group and should be looked at separately. With this background, we will address the non-invasive tests for MAFLD, focusing on blood-based biomarkers and scores and two of the most widely studied imaging biomarkers, namely vibration-controlled transient elastography and magnetic resonance imaging (MRI), for the assessment of steatosis, NASH, and fibrosis as well as for prognostication and monitoring of treatment response.
Non-invasive assessment of hepatic steatosis
Over the past two decades, several simple scores based on blood biomarkers have been proposed (see Table 1). The SteatoTest by Poynard et al. 7 in 2005 was one of the first non-invasive scores developed and validated for the diagnosis of hepatic steatosis and was based on liver histology of patients with chronic liver disease of various etiologies. Histologically, fatty liver disease is defined by the presence of steatosis in at least 5% of liver cells. The SteatoTest, which is a patented test, incorporated several blood-based biomarkers in addition to age, sex, and body mass index (BMI), and it had fair to good accuracy for the diagnosis of hepatic steatosis, with an area under the receiver operating characteristic curve (AUROC) of 0.72–0.86 using dual cut-offs of 0.3 and 0.7. Subsequently, Poynard et al. 8 reported that a modified version of the original SteatoTest, which used serum aspartate aminotransferase (AST) level instead of BMI and serum total bilirubin level, had less variability and was not inferior to the original SteatoTest. Other scores, namely the Fatty Liver Index (FLI) and the Hepatic Steatosis Index (HSI), were developed using abdominal ultrasonography as the reference standard for the diagnosis of fatty liver disease. Both the FLI and HSI used routinely available clinical and laboratory indices and had good diagnostic accuracy, with AUROCs of 0.84 and 0.82, respectively.9,10 Subsequently, Zelber-Sagi et al. 11 reported good agreement of the FLI (using dual cut-off values of < 30 and ⩾ 60) with SteatoTest. However, a recent meta-analysis on the diagnostic performance of the FLI by Castellana et al. 12 concluded that it could stratify the risk of fatty liver disease but was not reliable as a tool to diagnose or exclude hepatic steatosis.
Table 1. Performance of blood-based biomarkers or scores for hepatic steatosis.
AUROC, area under the receiver operating characteristic curve; FLI, fatty liver index; HSI, hepatic steatosis index; NA, not available; NAFLD, non-alcoholic fatty liver disease; NPV, negative predictive value; PPV, positive predictive value; VAI, visceral adiposity index.
Lind et al. 15 performed a comparison study of four steatosis detection scores, including the FLI, HSI, NAFLD liver fat score (LFS), and lipid accumulation product (LAP), using MRI–proton density fat fraction (MRI-PDFF) as the reference standard for the diagnosis of fatty liver disease. The authors concluded that the FLI could be a favorable scoring for population-based setting (NAFLD prevalence 23%; AUROC 0.82) while the NAFLD-LFS performed well in the high-risk setting (NAFLD prevalence 73%; AUROC 0.80). However, the NAFLD-LFS requires fasting serum insulin, which is not routinely available, hence limiting its clinical utility. Visceral adiposity index (VAI), which uses BMI, waist circumference, and serum triglyceride and high-density lipoprotein (HDL) cholesterol levels, is a surrogate biomarker of visceral adiposity, 16 and a recent systematic review and meta-analysis by Ismaiel et al. 17 concluded that VAI has fair accuracy in diagnosing adult NAFLD with an AUROC of 0.77. In a head-to-head comparison with FLI, HSI, and NAFLD-LFS, using liver biopsy as reference standard, VAI performed best in detecting the presence of hepatic steatosis with an AUROC of 0.92. 13 However, none of the steatosis biomarkers were able to demonstrate reasonable ability to quantify steatosis and distinguish between moderate and severe steatosis, which was attributed to confounding by the degree of inflammation and fibrosis. Using a machine learning model and proton magnetic resonance spectroscopy (¹H-MRS) as the reference standard, Yip et al. developed a new scoring system based on laboratory parameters (called the NAFLD Ridge Score) to exclude fatty liver disease at the population level. The NAFLD Ridge Score achieved 87% overall accuracy with AUROC of 0.87–0.88. 14 Overall, blood-based scores that use simple, inexpensive, and readily available parameters serve an important role in epidemiological studies where imaging tests would be more difficult to carry out.
Ultrasonography is the most commonly used imaging test for the diagnosis of hepatic steatosis and has been shown to be reliable and accurate for the diagnosis of moderate to severe hepatic steatosis with AUROC of 0.93. 18 However, it is operator dependent and is less accurate for mild hepatic steatosis. 19 Controlled attenuation parameter (CAP) is derived from the same radio-frequency data used for liver stiffness measurement during examination using Fibroscan, a vibration-controlled transient elastography device, and has been shown to correlate with hepatic steatosis. 20 It has been shown to be excellent for the diagnosis of hepatic steatosis, but its performance is affected by BMI and is less accurate to diagnose the different grades of hepatic steatosis. 21 An individual patient data meta-analysis has defined the CAP cut-offs for the diagnosis of hepatic steatosis ⩾ grade 1, ⩾ grade 2, and grade 3, which are 248, 268, and 280 dB/m, respectively. 22 However, CAP can be affected by BMI, etiology, and the presence of T2DM, and these should be considered when interpreting the results. The XL probe has been introduced to reduce failed and unreliable transient elastography examination results when using the M probe, especially in obese patients. 23 Studies suggested that the same cut-offs can be used for both the M probe and the XL probe for the diagnosis of hepatic steatosis.24,25 A subsequent individual patient data meta-analysis concluded that CAP obtained using the XL probe was good for the diagnosis of hepatic steatosis, especially in patients with viral hepatitis, but the optimal cut-offs varied according to etiology; however, CAP was not able to adequately grade steatosis in NAFLD patients. 26 Another vibration-controlled transient elastography device, the FibroTouch, also provides measurement of attenuation parameter, called the ultrasound attenuation parameter (UAP), which uses 244 dB/m as the cut-off for the diagnosis of hepatic steatosis. 
27 In a study on patients with chronic liver disease of various etiologies, the UAP has been shown to correlate strongly with CAP. 28 The main advantage of the attenuation parameter for the diagnosis of hepatic steatosis is that the result is available simultaneously with liver stiffness measurement and can help diagnose hepatic steatosis in patients with other causes of chronic liver disease undergoing liver stiffness measurement. It is important to identify and manage hepatic steatosis and metabolic risk factors to improve the overall outcome of patients with other causes of chronic liver disease. MRI-based techniques, such as ¹H-MRS and MRI-PDFF, are highly accurate for the diagnosis of hepatic steatosis.29,30 MRI-PDFF has been shown to classify steatosis more accurately than CAP. 31 However, the use of MRI-based techniques in routine clinical practice is limited by cost and availability, and they are usually reserved for clinical trials (see the section ‘Non-invasive tests for monitoring of treatment response’).
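The CAP cut-offs quoted above (248, 268, and 280 dB/m for steatosis ⩾ grade 1, ⩾ grade 2, and grade 3) can be read as a simple lookup. The following is a minimal sketch only: the function name is hypothetical, treating each cut-off as inclusive is an assumption, and the caveats about BMI, etiology, and T2DM described above are not modeled.

```python
# Cut-offs (dB/m) quoted in the text from the individual patient data
# meta-analysis, ordered from the highest grade downwards.
CAP_CUTOFFS = ((280.0, 3), (268.0, 2), (248.0, 1))


def steatosis_grade_from_cap(cap_db_per_m: float) -> int:
    """Return the suggested steatosis grade (0-3) for a CAP reading.

    Assumption: a reading equal to a cut-off counts as reaching that grade.
    """
    if cap_db_per_m < 0:
        raise ValueError("CAP must be non-negative")
    for threshold, grade in CAP_CUTOFFS:
        if cap_db_per_m >= threshold:
            return grade
    return 0


if __name__ == "__main__":
    for reading in (230.0, 250.0, 270.0, 310.0):
        print(reading, "dB/m -> grade", steatosis_grade_from_cap(reading))
```

Because CAP was not able to adequately grade steatosis in NAFLD patients in the later meta-analysis, the grade returned here is best treated as a rough indication rather than a histological equivalent.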
Non-invasive assessment for NASH
NASH is the more severe form of NAFLD defined histologically by the presence of lobular inflammation and hepatocyte ballooning in addition to hepatic steatosis that may lead to fibrosis and cirrhosis. Patients with NASH develop fibrosis progression at a faster rate and have higher overall mortality rate compared with patients with simple steatosis.1,32 Among the many serum biomarkers for NASH, the most widely studied is cytokeratin-18 fragments (CK18), which is derived from hepatocyte apoptosis. In the initial studies that included none or only a small proportion of patients with borderline NASH, CK18 was found to have good to excellent accuracy for the diagnosis of NASH with an AUROC of 0.83–0.93.33,34 However, subsequent studies that included larger proportion of patients with borderline NASH found CK18 to have poor accuracy for the diagnosis of NASH with an AUROC of 0.59–0.66,35–37 and not better than routinely available biomarkers, such as serum AST level. 36 A subsequent meta-analysis evaluating diagnostic value of CK18, which included 25 studies, showed that the pooled AUROCs of the CK18 fragments M30 and M65 for the diagnosis of NASH were 0.82 and 0.80, respectively; however, the criteria for NASH diagnosis in the studies included in the meta-analysis were not clearly described. 38 Although serum AST level was only fair for diagnosis of NASH with an AUROC of 0.75 in one study, 36 elevated serum AST level, especially when more than twice the upper limit of normal, has excellent positive predictive value for NASH. However, normal serum AST level has very poor negative predictive value for NASH. 39 The acNASH index combines serum AST and creatinine levels for the diagnosis of NASH and was found to have good diagnostic performance with an AUROC of 0.82 and 0.81 in the derivation cohort and validation cohort, respectively. 40
While NASH drives the progression of liver disease, it is evident that the key prognostic feature in NAFLD is the stage of fibrosis.41–43 Therefore, the concept of at-risk NASH or fibrotic NASH was conceived, whereby patients with NASH with NAFLD activity score (NAS) ⩾ 4 and fibrosis stage ⩾ F2 were considered to be at increased risk of liver disease progression and adverse outcomes. Fibrotic NASH or at-risk NASH has become the recent diagnostic target of interest for inclusion into NASH clinical trials. Boursier et al. 44 developed a new score called MACK-3, which uses homeostasis model assessment of insulin resistance (HOMA-IR) and serum AST and CK18 levels, for the diagnosis of fibrotic NASH. In the study, MACK-3 achieved a diagnostic accuracy of 93.2% for fibrotic NASH, which was superior to the Fibrosis-4 index (FIB-4) and the NAFLD fibrosis score (NFS; see the section ‘Non-invasive assessment of hepatic fibrosis’ for more details on FIB-4 and NFS). An external validation study of MACK-3 similarly showed promising results in identifying patients with fibrotic NASH, with a diagnostic accuracy of 79.1%, which was comparable with FIB-4 and NFS. 45 The NIS4 panel, a novel blood-based biomarker panel incorporating microRNA (miR)-34a-5p, alpha-2-macroglobulin, YKL-40, and hemoglobin A1c, was recently developed to assess at-risk NASH and demonstrated an AUROC of 0.76–0.83. 46 Corey et al. 47 identified a single protein (ADAMTSL2) and developed an 8-protein panel called the NAFLD Fibrosis Protein Panel (NFPP) that classified at-risk NASH with an AUROC of 0.83–0.86 for ADAMTSL2 and 0.87–0.89 for NFPP in the at-risk population group, and was found to be superior to NFS and comparable with FIB-4. The performances of the aforementioned blood-based biomarkers and scores for NASH or fibrotic NASH are summarized in Table 2. 
Some recent studies have focused on ‘omic markers’48–50 to corroborate the findings of raised production of lipidomic,51,52 proteomic, metabolomics, 53 and microbiome markers 54 in patients with NAFLD. Nevertheless, these require further validation; and while attempting to improve accuracy, the key to obtaining an effective non-invasive biomarker to diagnose at-risk or fibrotic NASH in clinical practice is for it to be simple, low cost, and easily available.
Table 2. Performance of blood-based biomarkers or scores for NASH or fibrotic NASH.
ADAMTSL2, A disintegrin and metalloproteinase with thrombospondin motifs like 2; AST, aspartate aminotransferase; AUROC, area under the receiver operating characteristic curves; HbA1C, glycosylated hemoglobin; HOMA-IR, homeostasis model assessment-estimated insulin resistance; NA, not available; NASH, non-alcoholic steatohepatitis; NPV, negative predictive value; PPV, positive predictive value; SPMS, SOMAmer-pulldown mass spectrometry.
Novel scores incorporating imaging biomarkers for the diagnosis of fibrotic NASH have emerged. The Fibroscan-AST (FAST) score uses CAP and liver stiffness measurement obtained by Fibroscan and serum AST level to determine the risk for fibrotic NASH. 55 It was developed using data from multiple centers in the United Kingdom before being validated across multiple international centers. The FAST score was good for the diagnosis of fibrotic NASH, with an AUROC of 0.80 in the derivation cohort and 0.85 in the overall external validation cohorts. The cut-off was 0.35 for ⩾ 90% sensitivity and 0.67 for ⩾ 90% specificity, leading to a positive predictive value of 0.83 and a negative predictive value of 0.85 in the derivation cohort. The positive predictive value ranged from 0.33 to 0.81 while the negative predictive value ranged from 0.73 to 1.0 in the external validation cohorts. The FAST score has been used as a pre-screening test to reduce screen failure in NASH clinical trials. 55 Although the ⩾ 0.67 cut-off can significantly reduce the screen failure rate to 16.8%, it is associated with a high missed case rate of 51.7%. 55 This is an important consideration, especially if the score is considered as a tool to select patients for treatment when pharmacological therapy becomes available in the near future. Similar to the FAST score, the MRI-AST (MAST) score combines results of MRI-PDFF and magnetic resonance elastography (MRE) and serum AST level to diagnose fibrotic NASH. It had an impressive AUROC of 0.93, better than the FAST score, FIB-4, and NFS. 56 However, as mentioned earlier, MRI-based techniques are not routinely used in clinical practice due to cost and limited availability, and their use is mainly limited to clinical trials for now.
Non-invasive assessment of hepatic fibrosis
As mentioned earlier, the degree of liver fibrosis remains the key predictor of future liver-related morbidity and mortality in patients with NAFLD. Major clinical practice guidelines on NAFLD or MAFLD have recommended the use of non-invasive scores to rule out advanced fibrosis (defined as fibrosis stage ⩾ F3).57–59 The non-invasive scores include simple fibrosis scores such as the AST-to-platelet ratio index (APRI), fibrosis-4 index (FIB-4), NAFLD fibrosis score (NFS), and BMI, AST/ALT ratio, diabetes (BARD) score, as well as patented biomarker panels such as the enhanced liver fibrosis (ELF), FibroMeter®, and FibroTest™. APRI, FIB-4, NFS, and BARD are based on simple laboratory parameters, which are easily available in routine clinical practice and inexpensive. A recent individual patient data meta-analysis (IPDMA) by Mózes et al. 60 showed that FIB-4, NFS, and APRI had a summary AUROC of 0.76, 0.73, and 0.70, respectively. This IPDMA also revealed that single cut-off for FIB-4 and NFS had neither sufficiently high sensitivity nor specificity for the diagnosis of advanced fibrosis, but the performance of these scores improved with paired cut-offs. A systematic review and meta-analysis by Ismaiel et al. 61 showed that FIB-4 predicted advanced fibrosis more precisely with summary AUROC of 0.82 compared with NFS (AUROC of 0.79), APRI (AUROC of 0.76), and BARD (AUROC of 0.67). Another meta-analysis by Xiao et al. 62 concluded that among the simple non-invasive blood scores, NFS and FIB-4 offer the best diagnostic performance for detecting advanced fibrosis, whereas BARD score had lower diagnostic accuracy for detecting both significant fibrosis (defined as fibrosis stage ⩾ F2) and advanced fibrosis compared with FIB-4, NFS, and APRI. Castellana et al. 
63 demonstrated in a meta-analysis comparing FIB-4 and NFS head-to-head that, using the dual threshold approach, FIB-4 resulted in a lower rate of indeterminate findings, with mean sensitivity and specificity of 65% and 93%, respectively. Both NFS and FIB-4 had low accuracy in patients aged less than 35 and more than 65 years; hence, age-adjusted cut-offs have been proposed (i.e. 2.0 for FIB-4 and 0.12 for NFS) to enhance accuracy in patients ⩾ 65 years of age. Furthermore, NFS has been found to be influenced by BMI, and a modified NFS with BMI limited to 40 kg/m² was found to improve its AUROC from 0.75 to 0.84. 64
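To make the dual threshold approach concrete, the sketch below computes FIB-4 and applies paired cut-offs. The formula is the standard published one (age × AST divided by platelet count × √ALT) and is not stated in this review. The 1.3 and 2.67 cut-offs are those used elsewhere in this review for sequential testing, and the 2.0 lower cut-off for patients ⩾ 65 years is the age-adjusted value mentioned above. Function names and the category labels are illustrative assumptions.

```python
import math


def fib4(age_years: float, ast_u_per_l: float, alt_u_per_l: float,
         platelets_1e9_per_l: float) -> float:
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT)), standard formula."""
    if alt_u_per_l <= 0 or platelets_1e9_per_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_per_l) / (
        platelets_1e9_per_l * math.sqrt(alt_u_per_l)
    )


def classify_fib4(score: float, age_years: float,
                  lower: float = 1.3, upper: float = 2.67,
                  lower_if_65_or_older: float = 2.0) -> str:
    """Apply paired cut-offs; the lower cut-off is raised for age >= 65."""
    low_cut = lower_if_65_or_older if age_years >= 65 else lower
    if score < low_cut:
        return "low"
    if score >= upper:
        return "high"
    return "indeterminate"


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    score = fib4(age_years=50, ast_u_per_l=40, alt_u_per_l=36,
                 platelets_1e9_per_l=200)
    print(round(score, 2), classify_fib4(score, age_years=50))
```

The same score can fall in different categories depending on age: a value of 1.75 is indeterminate at 50 years but low risk at 70 years under the age-adjusted cut-off.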
The Hepamet fibrosis score (HFS) was developed and validated by Ampuero et al. 65 and is one of the newer non-invasive fibrosis scores, which appeared better than FIB-4 and NFS for the diagnosis of advanced fibrosis. The HFS had an AUROC of 0.85 for the diagnosis of advanced fibrosis compared with 0.80 and 0.78 for FIB-4 and NFS, respectively, and used a < 0.12 cut-off to rule out advanced fibrosis and a ⩾ 0.47 cut-off to rule in advanced fibrosis. The HFS was developed using data from nearly 2500 patients from five countries (Spain, France, Italy, Cuba, and China), comprising various ethnicities (Caucasian, Latin, and Asian populations) and diverse baseline features (diabetes, obesity, and the prevalence of fibrosis). Hence, HFS can be seen as a robust scoring system, as it performed equally well across these cohorts and did not require age adjustment. Furthermore, a large multi-center cohort study corroborated that HFS showed the highest performance for the identification of significant fibrosis and advanced fibrosis compared with FIB-4, NFS, and APRI, although FIB-4 and NFS performed best for the diagnosis of cirrhosis. 66 Recently, Prasoppokakorn et al. reported on the Fibrosis-8 (FIB-8) score [the FIB-4 variables plus BMI, albumin to globulin ratio, serum gamma glutamyl transpeptidase (GGT) level and the presence of T2DM] for the diagnosis of significant fibrosis using data of biopsy-proven NAFLD patients from three Asian centers (Thailand, Hong Kong, and Malaysia). FIB-8 was shown to have an AUROC of 0.77, with a sensitivity of 92.4% (using the lower cut-off of 0.88) and a specificity of 67.5% (using the higher cut-off of 1.77). 67
The ELF test consists of three specific fibrosis biomarkers, which include hyaluronic acid, type III procollagen amino terminal propeptide (PIIINP), and tissue inhibitor of metalloproteinases-1 (TIMP-1). This patented panel of direct markers of fibrogenesis has so far shown good diagnostic accuracy in detecting advanced fibrosis with a summary AUROC of 0.83, mean sensitivity and specificity of 65% and 86%, respectively, using the high threshold 9.8, based on a systematic review and meta-analysis by Vali et al. 68 In the NICE guidelines, ELF has been incorporated as part of a two-step algorithmic pathway to reduce unnecessary referral to secondary care for advanced liver disease, whereby it is performed when the FIB-4 results are indeterminate in the primary care setting. However, it has limited availability outside the United Kingdom and is costly.
FibroMeter® and FibroTest™ are other patented serum biomarker panels that have shown some promise in terms of diagnostic performance in detecting advanced fibrosis. A meta-analysis by Van Dijk et al., which included 12 studies with a total of 3425 patients, examined three different versions of the FibroMeter® test, each calculated using a patented formula (including one that incorporates liver stiffness measurement by vibration-controlled transient elastography, that is, FibroMeter® VCTE). It showed that FibroMeter® tests have adequate sensitivity and specificity in detecting advanced fibrosis in patients with NAFLD, with a summary AUROC of 0.82 for FibroMeter® NAFLD, 0.89 for FibroMeter® Virus second generation (V2G), and 0.94 for FibroMeter® VCTE. No statistically significant differences between the three versions of the FibroMeter® were observed in this meta-analysis. 69 As for the FibroTest™, a recent meta-analysis, which comprised five studies with a total of 2013 patients, showed an acceptable diagnostic performance in detecting cirrhosis (AUROC 0.92) in NAFLD patients, but the summary AUROC for detecting significant fibrosis and advanced fibrosis in NAFLD was 0.77 for both target conditions. 70 Poynard et al. 71 demonstrated in a meta-analysis of individual patient data, using 494 patients from three prospective cohorts with severe obesity (BMI > 35 kg/m²) and a prevalence of advanced fibrosis of 9.9%, that the mean weighted AUROC of FibroTest™ to detect advanced fibrosis was 0.85 [95% confidence interval (CI): 0.83–0.87]. Once again, these tests are costly and not routinely available.
PRO-C3, a neo-epitope-specific competitive enzyme-linked immunosorbent assay (ELISA) for PIIINP, is a relatively new direct marker of active fibrogenesis, which was initially reported by Hansen et al. 72 in patients with chronic hepatitis C. With regard to the use of PRO-C3 as a specific serum biomarker for measuring the degree of fibrosis in NAFLD, a recent systematic review and meta-analysis by Mak et al. 73 showed a summary AUROC of 0.81 for detecting significant fibrosis (mean sensitivity and specificity of 68% and 79%, respectively) and 0.79 for detecting advanced fibrosis (mean sensitivity and specificity of 72% and 73%, respectively). These findings lend support to the use of PRO-C3 as a candidate biomarker for non-invasive assessment of liver fibrosis in NAFLD. Subsequently, an algorithm called ADAPT, which incorporated PRO-C3 together with age, presence of diabetes, and platelet count, was developed to detect advanced fibrosis in a multinational (Australia, the United Kingdom, and Japan) retrospective study of 431 biopsy-proven NAFLD patients. It had an overall AUROC of 0.86 and a negative predictive value of 96.6%, which was superior to PRO-C3 as a standalone marker and to the existing simple fibrosis scores, namely APRI, FIB-4, and NFS. 74 This finding was corroborated by Nielsen et al. 75 in 517 adults with biopsy-proven NAFLD, in whom the ADAPT algorithm outperformed APRI, FIB-4, and AST/ALT ratio in detecting both significant fibrosis and advanced fibrosis, with AUROCs of 0.76 and 0.80, respectively. In summary, the clinical scores based on simple fibrosis serum biomarkers are cheap and reproducible and have high sensitivity to rule out advanced fibrosis. In contrast, the specific fibrosis serum biomarkers are better at identifying patients with significant fibrosis and advanced fibrosis but are generally more costly and not routinely available. 
Furthermore, most of the non-invasive serum biomarkers for fibrosis were developed and validated in the secondary and tertiary settings with considerably higher prevalence of advanced fibrosis compared with the general population, hence limiting applicability at the primary care setting. The performances of the more commonly used non-invasive scores for detection of fibrosis in NAFLD are summarized in Table 3.
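The dependence on prevalence noted above follows directly from Bayes' theorem: for fixed sensitivity and specificity, the positive predictive value falls and the negative predictive value rises as prevalence drops. The sketch below illustrates this using the sensitivity and specificity quoted earlier for the ELF test at the 9.8 threshold (65% and 86%). The two prevalence values (5% and 30%) are hypothetical, chosen only to contrast a primary care setting with a specialist setting, and the function name is illustrative.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    # Guard against a zero denominator (no positive or no negative results).
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) else float("nan")
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical prevalences: 5% (primary care) and 30% (specialist clinic).
    for prevalence in (0.05, 0.30):
        ppv, npv = predictive_values(0.65, 0.86, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

Under these assumptions, a positive result is far less informative in the low-prevalence setting, whereas a negative result is more reassuring. This is consistent with the guideline recommendation to use these scores mainly to rule out advanced fibrosis.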
Table 3. Performance of blood-based biomarkers or scores for fibrosis in NAFLD.
AF, advanced fibrosis; ALT, alanine aminotransferase; APRI, AST-to-platelet ratio index; AST, aspartate aminotransferase; AUROC, area under the receiver operating characteristic curve; BARD, body mass index (BMI), AST/ALT ratio, and diabetes score; ELF, enhanced liver fibrosis score; FIB-4, fibrosis-4 index; HFS, Hepamet fibrosis score; NA, not available; NAFLD, non-alcoholic fatty liver disease; NFS, NAFLD fibrosis score; NLR, negative likelihood ratio; NPV, negative predictive value; PLR, positive likelihood ratio; PPV, positive predictive value; PRO-C3, neo-epitope pro-peptide of type III collagen formation; SF, significant fibrosis.
Liver stiffness measurement has emerged as a very useful non-invasive method for the assessment of liver fibrosis in chronic liver disease of various etiologies, including MAFLD. Initially, different cut-offs were proposed for the diagnosis of the different stages of fibrosis in the different chronic liver diseases. Subsequently, the concept of compensated advanced chronic liver disease (cACLD) was introduced, 76 whereby the spectrum of severe fibrosis and cirrhosis was considered a continuum in asymptomatic patients, and differentiating the two was often considered not possible on clinical grounds. Liver stiffness measurement < 10 kPa in the absence of other clinical or imaging signs can be used to exclude cACLD, 10–15 kPa is suggestive of cACLD, and > 15 kPa is highly suggestive of cACLD. In non-obese NASH-related cACLD, liver stiffness measurement of ⩾ 25 kPa can be used to rule in clinically significant portal hypertension, while patients with liver stiffness measurement 15–20 kPa and platelet count < 110 × 10⁹/L or liver stiffness measurement 20–25 kPa and platelet count < 150 × 10⁹/L have a ⩾ 60% chance of clinically significant portal hypertension. 76 As mentioned earlier, the XL probe was introduced to reduce failed or unreliable examinations. The same liver stiffness measurement cut-offs can be used for the M probe and the XL probe without adjustment for steatosis when the appropriate probe has been used. 77 The Fibroscan-based Agile scores, that is, Agile 3+ and Agile 4, combine liver stiffness measurement and routinely available clinical parameters to improve the diagnosis of advanced fibrosis and cirrhosis, respectively, and performed better than liver stiffness measurement alone and FIB-4 alone. 78 As mentioned earlier, the FibroTouch is another vibration-controlled transient elastography device, and it can similarly be used for liver stiffness measurement. The optimal cut-off for the diagnosis of advanced fibrosis has been found to be 9.4 kPa. 27 The FibroTouch has the advantages of a universal probe, fewer invalid and more consistent measurements, and a shorter examination time, but it is much less studied than the Fibroscan. As with the attenuation parameter, liver stiffness measurement obtained using the FibroTouch correlated strongly with that obtained using the Fibroscan; however, liver stiffness measurement (and attenuation parameter) obtained using the FibroTouch tended to be higher at lower values and lower at higher values. 28 MRE classifies fibrosis more accurately than liver stiffness measurement by transient elastography; 31 however, as mentioned earlier, its use in routine clinical practice is limited by cost and availability.
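The cACLD and portal hypertension thresholds described above can be expressed as a small rule set. This is a sketch under stated assumptions, not a clinical tool: function names and labels are illustrative, the handling of values that fall exactly on a boundary (for example 15 kPa or 20 kPa) is an assumption because the text gives overlapping ranges, and the portal hypertension rules apply only to the non-obese NASH-related cACLD setting named in the text.

```python
def cacld_category(lsm_kpa: float) -> str:
    """Categorize liver stiffness (kPa) using the 10 and 15 kPa thresholds.

    The "excluded" category also assumes no other clinical or imaging signs.
    """
    if lsm_kpa < 10:
        return "cACLD excluded"
    if lsm_kpa <= 15:  # assumption: 15 kPa itself is "suggestive"
        return "suggestive of cACLD"
    return "highly suggestive of cACLD"


def csph_assessment(lsm_kpa: float, platelets_1e9_per_l: float) -> str:
    """Assess clinically significant portal hypertension (CSPH).

    Intended only for non-obese NASH-related cACLD, as stated in the text.
    Assumption: ranges are treated as half-open (15 <= LSM < 20, 20 <= LSM < 25).
    """
    if lsm_kpa >= 25:
        return "CSPH ruled in"
    if 15 <= lsm_kpa < 20 and platelets_1e9_per_l < 110:
        return "CSPH likely (>= 60%)"
    if 20 <= lsm_kpa < 25 and platelets_1e9_per_l < 150:
        return "CSPH likely (>= 60%)"
    return "undetermined"


if __name__ == "__main__":
    print(cacld_category(12.0))
    print(csph_assessment(22.0, 140.0))
```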
Use of sequential non-invasive tests for risk stratification
One of the main challenges in MAFLD is the very high prevalence of the disease in the general population, with only a small yet significant proportion of patients having advanced liver fibrosis. 79 In this context, how do we ensure that patients with more advanced liver fibrosis are identified and referred for specialist care? Conversely, how do we minimize unnecessary referrals of patients without significant liver disease to specialist clinics? One potential strategy is to use the two-step approach, which was initially described as using the NFS, followed by liver stiffness measurement for patients with intermediate or high NFS. 80 In the study, the NFS was found to have a very high negative predictive value, and it could be used to exclude advanced liver fibrosis. However, the use of both the NFS and liver stiffness measurement for all patients resulted in a larger proportion of patients with indeterminate or discordant results due to the substantial number of patients with low NFS having elevated liver stiffness measurement ⩾ 8 kPa. The two-step approach was subsequently validated and refined in a multicenter study, with two notable changes. 81 First, the FIB-4 score was preferred as it had similar performance to the NFS but required fewer variables and was easier to use. Second, the 10 kPa and 15 kPa cut-offs performed better than the 8 kPa and 17 kPa cut-offs for the diagnosis of absence and presence of advanced liver fibrosis, respectively. The strength of the two-step approach is that the first test is cheap, readily available, and simple to use, and it would exclude more than two-thirds of patients from the need to have further assessment. Furthermore, liver stiffness measurement is increasingly available in specialist clinics and is becoming an integral part of the management of patients with chronic liver disease. 76
Risk stratification for MAFLD patients can be further refined by focusing on patients with T2DM. It has been recognized that T2DM is an independent risk factor for more severe MAFLD, including NASH and advanced liver fibrosis. 82 There is a high prevalence of significant hepatic steatosis and advanced liver fibrosis among patients with T2DM. Furthermore, among patients with T2DM and liver stiffness measurement ⩾ 8 kPa who were referred to a specialist clinic and underwent liver biopsy, the majority were found to have NASH and some degree of liver fibrosis. 83 As diabetes clinics have long been established to assess patients with diabetes for various associated complications, co-localizing liver assessment in diabetes clinics is a reasonable and pragmatic approach. For example, liver assessment has been included in the clinical practice guidelines for the management of T2DM in Malaysia, including the use of the two-step approach with FIB-4 and liver stiffness measurement described above (see Figure 1).84,85 However, as the prevalence of advanced liver fibrosis is higher among patients with T2DM, there has been concern about underperformance of fibrosis scores in a two-step approach, and the direct use of liver stiffness measurement has been considered. 86 This requires further study and would also depend very much on the availability of local resources.

Figure 1. Example of an algorithm utilizing sequential non-invasive tests for screening for more severe metabolic dysfunction–associated fatty liver disease (MAFLD) among patients with type 2 diabetes mellitus (based on the Malaysian Society of Gastroenterology and Hepatology Consensus Statement on MAFLD 85 ).
A recent individual patient data meta-analysis of 5735 patients found that the sequential combination of FIB-4 (< 1.3) and liver stiffness measurement (< 8 kPa) had a false negative rate of 9% for advanced liver fibrosis. About 33% of patients, namely those with FIB-4 ⩾ 1.3 and < 2.67 together with liver stiffness measurement ⩾ 8 kPa, or with FIB-4 ⩾ 2.67, were considered as requiring a liver biopsy. In addition, using upper cut-offs of FIB-4 ⩾ 3.48 and liver stiffness measurement ⩾ 20 kPa to diagnose cirrhosis with a specificity of 95% can reduce the need for liver biopsy from 33% to 19%. 60
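The sequential rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not a validated clinical tool: the function names and the returned labels are hypothetical, the FIB-4 formula is the standard published one (age × AST divided by platelet count × √ALT) rather than something stated in this section, and the upper rule-in cut-offs for cirrhosis are left out because the text does not specify how they are combined.

```python
import math


def fib4(age_years, ast_u_l, alt_u_l, platelets_10e9_l):
    """Standard FIB-4 index: (age x AST) / (platelets x sqrt(ALT))."""
    return (age_years * ast_u_l) / (platelets_10e9_l * math.sqrt(alt_u_l))


def two_step_triage(fib4_score, lsm_kpa=None):
    """Hypothetical triage labels using the cut-offs quoted in the text.

    Step 1: FIB-4 < 1.3 rules out; FIB-4 >= 2.67 goes straight to biopsy.
    Step 2: for indeterminate FIB-4, liver stiffness < 8 kPa rules out.
    """
    if fib4_score < 1.3:
        return "low risk"
    if fib4_score >= 2.67:
        return "consider liver biopsy"
    if lsm_kpa is None:
        return "needs liver stiffness measurement"
    if lsm_kpa < 8.0:
        return "low risk"
    return "consider liver biopsy"


# Example (made-up patient): age 55, AST 40 U/L, ALT 36 U/L, platelets 200 x 10^9/L
score = fib4(55, 40, 36, 200)          # 2200 / (200 * 6), about 1.83 (indeterminate)
print(round(score, 2), two_step_triage(score, lsm_kpa=9.5))
```

Because only patients with an indeterminate FIB-4 reach the second step, liver stiffness measurement is reserved for the minority in whom the cheap first test cannot decide.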
Instead of liver stiffness measurement, an additional blood-based biomarker has also been explored as the second test in a two-step approach. The use of the ELF test in patients with an indeterminate FIB-4 score was found to reduce unnecessary referrals to specialist clinics by 81% and led to a fivefold increase in accurate referrals of patients with advanced liver fibrosis. 87 However, the cost and limited availability of the ELF test may limit its widespread use in non-specialist clinics.
Non-invasive tests for prognostication
In a systematic review and meta-analysis by Taylor et al., 43 which included 13 studies with a total of 4428 patients with NAFLD, fibrosis stage was found to be linked with liver outcomes in NAFLD regardless of confounders, including age and sex, as well as in the subgroup of NAFLD patients with NASH. Therefore, it is conceivable that non-invasive tests for fibrosis would have prognostic value. Angulo et al. 41 demonstrated, in an international multicenter study of 320 biopsy-diagnosed NAFLD patients, that the fibrosis scores NFS, FIB-4, and APRI were able to predict all-cause mortality over a median follow-up of 8.7 years, but the study was not large enough to examine disease-specific mortality. Xun et al. demonstrated that, among Chinese patients with NAFLD, the NFS showed good predictive value for 6.6-year overall mortality but, once again, the study was not able to assess disease-specific mortality. 41 Subsequently, a larger population-based, prospective survey in the United States, with up to 23 years of linked-mortality data, showed that intermediate to high fibrosis scores based on APRI, FIB-4, and NFS among persons without viral hepatitis, a group that includes those with NAFLD, were associated with increased overall and liver disease mortality. 88 Recently, Lee et al. 89 concurred in their systematic review that FIB-4, NFS, and APRI showed consistently good ability to prognosticate liver-related events among adults with NAFLD. In terms of prognosticating mortality, FIB-4 and NFS were shown to outperform APRI. Serial use of FIB-4 and NFS can be considered in clinical practice, as both scores correlate well with disease progression and hence enhance risk stratification among patients with NAFLD, with performance almost comparable to that of liver biopsy. Nevertheless, none of these fibrosis markers had consistent accuracy in predicting change in fibrosis stage. 89
It could be postulated that serum biomarkers that directly evaluate components of extracellular matrix, fibrogenesis or fibrinolysis may be more precise in prognostication than indirect serum markers. The ELF test demonstrated relatively good predictive value for liver-related morbidity and mortality in patients with chronic liver disease of mixed etiology, including NAFLD, and hence may be a valuable prognostic tool in clinical practice. 90 In a cohort of patients with NASH and advanced fibrosis, the baseline ELF score predicted progression from bridging fibrosis to cirrhosis, and from cirrhosis to clinical events, better than histology. However, change in ELF score did not further improve prediction compared with baseline values. 91 Liver stiffness measurement using Fibroscan has been shown to categorize NAFLD patients into subgroups with different prognoses, and this has been suggested to provide better risk stratification and prognostication than histology. 92 A study found that patients classified as high-risk on repeat liver stiffness measurement, similar to those so classified on repeat liver biopsy, had significantly more liver-related complications. 93 MRE has also been shown to predict development of cirrhosis in non-cirrhotic NAFLD patients and the development of decompensation and death in cirrhotic NAFLD patients. 94
Non-invasive tests for monitoring of treatment response
There is currently no FDA-approved pharmacotherapy for MAFLD. Demonstration of greater NASH resolution without worsening fibrosis, or of fibrosis improvement without worsening NASH, is a prerequisite for conditional approval of a study drug in NASH clinical trials, with full approval being granted when the results of the extended study period show benefits in clinical outcomes. However, there is increasing recognition of the limitations of the current method of evaluating changes on repeat liver biopsy in clinical trials. For example, a study found that the inter-observer agreement for NASH resolution without worsening fibrosis and for fibrosis improvement without worsening NASH was only 0.40 and 0.37, respectively. 95 Furthermore, evaluating treatment response using repeat liver biopsy is not practical outside the context of a clinical trial. Therefore, development of non-invasive tests for monitoring of treatment response (besides identifying at-risk patients) is an active area of research and forms an integral part of many on-going NASH clinical trials. A secondary analysis of data from the Farnesoid X Receptor (FXR) Ligand Obeticholic Acid in Non-alcoholic Steatohepatitis (NASH) Treatment (FLINT) trial identified a decrease in serum ALT level of 17 U/L or more to be significantly associated with histologic response. 96 CK-18 fragments have been studied to see whether dynamic changes in their serum level could predict histological NAS improvement or resolution of NASH. However, data from a clinical trial showed that changes in serum CK-18 level did not add value to changes in ALT with regard to assessing improvement in liver histology. 97
Longitudinal reductions in liver fat content based on MRI-PDFF have been found to be associated with histologic response.98–100 A secondary analysis of data from the MOZART trial found that a relative reduction of 29% in liver fat content on MRI-PDFF was associated with histologic response (defined as ⩾ 2-point reduction in NAS without worsening fibrosis). 98 A secondary analysis of data from a phase II study of selonsertib demonstrated that any reduction in MRI-PDFF was predictive of steatosis improvement on liver biopsy with an AUROC of 0.70 (95% CI: 0.57–0.83). 99 The AUROC of MRI-PDFF to predict NAS response (defined as ⩾ 2-point reduction) was 0.70 (95% CI: 0.51–0.89) and the optimal threshold was ⩾ 25% relative reduction in MRI-PDFF from baseline. This study also found that any reduction in liver stiffness on MRE was predictive of fibrosis stage improvement with an AUROC of 0.62 (95% CI: 0.46–0.78). A secondary analysis of data from the FLINT trial demonstrated that the optimal cut-off for relative decline in liver fat content on MRI-PDFF for histologic response (defined as ⩾ 2-point reduction in NAS without worsening fibrosis) was 30%. 100 MRI-PDFF responders (defined as those who achieved ⩾ 30% decline in MRI-PDFF relative to baseline) had significantly higher odds of histologic response. A subsequent meta-analysis that included seven studies and 346 subjects demonstrated that a ⩾ 30% decline in MRI-PDFF was associated with higher odds of histologic response (defined as ⩾ 2-point reduction in NAS with ⩾ 1-point reduction in lobular inflammation or ballooning) and NASH resolution. Further studies are needed to identify non-invasive tests, and the thresholds of change in them, that can be reliably used to monitor treatment response when pharmacological treatment for MAFLD becomes available in the near future.
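The MRI-PDFF thresholds quoted above are relative, not absolute, reductions, which is easy to misread. The short Python sketch below illustrates the arithmetic under that reading; the function names are hypothetical, the example values are invented for illustration, and the 30% default simply mirrors the responder definition cited in the text.

```python
def relative_pdff_reduction(baseline_pdff, followup_pdff):
    """Relative decline in MRI-PDFF, as a percentage of the baseline value."""
    if baseline_pdff <= 0:
        raise ValueError("baseline MRI-PDFF must be positive")
    return (baseline_pdff - followup_pdff) * 100.0 / baseline_pdff


def is_pdff_responder(baseline_pdff, followup_pdff, threshold_pct=30.0):
    """True if the relative decline meets the threshold (30% by default)."""
    return relative_pdff_reduction(baseline_pdff, followup_pdff) >= threshold_pct


# Made-up example: liver fat falls from 20% to 13% on MRI-PDFF.
# The absolute drop is 7 percentage points; the relative drop is 35%.
print(relative_pdff_reduction(20.0, 13.0), is_pdff_responder(20.0, 13.0))
```

A fall from 20% to 16% is a 4-point absolute drop but only a 20% relative decline, so it would not meet the responder definition.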
Conclusion
In conclusion, although ultrasonography is the most widely used test for the diagnosis of fatty liver, numerous other tests are available and have their respective roles in the armamentarium for non-invasive assessment of MAFLD: from simple blood-based scores for diagnosis of hepatic steatosis in epidemiological studies, to simultaneous diagnosis of hepatic steatosis using attenuation parameter when performing liver stiffness measurement for patients with other chronic liver diseases, to the highly accurate MRI-based techniques for quantification of hepatic steatosis and fibrosis for clinical trial purposes; and from combinations of blood-based and imaging biomarkers for diagnosis of fibrotic NASH, to the use of simple blood-based fibrosis scores to exclude advanced liver fibrosis followed by liver stiffness measurement for intermediate- and high-risk patients for further risk stratification and prognostication. Further studies are needed to refine the use of these non-invasive tests, particularly for selecting patients for pharmacological treatment and monitoring their response to treatment.
