Abstract
Background
Liver function tests (LFTs) are among the most requested blood test profiles, so there is a need for standardization. Alanine aminotransferase (ALT) and aspartate aminotransferase (AST) are key markers of hepatocellular injury and are used to calculate the Fibrosis-4 (FIB-4) score for assessing liver fibrosis. Despite recommendations by the International Federation of Clinical Chemistry (IFCC) to include pyridoxal-5-phosphate in ALT and AST assay methodologies, most laboratories continue to omit it.
Methods
Data from the UK NEQAS for Clinical Chemistry Scheme, Distribution 1160 (November 2023), was reviewed to investigate variation in practice regarding liver blood tests in relation to ALT, AST and FIB-4. In addition, a series of questions audited laboratory practice in relation to liver enzymes.
Results
Wide variation was seen in LFT profiles offered by laboratories, with 32 different combinations of tests used. The IFCC-recommended methods for ALT and AST are used by one-third of laboratories and give significantly higher results than non-IFCC methods. Laboratories using IFCC methods also reported significantly higher FIB-4 scores. Reference ranges and cut-offs for these tests also varied, and did not account for method-related differences in results.
Conclusions
The lack of standardization of LFTs can have a significant impact on patient care. The difference in results for ALT, AST and FIB-4 in laboratories not using IFCC-recommended methods may lead to misdiagnosis. This issue should be addressed by laboratories using methods including pyridoxal-5-phosphate. Until then, method-related reference ranges and cut-offs for ALT, AST and FIB-4 are required.
Introduction
When investigating and monitoring liver disease, blood tests are an essential part of the workup of a patient. Liver disease is often asymptomatic or associated with non-specific symptoms, meaning blood tests can identify cases before the development of late-stage disease. As a result, liver function tests (LFTs) are the third most commonly requested test in primary care, after renal function tests and full blood counts.2,3 This suggests a requirement for standardization of liver blood testing profiles across the UK. The British Society of Gastroenterology (BSG) published their
Initial investigations for liver disease are designed to have a high sensitivity for detecting disease, while also not having too high a false positive rate. The BSG guidelines recommend that a liver blood test panel should include bilirubin, albumin, ALT, ALP and GGT, along with a full blood count. 4 This offers a high sensitivity for detecting liver disease, with the benefits considered to outweigh the associated implications of any false positives.
The main tests used as markers of hepatocellular injury are alanine aminotransferase (ALT) and aspartate aminotransferase (AST). These are enzymes found in hepatocytes, with the increased specificity of ALT for the liver making it the preferred test for initial investigations. In clinical laboratories, ALT and AST activities are determined in serum by measuring the rate of transfer of an amino group from alanine or aspartate to 2-oxoglutarate. These reactions are coupled to the oxidation of NADH to NAD+, which can be measured by the decrease in absorbance at a wavelength of 340 nm. The International Federation of Clinical Chemistry (IFCC) recommends that pyridoxal-5-phosphate (PLP) is added to the reaction and allowed to saturate the enzyme. 5 PLP is a co-factor for both enzymes, converting the inactive apoenzyme (inactive protein of enzyme) into the active holoenzyme (functional enzyme) which is able to catalyse the transfer of amino groups. Therefore, the addition of PLP to the reaction ensures that any PLP deficiency in patient serum does not affect the observed enzyme activity.
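The measurement principle described above can be illustrated with a short calculation. The sketch below is not the IFCC reference procedure. It only applies the Beer-Lambert law to the fall in absorbance at 340 nm, assuming one mole of NADH is oxidised per mole of amino group transferred. The function name, its parameters and the default NADH molar absorptivity of 6220 L/(mol·cm), a commonly quoted textbook value, are assumptions introduced for illustration.

```python
def aminotransferase_activity_u_per_l(
    delta_abs_per_min: float,
    total_volume_ml: float,
    sample_volume_ml: float,
    path_length_cm: float = 1.0,
    nadh_molar_absorptivity: float = 6220.0,  # L/(mol*cm); assumed textbook value
) -> float:
    """Estimate enzyme activity (U/L of serum) from the rate of absorbance change.

    Illustrative only: assumes 1:1 stoichiometry between NADH oxidised and
    amino groups transferred, and a linear (zero-order) reaction phase.
    """
    if min(total_volume_ml, sample_volume_ml, path_length_cm,
           nadh_molar_absorptivity) <= 0:
        raise ValueError("volumes, path length and absorptivity must be positive")

    # Beer-Lambert law: concentration change (mol/L/min) = |dA/min| / (epsilon * path)
    nadh_mol_per_l_per_min = abs(delta_abs_per_min) / (
        nadh_molar_absorptivity * path_length_cm
    )
    # 1 U = 1 micromole of substrate converted per minute
    activity_in_cuvette = nadh_mol_per_l_per_min * 1e6
    # Correct for dilution of the serum in the reaction mixture
    return activity_in_cuvette * (total_volume_ml / sample_volume_ml)
```

Because PLP supplementation converts apoenzyme to holoenzyme before the rate is measured, it would appear in this calculation only as a larger observed absorbance change per minute for the same specimen.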
PLP is the active form of vitamin B6, a water-soluble vitamin found in many foods. The National Diet and Nutrition Survey Rolling Programme has found vitamin B6 deficiency to be present in 10% of the UK population.6 Deficiency is often associated with increasing age, alcoholism and chronic liver disease.6–8 As these are patients who are likely to have ALT and AST activities measured, this indicates the importance of considering the effect of vitamin B6 deficiency on the results obtained. Patients with vitamin B6 deficiency have been shown to have lower transaminase results when PLP is not supplemented
The prevalence of non-alcoholic fatty liver disease (NAFLD) is rising with the increasing rates of obesity in the UK, with approximately 25%–33% of the UK population estimated to have NAFLD.16,17 While most cases of NAFLD can be managed in primary care with lifestyle changes, those with liver fibrosis require treatment to prevent progression to end-stage liver disease. 17 Liver biopsy is the gold standard for detecting fibrosis; however, there is increasing evidence for the use of liver blood tests in algorithms for non-invasive assessment of fibrosis. Several non-invasive markers of fibrosis have been validated including FibroScan, Fibrosis-4 (FIB-4) score, NAFLD fibrosis score (NFS) and Enhanced Liver Fibrosis (ELF) testing.18–21 Of these, FIB-4 has been suggested to be one of the most useful tests for detecting advanced fibrosis. 17
The FIB-4 score is recommended by the BSG for first-line assessment of fibrosis in patients with NAFLD. 4 The score is calculated as: (age × AST)/(platelets × (√ALT)). The BSG recommends that low FIB-4 scores (<1.3 for those aged <65 years or <2 for those >65 years) are managed in primary care, and elevated FIB-4 scores (>3.25) should be considered for referral to a hepatologist. 4 Patients with intermediate scores between these two cut-offs are recommended to have further testing, such as ELF testing or Fibroscan. 4 FIB-4 is also recommended for assessment of those with NAFLD in the National Institute for Health and Care Excellence (NICE) Clinical Knowledge Summary, which recommends that an FIB-4 score >2.67 suggests advanced liver fibrosis and requires referral to hepatology. 22 FIB-4 is increasingly being set up as a requestable test in Laboratory Information Management Systems (LIMS), which can calculate scores based on the ALT, AST and platelet counts measured in the laboratory; the age comes from the patient demographics. Despite it being well established that including PLP in ALT and AST methods has an effect on the results obtained, there is limited data about the effect of this on the calculated FIB-4 score.
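The calculation and cut-offs described above can be written as a short sketch. It assumes the conventional units for the formula (age in years, AST and ALT in U/L, platelets in 10^9/L), which the text does not state explicitly. The function names are hypothetical, and the treatment of a patient aged exactly 65 is an assumption, since the cut-offs quoted are given for <65 and >65 years only.

```python
import math


def fib4_score(age_years: float, ast_u_per_l: float,
               alt_u_per_l: float, platelets_10e9_per_l: float) -> float:
    """FIB-4 = (age x AST) / (platelets x sqrt(ALT))."""
    if min(age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l) <= 0:
        raise ValueError("all inputs must be positive")
    return (age_years * ast_u_per_l) / (
        platelets_10e9_per_l * math.sqrt(alt_u_per_l)
    )


def bsg_category(score: float, age_years: float) -> str:
    """Apply the BSG cut-offs quoted in the text.

    Assumption: age 65 exactly is handled with the older-age cut-off.
    """
    low_cutoff = 1.3 if age_years < 65 else 2.0
    if score < low_cutoff:
        return "low"
    if score > 3.25:
        return "high"
    return "intermediate"


def nice_suggests_advanced_fibrosis(score: float) -> bool:
    """NICE Clinical Knowledge Summary threshold quoted in the text."""
    return score > 2.67


# Hypothetical patient: 50 years, AST 40 U/L, ALT 36 U/L, platelets 200 x 10^9/L
score = fib4_score(50, 40, 36, 200)  # 2000 / (200 * 6) = 1.67 to 2 d.p.
category = bsg_category(score, 50)   # "intermediate"
```

In this hypothetical case the score falls between the BSG cut-offs, so further testing would be recommended, while the NICE threshold of 2.67 is not exceeded.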
This study aimed to investigate the variation in practice regarding ALT, AST and FIB-4 liver blood tests. This includes the variation in LFT profiles offered by laboratories, along with method-dependent effects on ALT, AST and FIB-4 results, and also the reference ranges and cut-offs used by laboratories for these tests.
Methods
Scheme design for FIB-4 within the UK NEQAS for Clinical Chemistry Scheme
Three off-the-clot serum specimens are distributed every 2 weeks to participants of the UK NEQAS for Clinical Chemistry Scheme. Patient demographics (sex and age) are provided to book the specimens into the participant’s LIMS to allow calculation of FIB-4 results. A platelet count is also provided. Laboratories analyse the specimens for all analytes that they are registered for within the scheme. The ALT and AST values, along with the patient age and platelet count, are used by the participant to calculate the FIB-4 result by the same equation as used for patient specimens.
At Distribution 1160 (November 2023), individual 0.5 mL aliquots of pooled serum were distributed to participants. The pooled serum was itself composed of remnant serum taken from several hundred gel SST tubes which had been stored at 4°C–8°C for a maximum of 7 days. The remnant serum was from clinical specimens which had previously been analysed in a clinical laboratory, found to have raised ALT results and ‘binned’ according to ALT activity. This was then stored frozen at −80°C before being processed by Birmingham Quality. Each pool of serum was filtered to 1 µm using Whatman Filter Paper (Merck, Gillingham, UK) to remove any clots. Gentamicin (DEMO S.A. Pharmaceutical Industry, UHB Pharmacy) was added as a preservative to a concentration of 100 mg/L. Specimens were stored frozen at −40°C until dispatch, with participants advised to analyse them immediately, as if the specimens were from a patient.
The specimens were labelled 1160A, 1160B and 1160C and distributed to participants in 1.5 mL amber tubes (Sarstedt, Leicester, UK). At Distribution 1160, specimens were sent to approximately 700 participants, the majority being based in the UK. For each specimen, participants report their results for all analytes that they are registered for online (manually or automatically) to the organizing centre at Birmingham Quality. Participants are also able to check the method that they are registered for and change it if required, to ensure that they are registered for the correct method.
Data analysis
All data is subject to statistical analysis. ALT and AST results are evaluated at both the method principle and the manufacturer analyser level. Method principles refer to the use of either the IFCC method (with PLP supplementation) or the non-IFCC method (without PLP supplementation). Laboratories using dry slide methodologies are classed as a third, separate method principle. Within these method principles, laboratories are classified by the manufacturer analyser used, such as Abbott Alinity or Roche Cobas. A mean, standard deviation and coefficient of variation are calculated for each method. The target value used for ALT and AST is the Method Laboratory Trimmed Mean (MLTM), which includes all laboratories using the same manufacturer and method principle, with the highest and lowest 5% of results removed by Healy trimming.23 It is worth noting that this does not automatically imply that the MLTM is the ‘correct answer’, but it is used as a target value since the correct answer is not readily available.
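The summary statistics described in this section can be sketched as follows. This is a simplified illustration, not the Scheme's implementation: it removes the highest and lowest 5% of results and then computes an ordinary mean, standard deviation and coefficient of variation. The cited Healy procedure may apply further adjustments that are not reproduced here, and the function name and rounding of the trim count are assumptions.

```python
import statistics


def trimmed_summary(results: list[float], trim_fraction: float = 0.05):
    """Return (mean, SD, CV%) after symmetric trimming of a method group.

    Assumption: the number of results trimmed from each tail is
    floor(n * trim_fraction); no correction is applied to the SD.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")

    ordered = sorted(results)
    n_trim = int(len(ordered) * trim_fraction)  # results dropped per tail
    kept = ordered[n_trim:len(ordered) - n_trim]
    if len(kept) < 2:
        raise ValueError("too few results left after trimming")

    mean = statistics.fmean(kept)
    sd = statistics.stdev(kept)   # sample standard deviation
    cv_percent = 100.0 * sd / mean
    return mean, sd, cv_percent


# Hypothetical method group: 18 concordant results plus two outliers
group = [30.0] * 18 + [1.0, 500.0]
target, sd, cv = trimmed_summary(group)  # outliers trimmed, target = 30.0
```

Trimming keeps a small number of aberrant results from shifting the target value, which matters when the target is itself derived from participant data.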
As FIB-4 is a derived analyte, it does not in itself have an associated method for analysis. Therefore, participants are categorised based on their method groups for ALT and AST. If both analytes are measured using IFCC methods, then participants are classified into an IFCC method principle for FIB-4. Participants using non-IFCC methods for one or both of ALT and AST are classified into a non-IFCC method principle. Further classification into method groups based on the manufacturer analyser is the same as for AST and ALT. Once again, the target value is the MLTM, despite this not necessarily being the ‘correct result’.
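The classification rule for FIB-4 can be expressed as a minimal sketch. The labels "IFCC" and "non-IFCC" and the function name are hypothetical. The text does not say how dry slide methods are handled for FIB-4, so any other label is rejected here rather than guessed.

```python
def fib4_method_principle(alt_principle: str, ast_principle: str) -> str:
    """Derive the FIB-4 method principle from the ALT and AST principles.

    IFCC only if both analytes use an IFCC (PLP-supplemented) method;
    non-IFCC if one or both use a non-IFCC method.
    """
    allowed = {"IFCC", "non-IFCC"}
    if alt_principle not in allowed or ast_principle not in allowed:
        # Assumption: other principles (e.g. dry slide) are out of scope
        raise ValueError("principle must be 'IFCC' or 'non-IFCC'")

    if alt_principle == "IFCC" and ast_principle == "IFCC":
        return "IFCC"
    return "non-IFCC"
```

A laboratory using an IFCC method for ALT but a non-IFCC method for AST would therefore be grouped as non-IFCC for FIB-4.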
A comparison was performed for the ALT, AST and FIB-4 results between the IFCC and non-IFCC method principles for each manufacturer. Any method group which had a minimum of 4 results contributing to its method mean was selected for analysis. Differences between method principles were analysed with unpaired
Survey into the use of liver function tests
Questions were asked of all participants of the UK NEQAS for Clinical Chemistry Scheme in a question-and-answer section included at Distribution 1160.
Questions 7–13 applied to those participants that report FIB-4 scores. Since the formation of pathology networks in the UK as part of the NHS Long Term Plan, 24 several laboratories entered in the Scheme belong to the same network. Each laboratory code registered to the Scheme was counted as an individual response to the Q&A, in order to better reflect workload, but this may be an underestimation as some laboratories may have only submitted one response and not indicated which other laboratory codes their replies applied to. Question 5 related to tests included in a liver function test profile. The data was reviewed for UK participants only as different countries may have their own requirements which could skew the data.
Results
Method-related variation in ALT and AST
For Specimens at Distribution 1160, 601 participants returned results for ALT and 446 participants returned results for AST. To investigate the difference between IFCC and non-IFCC method principles, only the methods for which there were two or more users for both the IFCC and non-IFCC method were studied. This included 536 laboratories for ALT and 374 laboratories for AST. Of these, 185 (35%) laboratories submitted ALT results using an IFCC method, and 125 (33%) submitted AST results using an IFCC method.
Figure 1 shows box and whisker plots of the distribution of results submitted for ALT and AST for methods which had data for both IFCC and non-IFCC method principles. IFCC methods are coloured green and non-IFCC methods are orange. This demonstrates that the IFCC methods for both ALT and AST produce higher results than the non-IFCC methods on the same analysers. This difference between method principles is greater for AST than ALT and is also more noticeable for ALT with the Abbott and Roche Cobas methods than the Beckman AU and Siemens Atellica methods.
Figure 1. AST and ALT results for Specimens 1160A, 1160B and 1160C, with each manufacturer analyser split into the IFCC and non-IFCC method principle. The mean score for each group is shown as an ‘X’, while the box and whisker shows the median, interquartile range and 5th and 95th percentiles.
Unpaired
Method-related variation in FIB-4
FIB-4 results were returned by 64 participants for Specimens 1160A, 1160B and 1160C. Abbott Alinity and Roche Cobas were the only two manufacturer analysers for which participants classified as both IFCC and non-IFCC returned results, allowing comparative statistics to be undertaken. In total, 41 results were submitted for these methods, of which 19 (46%) used IFCC methods and 22 (54%) used non-IFCC methods.
Figure 2 shows the distribution of FIB-4 results submitted by Abbott Alinity and Roche Cobas users, comparing the IFCC and non-IFCC method groups. This demonstrates that laboratories using IFCC methods for AST and ALT obtained higher FIB-4 scores than those using non-IFCC methods on both the Abbott Alinity and Roche Cobas analysers. For the Abbott Alinity, the IFCC results were significantly higher than the non-IFCC results for all three specimens.
Figure 2. FIB-4 scores for Specimens 1160A, 1160B and 1160C reported by Abbott Alinity and Roche Cobas users, separated into those using the IFCC method principle for both ALT and AST and those using non-IFCC method principles. The mean score for each group is shown as an ‘X’, while the box and whisker shows the median, interquartile range and 5th and 95th percentiles.
Variation in LFT profiles across UK laboratories
When asked the question ‘What tests are included in your liver function test profile?’, 298 responses were received from laboratories in the UK. There were 32 different combinations of tests mentioned in these responses. Figure 3 shows the proportion of laboratories using the most commonly answered profiles, with the less common combinations of tests being grouped together as ‘Other’. The most commonly used LFT profile, used by 94 (32%) UK laboratories who responded, is ALT, albumin, ALP and total bilirubin. Another 42 (14%) laboratories also include total protein in their LFT profiles, and a further 37 (13%) report the calculated globulin level. The next most frequently seen LFT profiles included AST, with some also including GGT. Less commonly included analytes were calcium, LDH, direct bilirubin, creatinine, electrolytes and ferritin.
Figure 3. Range of Liver Function Test profiles used in the UK, out of 298 laboratories who responded to the UK NEQAS for Clinical Chemistry Scheme question and answer. The 11 most common profiles are shown, with the 21 least common profiles being grouped together as ‘Other’.
Variation in ALT and AST reference ranges
When asked for reference ranges used for ALT and AST, there were 354 responses regarding ALT and 342 responses regarding AST. The ALT and AST reference ranges for both adult males and adult females were provided by laboratories. It was found that 141 laboratories (40%) used the same ALT reference ranges for males and females, while for AST this was 154 laboratories (45%). The remaining 60% for ALT and 55% for AST use different reference ranges depending on sex.
Figure 4 shows the reference ranges used by laboratories for AST and ALT, split into male and female and grouped firstly by method principle (IFCC vs non-IFCC) and then further by manufacturer analyser. Each stick on the graph represents one laboratory. From this it can be seen that there is a high degree of variation in reference ranges used by laboratories, even within the same method groups. For both ALT and AST, in males there is on average a higher upper limit of the reference range in the IFCC methods than non-IFCC methods, whereas in females there is no distinct difference between the upper limit of the reference ranges in the two method principles. In both males and females for ALT and AST, there are several laboratories using non-IFCC methods who have higher upper limits of the reference range than laboratories using IFCC methods.
Figure 4. AST and ALT reference ranges by method, separated for adult males and adult females. Method groups are sorted with the IFCC methods on the left and the non-IFCC methods on the right. The reference range used by each laboratory is shown by a single bar, with the bottom of the bar representing the lower limit of the range, and the top of the bar representing the upper limit of the range.
Variation in FIB-4 cut-offs and follow-up action
When asked whether participants report FIB-4 scores, 106 laboratories answered ‘Yes’. As shown in Figure 5, a range of different cut-offs was used by these laboratories. The most common cut-offs used are <1.3 defining low risk of fibrosis and >3.25 defining high risk of fibrosis, with an intermediate range between these two values. While 29 (27%) laboratories use these cut-offs for all ages, 16 (15%) laboratories use a cut-off of <2 instead of <1.3 for patients over 65. Another 21 (20%) laboratories use a cut-off of >2.67 to define a high risk of fibrosis. Varying combinations of these different cut-off values are used. While 81 (76%) laboratories report an intermediate range between upper and lower cut-off values, there are 25 (24%) laboratories who have a single cut-off, with values below the cut-off reported as low risk and values above the cut-off reported as high risk.
Figure 5. FIB-4 cut-offs used in different laboratories. Each laboratory is represented by one unit on the
Laboratories were also asked whether they take any action based on FIB-4 results, to which 24 out of 109 laboratories responded ‘Yes’. This included 16 laboratories who add an Enhanced Liver Fibrosis (ELF) test on to indeterminate results, while 8 others add a comment recommending ELF/FibroScan/referral. The remaining 85 laboratories do not take any actions based on FIB-4 results.
Discussion
The BSG guidelines on the management of abnormal liver blood tests were developed in 2017 with the intention of providing ‘some’ standardization across the UK regarding liver blood testing.4 This included recommendations on which initial tests should be included when investigating liver disease, how to follow-up abnormal tests and how to investigate the risk of liver fibrosis. Additionally, the IFCC provided recommendations for the appropriate conditions for measuring ALT and AST in serum as far back as 1985, which includes the
When requesting a ‘liver function test’ profile from the laboratory, clinicians may expect to receive a standard profile of tests across the UK. However, it was found in this study that there were 32 different combinations of laboratory tests being included in LFT profiles by laboratories across the UK. The BSG 4 recommends that ALT, albumin, ALP, total bilirubin and GGT are included in the profile, but this combination was only reported by three laboratories (1%). With LFTs being one of the most requested tests by GPs, this is an area which may affect the care of many patients across the country. 2 Laboratories using smaller LFT profiles are more likely to miss liver disease in some patients, while those with more extensive profiles are likely to see more false positives. Given that AST has the biggest impact on the FIB-4 calculation it is anomalous that AST is not in the recommended list of analytes.
Regarding the methods used for measurement of ALT and AST, only approximately one-third of laboratories were found to include PLP as is recommended by the IFCC. A significant difference in results between the IFCC and non-IFCC methods for ALT and AST has been demonstrated here. On the same manufacturer analysers, methods including
In addition to this difference in the results obtained for ALT and AST, variation was also found in the reference ranges used by different laboratories. It has been suggested that the reference ranges for ALT and AST should be higher when PLP is included compared to when it is not.9 However, our findings show that while it is true that some laboratories using IFCC methods do have higher reference ranges, there are many laboratories using non-IFCC methods with reference ranges which are the same or higher. This demonstrates that the differences in results obtained are not always being accounted for by the reference ranges. Additionally, the effect of
Finally, variation in FIB-4 results and cut-offs was also observed. Laboratories using the IFCC method for ALT and AST reported higher FIB-4 scores than those using non-IFCC methods for all three specimens. This observed effect is likely to be due to PLP causing a greater increase in AST activity than ALT activity, which when put into the FIB-4 calculation leads to a higher FIB-4 score from IFCC methods. This finding means that the risk of liver fibrosis may appear lower in laboratories not including PLP in their ALT and AST methods, which could lead to patients not receiving the follow-up they may require. While these differences between IFCC and non-IFCC methods have been widely reported for ALT and AST, there has been limited evidence of the impact of this on FIB-4 results until now. Furthermore, a range of different cut-offs is used by laboratories for interpreting FIB-4 results. The cut-offs recommended by the BSG 4 (<1.3 if <65 or <2 if >65, and >3.25) were used by 15% of laboratories, with many other laboratories using modified versions of these cut-offs, such as without adjusting for age, or just using one of the cut-offs. Another 20% of laboratories use an upper cut-off of 2.67 as recommended by NICE. 22
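The arithmetic behind this explanation can be made explicit. Because AST appears in the numerator of the FIB-4 formula and ALT under a square root in the denominator, multiplying AST by a factor a and ALT by a factor b, with age and platelets unchanged, multiplies the score by a/√b. The sketch below uses hypothetical ratios chosen only for illustration; they are not results from this study, and the function name is an assumption.

```python
import math


def fib4_scaling_factor(ast_ratio: float, alt_ratio: float) -> float:
    """Factor by which FIB-4 changes when AST and ALT are each rescaled.

    ast_ratio and alt_ratio are (new result / old result), for example
    IFCC result divided by non-IFCC result for the same specimen.
    """
    if ast_ratio <= 0 or alt_ratio <= 0:
        raise ValueError("ratios must be positive")
    return ast_ratio / math.sqrt(alt_ratio)


# Hypothetical: PLP raises AST by 20% and ALT by 10%
factor = fib4_scaling_factor(1.20, 1.10)

# Hypothetical: PLP raises both enzymes by the same 21%
equal_rise = fib4_scaling_factor(1.21, 1.21)  # sqrt(1.21), i.e. about 1.1
```

Under the formula, a larger proportional rise in AST than in ALT raises the score, and even an equal proportional rise in both enzymes raises it by the square root of that ratio, so a score close to a cut-off could change category between method principles.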
These findings demonstrate that the same patient could receive different interpretation of their transaminase results and FIB-4 scores at different laboratories. This is due to both variation in the reference ranges used and differences in results obtained due to the use of both IFCC and non-IFCC methods. Since the non-IFCC methods give lower results, this could lead to false negatives when investigating potential liver disease or liver fibrosis. It is already believed that ALT reference ranges may be too high due to the inclusion of patients with NAFLD in the reference group, and this problem may be made worse by the use of non-IFCC methods. 15 If laboratories were all using the IFCC method, issues regarding non-harmonized reference ranges could be addressed by introducing standardized reference ranges for all laboratories to apply. However, while the majority of laboratories continue to use non-IFCC methods this is not possible. 15 Conversely, attempts to provide standardized cut-offs for FIB-4 have been made by the BSG and NICE.4,22 These cut-offs differ from each other, and as shown here have not been adopted in a harmonized manner by laboratories. However, the fact that IFCC and non-IFCC ALT and AST methods give significantly different FIB-4 scores means that the use of standardized cut-offs with no consideration of methodology is not appropriate. It does not appear to be mentioned in the development of these guidelines whether they were derived using IFCC methods including PLP for ALT and AST. Again, if laboratories were all using the IFCC methods, then the use of standardized cut-offs would allow harmonization of how FIB-4 scores are interpreted. But until this is the case, we demonstrate here that method-related cut-offs may be more appropriate for the interpretation of FIB-4 scores.
In conclusion, these results demonstrate the lack of standardization regarding laboratory testing for liver disease, and how this can impact patient care. The variation in laboratory tests may not be appreciated by clinicians, and has not been considered during the development of clinical guidelines. As long as laboratories continue to use ALT and AST methods without PLP supplementation, there will continue to be unwarranted variation in the use of blood tests for investigating liver disease and assessing liver fibrosis. The decision to use non-IFCC methods is usually due to costs; however, the impact on patient care must be considered as a priority.
Footnotes
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Ethical approval
Ethical approval is not applicable. Remnant clinical material has been used, but this is covered by the Human Tissue Act 2004, which specifies that consent is not required for use of remnant clinical material for quality assurance purposes.1
Guarantor
RM.
Contributorship
All authors contributed to the review of data, writing and review of manuscript.
