Abstract

Autoimmune rheumatic diseases are often characterised by heterogeneity in presentation. The traditional approach to diseases guided by their phenotype may be suboptimal with the advent of precision medicine. Precision medicine is the integration and application of multiomics to predict the best-performing drug and its toxicity profile to derive optimal benefits. With novel drug discoveries and an expanding therapeutic armamentarium, it potentially aids in clinical and therapeutic decision-making, while saving time and averting adverse events. However, multiomics comes with ‘big data’, and owing to the costs, the sample size is usually small. Machine learning (ML) plays an important role in these scenarios where conventional statistics fall short. So, by integrating clinical data with the data from -omics, ML models can be built, which can accurately predict the clinical factors or even novel biomarkers that predict response. This approach has a potential for great benefit as valuable time or the ‘therapeutic window of opportunity’ would be saved, with fewer adverse events, eventually translating to lower damage accrual and better outcomes. Most of the evidence for the use of ML in precision rheumatology comes from rheumatoid arthritis and the factors predicting response to various drugs, including tumour necrosis factor inhibitors. This approach also has its limitations such as the lack of generalizability and the current scarcity of longitudinal data. These models must be tested in larger cohorts and population-based studies for validation, failing which there is a risk of apparent identification of multiple ‘novel’ biomarkers that may or may not be mechanistic.

Keywords

Machine learning, precision medicine, artificial intelligence, multiomics

Introduction

The traditional approach to diagnosis and treatment of autoimmune rheumatic diseases (AIRDs) has been phenotype-based. With the advent of multiomics, a paradigm shift has been anticipated. Precision medicine stems from this integrated ‘systems biology’ approach, where the -omics data of the patients is integrated to predict the best-performing, tailored therapeutics and in addition, anticipate toxicity, to achieve optimal outcomes.¹ However, the big data from multiomics present a challenge of their own, stretching the limits of conventional statistics. Thus, there is a perfect niche for machine learning (ML) to contribute to precision medicine.

There is a plethora of symptoms for various AIRDs. Though standard guidelines exist, at many places the choice of drugs is left to the treating physician. Owing to the heterogeneity of the disease manifestations and the presence of comorbidities, the response to a particular treatment may not always be predictable.² This leads to a ‘trial-and-error’ approach in clinical practice, during which the therapeutic window of opportunity may be missed, eventually leading to damage accrual owing to suboptimal disease control.³ The therapeutic armamentarium is gradually expanding with addition of newer drugs with novel targets. Another issue may be a low therapeutic index of some drugs, where the clinician has to decide the individual risks before prescribing.

In the background of these issues, precision medicine has gained traction in the last decade. Though the inception of precision medicine has been mostly in the field of oncology, there is growing evidence in immunology and rheumatology.

Artificial intelligence has immense potential in healthcare, and one of its major tools, ML, has been widely applied in scenarios such as prediction of disease inception, disease outcomes and treatment responses, and analysis of big data, among others. ML, in simple terms, gives a system the ability to learn from experience without being explicitly programmed to do so. There are four major types of ML models – supervised, semi-supervised, unsupervised, and reinforcement learning models, with supervised models having found most use in healthcare.⁴ While traditional statistical analyses primarily infer relationships between variables, ML models aim to make accurate predictions.⁵ So, by integrating the systems biology approach, combining different -omics data with supporting bioinformatics analysis from ML models, we can predict, treat and monitor these diseases, translating to better clinical outcomes and lower overall damage accrual.

Cracking the ‘code’ of the right biomarker, the right genome, and the right drug that will have the best effect on a patient will reduce ambiguity in choosing treatment and avoid the multiple trials needed to find the ‘right fit’ for an optimal response. Traditional statistical analysis may fall short in analysing the big data required for the robust implementation of precision or personalised medicine. ML is a great leap for science in this area, as it can predict the response to treatment while integrating the other available data. This would also conserve resources and avoid the hassle of frequent testing for adverse events of a medication while it is being tried on a patient.

In this review, we aim to explore the applications of ML models in the prediction of response to treatment in rheumatic diseases, which may potentially aid in identifying novel biomarkers and traditional risk factors. In addition, it can possibly be applied in conceptualising and building more robust algorithms and formulating guidelines for better application and integration of technological prowess to bridge the gap in achieving complete remission.

Search Strategy

We conducted a thorough search of PubMed/MEDLINE, Web of Science and Scopus with the Medical Subject Headings (MeSH) terms ‘machine learning’ AND ‘arthritis, rheumatoid’ OR ‘lupus erythematosus, systemic’ OR ‘spondylarthritis’ OR ‘vasculitis’ OR ‘Sjogren’s syndrome’ OR ‘scleroderma, systemic’ OR ‘reactive arthritis’ in various combinations. We also included relevant articles that were cross-referenced from these. We did not specify a time period for article inclusion. We included only papers published in English. Conference abstracts were excluded. The final review was written in adherence to the guidelines and the standard framework for writing a narrative review.⁶
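The Boolean combination described above can be written down explicitly. The following is a minimal sketch of one plausible reading of the strategy, in which the method term is ANDed with a parenthesised OR-group of disease terms. The grouping, the helper name `build_query`, and the omission of database-specific field tags are assumptions made for illustration; the text states only that the terms were used ‘in various combinations’.

```python
# Hypothetical helper: assemble a Boolean search string from the MeSH terms
# named in the search strategy. The parenthesised grouping is an assumption.
METHOD_TERM = "machine learning"
DISEASE_TERMS = [
    "arthritis, rheumatoid",
    "lupus erythematosus, systemic",
    "spondylarthritis",
    "vasculitis",
    "Sjogren's syndrome",
    "scleroderma, systemic",
    "reactive arthritis",
]


def build_query(method: str, diseases: list[str]) -> str:
    """Return '"method" AND ("d1" OR "d2" OR ...)' with explicit grouping."""
    disease_clause = " OR ".join(f'"{term}"' for term in diseases)
    return f'"{method}" AND ({disease_clause})'


query = build_query(METHOD_TERM, DISEASE_TERMS)
print(query)
```

Writing the parentheses out matters because AND and OR are not evaluated identically across databases; without grouping, the same list of terms can retrieve a different set of records.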

Discussion

ML is emerging as an apparently accurate tool for an expansive range of applications in healthcare and rheumatology. Most of the research and evidence for the applications of ML models in precision medicine have emerged from rheumatoid arthritis (RA), with recent evidence surfacing from systemic lupus erythematosus (SLE) and spondyloarthritis (SpA) too. Though ML models mitigate, to some extent, the analytical problems that small sample sizes pose for traditional statistical methods, they are plagued by other problems such as dimensionality. This is relevant in precision medicine, where big data from -omics bring a large number of variables and, with them, higher redundancy.⁷ But owing to the costs and inadequate funding, especially in developing countries like India, obtaining larger samples may not be feasible. In such scenarios, multiple variables may be checked to build a model with the best-performing set of variables.⁸ While most treatment is governed by standard guidelines, it may not be optimal. With precision medicine and integration of ML to identify the right variables or biomarkers to predict response to a particular drug, there is improved potential to optimise therapy, minimise toxicity and translate to better clinical outcomes.
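One facet of the dimensionality problem mentioned above can be shown with a small simulation. The sketch below illustrates distance concentration only: as the number of variables grows, the nearest and the farthest ‘patient’ become almost equally distant from a query point, which weakens any method that relies on similarity. The cohort size of 50, the uniformly random variables and the function name `distance_contrast` are assumptions for illustration and do not come from any study discussed here.

```python
import math
import random


def distance_contrast(n_points: int, n_dims: int, rng: random.Random) -> float:
    """Ratio of nearest to farthest Euclidean distance from one query point.

    Values near 0 mean neighbours are well separated; values near 1 mean
    every point is almost equally far away.
    """
    query = [rng.random() for _ in range(n_dims)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        distances.append(math.dist(query, point))
    return min(distances) / max(distances)


rng = random.Random(42)  # fixed seed so the run is repeatable
# 50 simulated patients measured on 2, 20 and 1000 random variables.
contrast_by_dim = {d: distance_contrast(50, d, rng) for d in (2, 20, 1000)}
for dims, ratio in contrast_by_dim.items():
    print(f"{dims:>5} variables: nearest/farthest distance = {ratio:.2f}")
```

The ratio climbs towards 1 as variables are added while the number of patients stays fixed, which is the situation typical of small multiomics cohorts.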

While supervised ML models have found the most use in the prediction of response to therapy and in precision medicine, some studies have also explored semi-supervised models in treatment decisions in RA.⁹,¹⁰ The choice between the two can be made depending on clinical utility, the availability and heterogeneity of baseline data, and the expected outcome. One study by Morid et al. demonstrated that semi-supervised models fared better than supervised models in predicting the group of patients that would eventually need step-up therapy in RA.¹¹

The different ML models used in precision medicine and the different outcome variables and the best model adopted for prediction of response to treatment are summarised in the tables below.

Rheumatoid Arthritis

Most evidence for the use of ML models in precision rheumatology stems from RA. Tumour necrosis factor-α (TNF-α) is the prime disease-driving cytokine in RA. Toll-like receptors (TLRs) have a role in the pathogenesis by inducing TNF-α, and their polymorphisms, by virtue of their influence on TNF production, serve as potential markers to predict response to anti-TNF and have been attractive targets to study in precision medicine.¹²⁻¹⁴ While studying TLR polymorphisms is a novel step in itself in precision medicine, integrating it with ML would further strengthen the predictive accuracy for remission.¹⁵ Polymorphisms of the RETN gene (coding for resistin) are another attractive target studied in the field of precision medicine in RA. ML models integrating RETN gene polymorphisms with the sex of the patient and other clinical factors created a robust model to predict remission in patients on anti-TNF, with the male sex favouring remission.¹⁶

Though the practice of rheumatology is guided by treatment guidelines issued by bodies such as the American College of Rheumatology (ACR), the European League Against Rheumatism (EULAR) and the Asia Pacific League Against Rheumatism (APLAR), to name a notable few, real-world practice may be far from ideal.¹⁷,¹⁸ The debunking of the one-size-fits-all notion through deeper research into precision medicine has led to trials of therapeutics that deviate from these ‘guidelines’. However, robust evidence may not support this, as randomised controlled trials take time to be formulated and most supportive data are from observational real-life cohorts. ML models have proven useful in such settings, for instance, in integrating variables and real-world data on tocilizumab monotherapy to build and validate a prediction model for remission with tocilizumab monotherapy in RA.¹⁹,²⁰

Most evidence on the implementation of ML in guiding therapeutics in RA has been with anti-TNF (summarised in Table 1). While most of the studies aimed at building models integrating clinical, real-world, and biochemical data for predictive accuracy, there were some that integrated multiomics, generating more robust prediction tools. A Swedish group started from gene expression (transcriptomic) variables that predicted response to anti-TNF and then integrated these with clinical data and with the proteome and the metabolome; the integrated model predicted response, or lack thereof, to anti-TNF with higher accuracy than the gene transcripts alone. The advantage of this model was its ability to predict unresponsiveness before initiating therapy, which would have a huge cost benefit and reduce wastage. Precision medicine had the upper hand here, with the models integrating transcriptomic data having a higher predictive accuracy than the ones integrating clinical data.²¹

Table 1.

Summary of Studies Exploring Machine-learning Models Predicting Response to Different Therapies and Outcome in Rheumatoid Arthritis.

Authors	Drug(s)	Machine-learning Models Used	Predictors of Response to Treatment	Outcome
Guan et al.²⁴	TNF inhibitors	Gaussian process regression model (GPR)	DAS28, SNPs, demographic data	GPR model was effective in predicting response to anti-TNF based on the predictor outcomes.
Tao et al.²⁵	Adalimumab and etanercept	Random forest algorithms	Methylation of DNA and/or gene expression pattern profiling on PBMCs, monocytes, CD4+ T cells	The study shows that ML models predicated on molecular signatures can precisely forecast response prior to Adalimumab and Etanercept treatment, opening the door for personalised anti-TNF therapy.
Bouget et al.²⁶	TNF inhibitors	Linear regression, random forest, XGBoost and CatBoost	Disease activity scores: EULAR response, DAS28. Lab measures: ALT, neutrophils and lymphocytes; Baseline clinical data such as age, weight, and history of smoking	By using data from clinical routines, the ML models enabled prediction of patient’s response to TNFi successfully.
Rehberg et al.²⁷	Sarilumab	GUIDE algorithm: ‘Generalised, Unbiased, Interaction Detection and Estimation’	ACR20, ACR50, and ACR70 at Week 24	With the use of ML in the study, a straightforward selection criterion, ACPA positive and CRP of more than 12.3 mg/L, was discovered to help identify individuals who would be more likely to respond clinically to sarilumab.
Koo et al.²⁸	ETN, ADA, GOL, IFX, TCZ, abatacept	LASSO, XGBoost, ridge, SVM, random forest	DAS28-ESR ≤ 2.6	Using the clinical features in the DAS28 score, the ML model could successfully predict remission with the bDMARDs.
Chen et al.²⁹	Anti-TNF	Stacked-Ensemble DRP	DAS28	The proposed ML approach showed potential in assisting therapeutic decisions by offering a comprehensive pipeline to predict activity and identify the patients with inadequate response to TNF inhibitors.
Lee Jin Lim et al.³⁰	Methotrexate	Neural network, random forest, SVM, elastic net and logistic regression models. Random forest model fared best.	56 potentially functional single nucleotide polymorphisms (pfSNPs) and five clinical and laboratory factors – age, early morning stiffness, number of children, Hb, platelet count	Variables that were identified in the study included 56 pfSNPs and five non-genetic factors which may help decision-making for treatment of RA patients with the aid of ML models.
Gosselt et al.³¹	Methotrexate	Logistic regression, LASSO, random forest, and extreme gradient boosting (XGBoost)	DAS28-ESR > 3.2	LR fared as well as other ML models in predicting inadequate response to MTX.
Miyoshi et al.³²	Infliximab	Multilayer perceptron algorithm (neural network)	9 clinical variables – ESR, tender joint count (28 joints), serum albumin, serum monocyte numbers, RBC count, prednisolone dose, methotrexate dose, HbA1c, biologics used before infliximab	The algorithm was fairly successful in predicting response to IFX with a sensitivity and specificity of 96.7% and 75%, respectively.
Kim et al.¹⁵	Anti-TNF	Multivariate logistic regression and elastic net	Polymorphisms in TLR4, TLR9	ML models established an association between TLR9 polymorphism (rs352139) and response to treatment in individuals with RA on TNF inhibitors.
Johansson et al.¹⁹	Tocilizumab	Logistic regression and random forest	Derivation of data from real-world registry (Corrona RA registry) and four RCTs – ACT-RAY, FUNCTION, ADACTA, AMBITION (CDAI, moderate disease activity, PGA)	ML models were able to derive a prediction model for the use of TCZ in RA integrating data from RCTs and the real world.
Prasad et al.³³	anti-TNF	ML based classifier ATRPred (‘anti-TNF treatment response predictor’)	DAS28	ML based classifier ATRPred predicted response to TNF inhibitors with a sensitivity of 75%, specificity of 86% and an accuracy of 81% in patients with RA.
Yoosuf et al.³⁴	Anti-TNF	Linear, non-linear and kernel-based models	Transcriptomic data (higher expression of gene EPPK1), clinical data and FACS data	ML models using data from transcriptomics predicted response to anti-TNF more accurately than models using clinical data.
Kim et al.¹⁶	Anti-TNF	Random forest, elastic net	Age, sex, being hypertensive, history of intake of SSZ, and, rs1862513, rs3219178, rs3219177, and rs3745369 SNPs	Elastic net algorithm fared better for predicting remission to anti-TNF in RA. Better response was observed in T-allele carriers of rs3219177 and males.
Duong et al.³⁵	Methotrexate	LASSO, random forest	DAS28-ESR	The variables that predicted the best response to treatment were DAS28-ESR ≤7.4, anti-CCP positivity, and HAQ ≤2. Of those who had a DAS28-ESR >3.2, an improvement of ≥1 from baseline to the 12th week predicted achievement of low disease activity or remission at 24 weeks.
Luque-Tévar et al.³⁶	Anti-TNF	Regularised logistic regressions	SJC, TJC, DAS28, CDAI, SDAI, HAQ	Using clinical and molecular profiles, machine learning models were able to identify novel signatures as predictors of response to anti-TNF therapy.
Myasoedova et al.³⁷	methotrexate	Random forests	Age, sex, history of smoking, RF, DAS28 at baseline and 160 SNPs	In individuals with early rheumatoid arthritis, the combination of pharmacogenomic biomarkers and DAS28 at baseline more accurately predicted the response to MTX using EULAR criteria at three months, compared to relying solely on demographics and DAS28.
Kalweit et al.³⁸	Targeted synthetic DMARDs and biological DMARDs	Deep learning	DAS28-ESR	Among patients with RA, those who frequently used conventional DMARDs, males and those with lower disease activity exhibited better responses to TCZ compared with ADA. Conversely, seronegative women who did not use prednisone during advanced RA treatment initiation, as well as seropositive women with higher disease activity and longer disease duration, faced a higher risk of non-response when treated with GOL as opposed to ADA.
Vodencarevic et al.³⁹	bDMARDs	Logistic regression, k-nearest neighbours, naïve Bayes classifier and random forests	Best predictor of flare: dose percentage change. Next best-performing variables: DAS28-ESR, ESR, duration of disease, CRP, duration of remission at recruitment.	Integrating ML models and quality data from RCTs could predict individual flare in individuals with RA who were in remission, especially while tapering bDMARDs.

Note: ACR, American College of Rheumatology; ACPA, anti-citrullinated peptide antibody; ADA, adalimumab; ALT, alanine aminotransferase; AST, aspartate aminotransferase; CCP, cyclic citrullinated peptide; CDAI, Clinical disease activity index; CRP, C-reactive protein; DAS28, disease activity score-28; csDMARD, conventional synthetic disease-modifying antirheumatic drug; tsDMARD, targeted synthetic disease-modifying antirheumatic drug; ETN, etanercept; EULAR, European League Against Rheumatism; ESR, erythrocyte sedimentation rate; FACS, fluorescence activated cell sorting; GOL, golimumab; HAQ, Health assessment questionnaire; IFX, infliximab; LASSO, least absolute shrinkage and selection operator; LR, logistic regression; MTX, methotrexate; NET, neutrophil extracellular trap; PBMC, peripheral blood mononuclear cells; PGA, physician global assessment; RBC, red blood cell; RCT, randomised controlled trial; RF, Rheumatoid factor; SDAI, Simple disease activity index; SJC, swollen joint count; SNP, single nucleotide polymorphisms; SVM, support vector machine; TCZ, tocilizumab; TJC, Tender joint count; TNF, tumour necrosis factor.
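Several studies in Table 1 report sensitivity, specificity and accuracy. As a reminder of how these three figures relate, the following minimal sketch computes them from the four cells of a confusion matrix. The class and function names are hypothetical, and the counts are invented purely to illustrate the arithmetic; they are not taken from any study in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Predicted versus observed treatment response (illustrative labels)."""
    true_pos: int   # predicted responder, did respond
    false_neg: int  # predicted non-responder, did respond
    true_neg: int   # predicted non-responder, did not respond
    false_pos: int  # predicted responder, did not respond


def sensitivity(c: ConfusionCounts) -> float:
    """Share of observed responders that the model identified."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: ConfusionCounts) -> float:
    """Share of observed non-responders that the model identified."""
    return c.true_neg / (c.true_neg + c.false_pos)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients classified correctly."""
    total = c.true_pos + c.false_neg + c.true_neg + c.false_pos
    return (c.true_pos + c.true_neg) / total


# Invented counts for illustration only; not data from any cited study.
counts = ConfusionCounts(true_pos=40, false_neg=10, true_neg=35, false_pos=15)
print(f"sensitivity = {sensitivity(counts):.2f}")  # 40 / 50
print(f"specificity = {specificity(counts):.2f}")  # 35 / 50
print(f"accuracy    = {accuracy(counts):.2f}")     # 75 / 100
```

Accuracy is a weighted average of sensitivity and specificity, with the weights set by the proportion of responders in the cohort, so it is best read alongside the other two figures rather than on its own.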

Beyond integrating genetic and multiomic data to predict therapeutic response, ML models have also been used to integrate and consolidate data from various randomised controlled trials (RCTs) where a meta-analysis is not available. Though methotrexate is the most used drug in RA, the response to therapy may not be homogeneous. ML models can integrate and consolidate easily available data such as routine clinical information, rheumatoid factor, anti-cyclic citrullinated peptide antibody (anti-CCP), disease activity scores and health assessment scores, which can greatly aid the treating physician in clinical decision-making at baseline regarding methotrexate monotherapy.²²

However, the limitation of most current studies is their small sample size, and the models need to be validated in larger cohorts before they are implemented in real-world practice.

Beyond monitoring response to treatment and predictions, ML models have also been used to predict commonly associated complications with RA like osteoporosis, which is more common with late-onset RA.²³

Spondyloarthritis

Though most evidence for ML in precision medicine is derived from RA, there is some emerging evidence in the spondyloarthritis spectrum of disorders, mainly psoriatic arthritis (PsA). In contrast to studies in RA, these studies mostly consolidated clinical data to devise models for prediction, and the genetic basis for response to treatment is still left unexplored. Compared to RA, the disease drivers and the pathogenetic factors in PsA are more multifaceted, and targeting a single cytokine or gene editing may not result in robust disease control. Precision medicine and personalised medicine have a greater role to play in such diseases, where determining the exact pathway that is dominant has great translational relevance. ML also has an advantage in such scenarios, as the pathogenesis is a complex interplay of multiple pathways and networks of cytokines.

Evidence in PsA is mostly limited to secukinumab: one group explored the factors predicting remission with the drug, while another tried to determine which patients would respond to a starting dose of 150 mg versus 300 mg, as this is a common dilemma encountered in the clinic.⁴⁰ While most of these decisions are left to the discretion of the treating physician, employment of ML models provides evidence with high predictive accuracy. PsA is a disease with heterogeneous presentations, ranging from a predominantly cutaneous psoriatic phenotype to peripheral deforming arthritis or predominantly axial disease.⁴¹ Owing to the heterogeneity in presentation, the first choice of DMARD may not always be the right one. Additionally, the options for biological therapy include both anti-TNF and anti-IL-17 agents, and predicting the response to a particular drug at baseline is not foolproof. ML models can greatly aid in this treatment decision and save time and finances by avoiding ‘trial-and-error’. In patients with inadequate response to anti-TNF, those with early PsA and enthesitis at baseline were predicted to achieve remission with secukinumab, and 300 mg fared better than 150 mg in those treated without concomitant methotrexate, and with psoriasis (PsO).⁴²

One piece of conflicting evidence in this regard comes from a study in ankylosing spondylitis (AS), where Lee et al. found no benefit of ML models over a traditional logistic regression (LR) model in predicting response to bDMARDs. In the same study, a random forest model outperformed LR in RA patients, but no ML model fared better than LR in AS.⁴³ The same group subsequently presented an artificial neural network model that integrated demographic and lab data to predict which patients with AS would require TNFi within six months of diagnosis.⁴⁴

The studies exploring ML models in therapeutic response in SpA are summarised in Table 2.

Table 2.

Machine-learning Models Predicting the Response to Treatment and Outcome in Spondyloarthritis.

Authors	Disease	Drug(s)	ML Models Used	Predictors of Response to Treatment	Outcome
Lee et al.⁴³	Ankylosing spondylitis	bDMARDs	Random Forest, XGBoost, ANN, SVM, logistic regression (conventional statistics)	For RA: PtGA, RAPID3, SJC; For AS: BASFI, BASDAI	In RA, random forest model performed better in predicting response to bDMARDs, compared to conventional statistics like logistic regression. However, in AS, there was no difference.
Lee et al.⁴⁴	Ankylosing spondylitis	Anti-TNF	ANN, XGBoost, random forest. Logistic regression (conventional statistics) for comparison.	ESR, CRP were the most important variables to distinguish the patients who used anti-TNF early.	Only the ANN model fared better than conventional statistics and LR.
Gotlieb et al.⁴²	Psoriatic arthritis	Secukinumab	Bayesian elastic net	Enthesitis at baseline, DAS28-CRP, baseline CRP levels, baseline BSA with psoriasis (≥3%, <10%, and ≥10%), previous usage of anti-TNF and concomitant usage of MTX	SEC 300 mg was more effective than 150 mg in D2T active PsA.
Venerito et al.⁴⁰	Psoriatic Arthritis	Secukinumab	XGBoost, LR	DAPSA, LEI and comorbidities	While a ML approach can potentially identify responders to SEC, those with a higher burden of disease, with axial symptoms and concomitant FM were D2T.
Jia et al.⁴⁵	AOSD	Glucocorticoids	SVM	Circulating NETs	Signature of circulating neutrophil extracellular traps provides additional value in monitoring these patients, including prediction of low dose steroid-unresponsive disease.

Note: ACR, American College of Rheumatology; ANN, artificial neural network; AOSD, Adult-onset Still’s disease; BASDAI, Bath ankylosing spondylitis disease activity index; BASFI, Bath ankylosing spondylitis functional index; bDMARDs, Biological disease-modifying antirheumatic drugs; BSA, Body surface area; CRP, C-reactive protein; D2T, difficult-to-treat; DAPSA, Disease activity in psoriatic arthritis; ESR, erythrocyte sedimentation rate; FM, fibromyalgia; IR, inadequate response; LEI, Leeds enthesitis index; MTX, methotrexate; NET, neutrophil extracellular traps; PtGA, Patient global assessment; RAPID3, Routine assessment of patient index data 3; SEC, secukinumab; SJC, swollen joint count; SVM, Support Vector Machine; TNF, Tumour necrosis factor; XGBoost, extreme gradient boosting.

Connective Tissue Diseases

A novel insight into precision medicine and ML in systemic sclerosis (SSc) was provided by Mehta et al., who hypothesised that patients with early diffuse SSc and an inflammatory phenotype would have the best response to abatacept and that this would depend on the CD28 reactome.⁴⁶ By integrating data from the molecular signature patterns of the skin, that is, the inflammatory pattern and the CD28 pathway, which is the one directly affected by abatacept, it was demonstrated that patients with early disease and an inflammatory phenotype had the best cutaneous response and improvement in the modified Rodnan skin score (mRSS) with abatacept.

Researchers from Japan, in their landmark DesiReS trial, demonstrated the benefit of rituximab (RTX) on skin fibrosis in systemic sclerosis.⁴⁷ However, the therapeutic armamentarium for SSc-related skin fibrosis is wide, and choosing the right agent avoids the hassle of failures and adverse events. A causal tree ML model was implemented in a post-hoc analysis of DesiReS and aided in predicting the set of patients who would have the best response to RTX, by combining clinical and immunological markers to find the optimal predictors⁴⁸ (Table 3). Clinical decisions guided by these factors can greatly influence therapeutics, avoid polypharmacy and unnecessary trials of multiple immunosuppressants, and potentially prevent adverse events and infections.

Table 3.

Machine-learning Models Predicting Response to Treatment and Outcome in Other Connective Tissue Diseases and Vasculitis.

Authors	Disease	Drug(s)	ML Models Used	Predictors of Response to Treatment	Outcome
Ebata et al.⁵⁰	Systemic sclerosis	Rituximab	Causal tree	Best: mRSS ≥17, CD19 ≥57/μL; 2nd best: mRSS <17, CD19 ≥57/μL, Serum SP-D≥151/ng/μl	Patients with mRSS ≥17, CD19 ≥57/μL are predicted to have best cutaneous response to RTX.
Mehta et al.⁴⁶	Early diffuse cutaneous systemic sclerosis	Abatacept	Support vector machine	The inflammatory phenotype among the intrinsic gene expression subsets from skin biopsy and CD28 reactome	Improvement in mRSS with abatacept correlated with baseline expression of Co-stimulation of CD28 pathway
Ayoub et al.⁴⁹	Lupus nephritis	Abatacept, Rituximab	LR, CART, Random Forest, SVM with linear, polynomial, radial basis kernels (SVML, SVMP, and SVMR, respectively)	Best-performing model was the SVM with linear kernel containing age, race, eGFR and urine PCR at baseline and the urinary biomarkers – CXCL8, pentraxin, MCSF, and adiponectin	Integrating novel biomarkers and traditional clinical data can predict response to therapy at one year in lupus nephritis.
Danieli et al.⁵¹	Idiopathic inflammatory myopathy	IVIg and 20% SCIg	LASSO, Random Forest, Ridge, Elastic Net, Classification and Regression Trees	Serum creatine kinase, MMT8 to assess power, MITAX for disease activity, HAQ-DI for disability	Elastic Net was among the best-performing ML models to predict outcomes in myositis, both disease activity and disability.
Wang et al.⁵²	Kawasaki disease (KD)	IVIG	Logistic regression, decision tree, random forest, Adaboost, GBM, lightGBM	Top predictors: platelet count, blood calcium, A/G ratio, body weight, duration of the febrile episode before admission, total bilirubin, cholesterol	GBM was the best-performing model that could predict the refractoriness to IVIG in KD. Can potentially find use in analysing EHR data to aid future clinical decision-making

Note: A/G, albumin:globulin ratio; CART, classification and regression trees; EHR, electronic health record; eGFR, estimated glomerular filtration rate; GBM, gradient boosting machine; HAQ-DI, health assessment questionnaire-disability index; IVIG/IVIg, intravenous immunoglobulin; LASSO, least absolute shrinkage and selection operator; LR, logistic regression; mRSS, modified Rodnan skin score; MMT, manual muscle test; MITAX, myositis intention to treat index; PCR, protein creatinine ratio; RTX, rituximab; SCIg, subcutaneous immunoglobulin; SP-D, surfactant protein-D; SVM, support vector machine.

Application of ML in precision medicine in lupus has been surprisingly scarce. We could find one study that used data from existing cohorts to train, and a longitudinal cohort of SLE flares to validate, a model integrating novel biomarkers and clinical features to predict response to therapy, mostly abatacept and rituximab (summarised in Table 3).⁴⁹ However, a possible pitfall of implementing ML models in lupus is the sheer number of markers, in routine clinical use or in research, that are available to monitor disease or treatment. It is imperative to check combinations of different markers to find the best-performing model for predictive accuracy, and these associations may not be known or easily predictable in retrospect.

The other models used in isolated disease scenarios are summarised in Table 3.

Limitation of Current Evidence

Though there are many advantages to implementing ML in health care and precision medicine, such as robust analysis of big data, better predictive accuracy with a relatively smaller sample size, and identification of novel biomarkers for prediction and prognosis, these models are not without their limitations. Most of the cohorts in which these models have been devised are small, and the models need to be validated in larger, preferably population-based studies for more robust evidence. While individual studies claim different biomarkers as predictors of response, there is a clear lack of generalizability, which may lead to spurious claims of discovery of novel biomarkers.⁷
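The risk of spurious ‘novel’ biomarkers in small cohorts can be illustrated with a simulation. In the minimal sketch below, the outcome and all 2,000 candidate markers are pure random noise, so no marker is truly predictive. The best-looking marker is picked in a discovery cohort of 20 patients and then re-checked in a separate validation cohort of 200. The cohort sizes, the number of markers and the function names are assumptions chosen for illustration and do not correspond to any study reviewed here.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_discovery=20, n_validation=200, n_markers=2000, seed=0):
    """Select the top noise marker in a small cohort, then re-test it."""
    rng = random.Random(seed)
    n_total = n_discovery + n_validation
    # Outcome and every marker are independent noise by construction.
    outcome = [rng.gauss(0, 1) for _ in range(n_total)]
    markers = [[rng.gauss(0, 1) for _ in range(n_total)]
               for _ in range(n_markers)]

    disc_y, val_y = outcome[:n_discovery], outcome[n_discovery:]
    # "Discover" the marker most strongly correlated in the small cohort.
    best = max(markers, key=lambda m: abs(pearson_r(m[:n_discovery], disc_y)))
    r_discovery = pearson_r(best[:n_discovery], disc_y)
    r_validation = pearson_r(best[n_discovery:], val_y)
    return r_discovery, r_validation


r_disc, r_val = simulate()
print(f"discovery cohort  |r| = {abs(r_disc):.2f}")
print(f"validation cohort |r| = {abs(r_val):.2f}")
```

With many more candidate markers than patients, a strong correlation in the discovery cohort is expected by chance alone, and it shrinks towards zero in the independent cohort. This is the pattern that validation in larger, independent studies is meant to expose.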

Moreover, there are limited data on the precision of these models and on whether the findings of one model can be replicated in other independent models. Once replicated, the models need validation in real-life scenarios to show that the utilisation of ML leads to better patient outcomes. However, this can be done only once the models are established on high-quality, longitudinal data.

While countries such as India have high patient loads, the health-care worker-to-patient ratios are not conducive to the collection of comprehensive, high-quality data.⁵³

Conclusion

Attaining the goals of precision medicine will be difficult without the application of ML. Understanding the ‘systems biology’ of each disease and identifying the right set of biomarkers and clinical variables to predict the course and response can potentially save time, reduce damage accrual by bypassing trials of conventional treatment that may not suit the particular patient, offer cost benefits and eventually translate to better outcomes. However, issues with lack of generalizability exist, leaving researchers and clinicians to ponder whether every novel biomarker or predictor is to be taken at face value.

The current challenge lies in collecting quality longitudinal data and applying robust ML models that can be replicated and validated in population-based studies, beyond small, isolated cohorts.

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Ethical Approval

Not applicable.

Funding

The authors received no financial support for the research, authorship and/or publication of this article.

Informed Consent

Not applicable.

References

1. Ahmed, Gupta. A clinical aid to precision medicine. Indian J Rheumatol. 2019;14(2):98. doi: 10.4103/injr.injr_66_19

2. Ahmed, Gasparyan, Zimba. Comorbidities in rheumatic diseases need special consideration during the COVID-19 pandemic. Rheumatol Int. 2021;41(2):243–256. doi: 10.1007/s00296-020-04764-5

3. Wampler Muskardin TL, Paredes, Appenzeller, Niewold. Lessons from precision medicine in rheumatology. Mult Scler. 2020;26(5):533–539. doi: 10.1177/1352458519884249

4. Sarker. Machine learning: algorithms, real-world applications and research directions. SN Comput Sci. 2021;2(3):160. doi: 10.1007/s42979-021-00592-x

5. Azzolina, Baldi, Barbati, et al. Machine learning in clinical and epidemiological research: isn’t it time for biostatisticians to work on it? Epidemiol Biostat Public Health. 2022;16(4). doi: 10.2427/13245

6. Gasparyan, Ayvazyan, Blackmore, Kitas. Writing a narrative biomedical review: considerations for authors, peer reviewers, and editors. Rheumatol Int. 2011;31(11):1409–1417. doi: 10.1007/s00296-011-1999-3

7. Plant, Barton. Machine learning in precision medicine: lessons to learn. Nat Rev Rheumatol. 2021;17(1):5–6. doi: 10.1038/s41584-020-00538-2

8. Guyon, Elisseeff. An Introduction to Variable and Feature Selection.

9. Morid, Lau, Del Fiol. Predictive analytics for step-up therapy: supervised or semi-supervised learning? J Biomed Inform. 2021;119:103842. doi: 10.1016/j.jbi.2021.103842

10. Gheita, Hammam. Machine learning in rheumatology: the emerging cutting-edge strategy. Int J Clin Rheumatol. 2023;18(4):72–74. doi: 10.37532/1758-4272.2023

11. Morid, Lau, Del Fiol. Predictive analytics for step-up therapy: supervised or semi-supervised learning? J Biomed Inform. 2021;119:103842. doi: 10.1016/j.jbi.2021.103842

12. Huang, Pope. The role of toll-like receptors in rheumatoid arthritis. Curr Rheumatol Rep. 2009;11(5):357–364. doi: 10.1007/s11926-009-0051-z

13. Goh, Midwood. Intrinsic danger: activation of toll-like receptors in rheumatoid arthritis. Rheumatology. 2012;51(1):7–23. doi: 10.1093/rheumatology/ker257

14. Falvo, Tsytsykova, Goldfeld. Transcriptional control of the TNF gene. In: Kollias, Sfikakis, eds. Current Directions in Autoimmunity. Vol 11. Karger; 2010:27–60. doi: 10.1159/000289196

15. Kim, Kim, Oh, et al. Association of TLR 9 gene polymorphisms with remission in patients with rheumatoid arthritis receiving TNF-α inhibitors and development of machine learning models. Sci Rep. 2021;11(1):20169. doi: 10.1038/s41598-021-99625-x

16. Kim, Jin Oh S, Thi Trinh N, et al. Effects of RETN polymorphisms on treatment response in rheumatoid arthritis patients receiving TNF-α inhibitors and utilization of machine-learning algorithms. Int Immunopharmacol. 2022;111:109094. doi: 10.1016/j.intimp.2022.109094

17. Lau, Chia, Dans, et al. 2018 update of the APLAR recommendations for treatment of rheumatoid arthritis. Int J Rheum Dis. 2019;22(3):357–375. doi: 10.1111/1756-185X.13513

18. Smolen, Landewé RBM, Bergstra, et al. EULAR recommendations for the management of rheumatoid arthritis with synthetic and biological disease-modifying antirheumatic drugs: 2022 update. Ann Rheum Dis. 2023;82(1):3–18. doi: 10.1136/ard-2022-223356

19. Johansson, Collins, Yau, et al. Predicting response to tocilizumab monotherapy in rheumatoid arthritis: a real-world data analysis using machine learning. J Rheumatol. 2021;48(9):1364–1370. doi: 10.3899/jrheum.201626

20. Collins, Johansson, Gale, et al. Predicting remission among patients with rheumatoid arthritis starting tocilizumab monotherapy: model derivation and remission score development. ACR Open Rheumatol. 2020;2(2):65–73. doi: 10.1002/acr2.11101

21. Yoosuf, Maciejewski, Ziemek, et al. Early prediction of clinical response to anti-TNF treatment using multi-omics and machine learning in rheumatoid arthritis. Rheumatology. 2022;61(4):1680–1689. doi: 10.1093/rheumatology/keab521

22. Duong, Crowson, Athreya, et al. Clinical predictors of response to methotrexate in patients with rheumatoid arthritis: a machine learning approach using clinical trial data. Arthritis Res Ther. 2022;24(1):162. doi: 10.1186/s13075-022-02851-5

23. Chen, Huang, Chen. Development and validation of machine learning models for prediction of fracture risk in patients with elderly-onset rheumatoid arthritis. Int J Gen Med. 2022;15:7817–7829. doi: 10.2147/IJGM.S380197

24. Guan, Zhang, Quang, et al. Machine learning to predict anti-tumor necrosis factor drug responses of rheumatoid arthritis patients by integrating clinical and genetic markers. Arthritis Rheumatol. 2019;71(12):1987–1996. doi: 10.1002/art.41056

25. Tao, Concepcion, Vianen, et al. Multiomics and machine learning accurately predict clinical response to adalimumab and etanercept therapy in patients with rheumatoid arthritis. Arthritis Rheumatol. 2021;73(2):212–222. doi: 10.1002/art.41516

26. Bouget, Duquesne, Hassler, et al. Machine learning predicts response to TNF inhibitors in rheumatoid arthritis: results on the ESPOIR and ABIRISK cohorts. RMD Open. 2022;8(2):e002442. doi: 10.1136/rmdopen-2022-002442

27. Rehberg, Giegerich, Praestgaard, et al. Identification of a rule to predict response to sarilumab in patients with rheumatoid arthritis using machine learning and clinical trial data. Rheumatol Ther. 2021;8(4):1661–1675. doi: 10.1007/s40744-021-00361-5

28. Koo, Eun, Shin, et al. Machine learning model for identifying important clinical features for predicting remission in patients with rheumatoid arthritis treated with biologics. Arthritis Res Ther. 2021;23(1):178. doi: 10.1186/s13075-021-02567-y

29. Chen, Gupta, Galbraith, Shah, Cirrone. Prediction of drug effectiveness in rheumatoid arthritis patients based on machine learning algorithms. In: Proceedings of the 2022 9th International Conference on Biomedical and Bioinformatics Engineering. ICBBE ’22. Association for Computing Machinery; 2023:147–154. doi: 10.1145/3574198.3574221

30. Lim, Lim AJW, Ooi BNS, et al. Machine learning using genetic and clinical data identifies a signature that robustly predicts methotrexate response in rheumatoid arthritis. Rheumatology (Oxford). 2022;61(10):4175–4186. doi: 10.1093/rheumatology/keac032

31. Gosselt, Verhoeven MMA, Bulatović-Ćalasan, et al. Complex machine-learning algorithms and multivariable logistic regression on par in the prediction of insufficient clinical response to methotrexate in rheumatoid arthritis. J Pers Med. 2021;11(1):44. doi: 10.3390/jpm11010044

32. Miyoshi, Honne, Minota, Okada, Ogawa, Mimura. A novel method predicting clinical response using only background clinical data in RA patients before treatment with infliximab. Mod Rheumatol. 2016;26(6):813–816. doi: 10.3109/14397595.2016.1168536

33. Prasad, McGeough, Eakin, et al. ATRPred: A machine learning based tool for clinical decision-making of anti-TNF treatment in rheumatoid arthritis patients. PLoS Comput Biol. 2022;18(7):e1010204. doi: 10.1371/journal.pcbi.1010204

34. Yoosuf, Maciejewski, Ziemek, et al. Early prediction of clinical response to anti-TNF treatment using multi-omics and machine learning in rheumatoid arthritis. Rheumatology. 2022;61(4):1680–1689. doi: 10.1093/rheumatology/keab521

35. Duong, Crowson, Athreya

36. Luque-Tévar, Perez-Sanchez, Patiño-Trives, et al. Integrative clinical, molecular, and computational analysis identify novel biomarkers and differential profiles of anti-TNF response in rheumatoid arthritis. Front Immunol. 2021;12:631662. doi: 10.3389/fimmu.2021.631662

37. Myasoedova, Athreya, Crowson, et al. Toward individualized prediction of response to methotrexate in early rheumatoid arthritis: a pharmacogenomics-driven machine learning approach. Arthritis Care Res. 2022;74(6):879–888. doi: 10.1002/acr.24834

38. Kalweit, Burden, Boedecker, Hügle, Burkard. Patient groups in rheumatoid arthritis identified by deep learning respond differently to biologic or targeted synthetic DMARDs. Wilson J, ed. PLoS Comput Biol. 2023;19(6):e1011073. doi: 10.1371/journal.pcbi.1011073

39. Vodencarevic, Tascilar, et al.; on behalf of the RETRO study group. Advanced machine learning for predicting individual risk of flares in rheumatoid arthritis patients tapering biologic drugs. Arthritis Res Ther. 2021;23(1):67. doi: 10.1186/s13075-021-02439-5

40. Venerito, Lopalco, Abbruzzese, et al. A machine learning approach to predict remission in patients with psoriatic arthritis on treatment with secukinumab. Front Immunol. 2022;13:917939. doi: 10.3389/fimmu.2022.917939

41. Coates, Helliwell. Psoriatic arthritis: state of the art review. Clin Med. 2017;17(1):65–70. doi: 10.7861/clinmedicine.17-1-65

42. Gottlieb, Mease, Kirkham, et al. Secukinumab efficacy in psoriatic arthritis: machine learning and meta-analysis of four phase 3 trials. J Clin Rheumatol. 2021;27(6):239–247. doi: 10.1097/RHU.0000000000001302

43. Lee, Kang, Eun, et al. Machine learning-based prediction model for responses of bDMARDs in patients with rheumatoid arthritis and ankylosing spondylitis. Arthritis Res Ther. 2021;23(1):254. doi: 10.1186/s13075-021-02635-3

44. Lee, Eun, Kim, Cha, Koh, Lee. Machine learning to predict early TNF inhibitor users in patients with ankylosing spondylitis. Sci Rep. 2020;10(1):20299. doi: 10.1038/s41598-020-75352-7

45. Jia, Wang, Ma, et al. Circulating neutrophil extracellular traps signature for identifying organ involvement and response to glucocorticoid in adult-onset Still’s disease: a machine learning study. Front Immunol. 2020;11:563335. doi: 10.3389/fimmu.2020.563335

46. Mehta, Espinoza, Franks, et al. Machine-learning classification identifies patients with early systemic sclerosis as abatacept responders via CD28 pathway modulation. JCI Insight. 2022;7(24):e155282. doi: 10.1172/jci.insight.155282

47. Ebata, Yoshizaki, Oba, et al. Safety and efficacy of rituximab in systemic sclerosis (DESIRES): a double-blind, investigator-initiated, randomised, placebo-controlled trial. Lancet Rheumatol. 2021;3(7):e489–e497. doi: 10.1016/S2665-9913(21)00107-7

48. Ebata, Oba, Kashiwabara, et al. Predictors of rituximab effect on modified Rodnan skin score in systemic sclerosis: a machine-learning analysis of the DesiReS trial. Rheumatology. 2022;61(11):4364–4373. doi: 10.1093/rheumatology/keac023

49. Ayoub, Wolf, Geng, et al. Prediction models of treatment response in lupus nephritis. Kidney Int. 2022;101(2):379–389. doi: 10.1016/j.kint.2021.11.014

50. Ebata, Oba, Kashiwabara, et al. Predictors of rituximab effect on modified Rodnan skin score in systemic sclerosis: a machine-learning analysis of the DesiReS trial. Rheumatology (Oxford). 2022;61(11):4364–4373. doi: 10.1093/rheumatology/keac023

51. Danieli, Tonacci, Paladini, et al. A machine learning analysis to predict the response to intravenous and subcutaneous immunoglobulin in inflammatory myopathies. A proposal for a future multi-omics approach in autoimmune diseases. Autoimmun Rev. 2022;21(6):103105. doi: 10.1016/j.autrev.2022.103105

52. Wang, Liu, Lin. A machine learning approach to predict intravenous immunoglobulin resistance in Kawasaki disease patients: a study based on a Southeast China population. PLOS ONE. 2020;15(8):e0237321. doi: 10.1371/journal.pone.0237321

53. Misra, Agarwal, Negi. Rheumatology in India: a bird’s eye view on organization, epidemiology, training programs and publications. J Korean Med Sci. 2016;31(7):1013–1019. doi: 10.3346/jkms.2016.31.7.1013

Current Status of Machine Learning for Precision Rheumatology: Are We There Yet?
