Abstract
Objectives:
Differences in demographic factors, symptoms, and laboratory data between bacterial and non-bacterial arthritis have not been defined. We aimed to identify predictors of bacterial arthritis, excluding synovial testing.
Methods:
This retrospective cross-sectional survey was performed at a university hospital. All patients included received arthrocentesis from January 1, 2010, to December 31, 2020. Clinical information was gathered from medical charts from the time of synovial fluid sample collection. Factors potentially predictive of bacterial arthritis were analyzed using the Student’s t-test or chi-squared test, and the chi-squared automatic interaction detector decision tree analysis. The resulting subgroups were divided into three groups according to the risk of bacterial arthritis: low-risk, intermediate-risk, or high-risk groups.
Results:
A total of 460 patients (male/female = 229/231; mean ± standard deviation age, 70.26 ± 17.66 years) were included, of whom 68 patients (14.8%) had bacterial arthritis. The chi-squared automatic interaction detector decision tree analysis revealed that patients with C-reactive protein > 21.09 mg/dL (incidence of septic arthritis: 48.7%) and C-reactive protein ⩽ 21.09 mg/dL plus 27.70 < platelet count ⩽ 30.70 × 10⁴/μL (incidence: 36.1%) were high-risk groups.
Conclusions:
Our results suggest that patients at high risk of bacterial arthritis can be identified without synovial testing, so that appropriate treatment can be initiated as soon as possible.
Keywords
Introduction
Arthritis is a common chief complaint encountered by clinicians. The differential diagnoses of arthritis include infection, crystal-induced disease, systemic diseases, osteoarthritis, trauma, and a variety of other conditions. Bacterial arthritis has a high mortality rate and requires prompt initiation of antibiotic therapy. Several studies of bacterial arthritis, including studies of the mortality rate, have been reported.1–6 One study reported that 37% of patients with arthritis required surgical intervention and that the mortality was 11%. An increased white blood cell count at presentation (p < 0.02) and the development of abnormal renal function (p < 0.015) were predictors of poor prognosis. 1 Another study reported an 11.5% mortality and a 31.6% morbidity in patients with bacterial arthritis. Multivariate analysis suggested that the important predictors of death were confusion at presentation, age ⩾65 years, multiple joint sepsis, and involvement of the elbow joint, and that the predictors of morbidity were age ⩾65 years, diabetes mellitus, open surgical drainage, and gram-positive infections other than Staphylococcus aureus. 2 In addition, clinicians should promptly consider the likelihood of bacterial arthritis when acute arthritis is suspected. Some studies have shown the diagnostic value of the history and physical examination for distinguishing bacterial arthritis from other diseases. For example, one study revealed that factors such as age older than 80 years (odds ratio (OR) = 3.5, 95% confidence interval (95% CI) = 1.8–7.0), diabetes mellitus (OR = 2.7, 95% CI = 1.0–6.9), rheumatoid arthritis (OR = 2.5, 95% CI = 2.0–3.1), recent joint surgery (OR = 6.9, 95% CI = 3.8–12.0), hip or knee prosthesis (OR = 3.1, 95% CI = 2.0–4.9), and skin infection (OR = 2.8, 95% CI = 1.7–4.5) increase the probability of bacterial arthritis. 7 Another study reported that joint pain, a history of joint swelling, and fever occur in more than 50% of patients with bacterial arthritis; however, no studies have revealed the specificity of these symptoms.8–12 Similarly, some studies have shown that peripheral white blood cell (WBC) count, erythrocyte sedimentation rate, and C-reactive protein (CRP) had high sensitivity but very poor specificity for bacterial arthritis.13,14 Thus, a diagnosis of bacterial arthritis is difficult to make based only on physical examination and blood test data. Therefore, when bacterial arthritis is suspected, arthrocentesis is generally performed, the synovial fluid is examined by laboratory tests and culture, and antimicrobial agents are administered as soon as possible. Variables other than synovial laboratory tests and cultures, such as demographic factors, underlying conditions, symptoms, vital signs, and laboratory data, are thought to be clinically significant and useful for physicians. Based on these variables, physicians could promptly start the appropriate treatment. Furthermore, to our knowledge, no studies have defined predictive factors of bacterial arthritis among demographic factors, symptoms, and laboratory data using a chi-squared automatic interaction detector (CHAID) decision tree analysis model. Using the results of CHAID, physicians could predict bacterial arthritis, and appropriate treatment could be initiated as soon as possible.
We compared clinical parameters using CHAID, with the goal of determining which patients had bacterial arthritis.
Patients and methods
Study design and study population
All methods were performed in accordance with the relevant guidelines and regulations. This study was approved by the Ethics Committee of Juntendo University Nerima Hospital, Tokyo, Japan (approval number: 2020067), and the requirement for informed consent was waived by the ethics committee. All patients who underwent joint puncture between January 1, 2010, and December 31, 2020, were included. Patients who declined participation through the opt-out notice issued by the Ethics Committee of Juntendo University Nerima Hospital were excluded. All extracted data were used, and the missing values are described in Table 1. This retrospective, cross-sectional study was performed at Juntendo University Nerima Hospital (a 490-bed, university-affiliated hospital), Tokyo, Japan. The primary outcome was the diagnosis of bacterial arthritis. Data from patients diagnosed with bacterial arthritis were retrospectively collected from the clinical laboratory database. The diagnosis of bacterial arthritis was based on synovial fluid and culture results, and at least two physicians were involved in the diagnosis.
Patient characteristics.
NA: not analyzed; BMI: body mass index; CI: confidence interval; SD: standard deviation.
p < 0.05.
Furthermore, joint puncture was not performed on the judgment of the physicians alone; at least one orthopedist conducted each puncture. Synovial fluid and culture results were extracted by chart review, along with other clinical information available at the time of the arthrocentesis. In addition, blood test data were extracted when physicians had ordered blood tests at the time of joint puncture.
Statistical analysis
The following clinical information was extracted by chart review (the items were chosen on the basis of previous reports15,16): age, female sex, body mass index, underlying conditions (cancer, hemodyscrasia, diabetes mellitus, human immunodeficiency syndrome, and use of immunosuppressive agents), and symptoms (chills, acute joint pain, and multiple joint pain). We also extracted vital signs (disturbance of consciousness, axillary body temperature, systolic and diastolic blood pressure, heart rate, respiratory rate, and oxygen saturation), white blood cell count with percentage of neutrophils, hemoglobin, platelet count, blood urea nitrogen (BUN), creatinine, albumin, total bilirubin, lactate dehydrogenase, aspartate aminotransferase, alanine aminotransferase, sodium, potassium, chloride, glucose, hemoglobin A1c, and CRP from the medical charts. Bivariate comparison of each variable between patients with bacterial arthritis and those with non-bacterial arthritis was performed using the independent t-test for continuous data or the chi-squared test for categorical data. Differences were considered significant when the p-value was below 0.05. The variables were then subjected to chi-squared automatic interaction detector (CHAID) decision tree analysis to identify combinations of risk factors associated with bacterial arthritis. The dependent variable in the CHAID analysis was bacterial versus non-bacterial arthritis.
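As an illustration of the bivariate comparison for a categorical variable, the following minimal sketch computes a Pearson chi-squared test for a 2 × 2 table using only the Python standard library. The analyses in this study were run in SPSS; the function name `chi_squared_2x2` and the cell counts are hypothetical and are not taken from Table 1 (only the group sizes of 68 and 392 come from the study).

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        # A margin of zero means the test is undefined; report no evidence.
        return 0.0, 1.0
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > stat) equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts for illustration only:
# rows = bacterial (n = 68) / non-bacterial (n = 392),
# columns = risk factor present / absent.
stat, p = chi_squared_2x2(20, 48, 60, 332)
print(f"chi2 = {stat:.3f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

With these made-up counts the difference would be judged significant at the 0.05 threshold used in this study.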
CHAID decision tree analysis is a data mining technique17,18 with the salient advantage of advanced graphic presentation for interpretation. 19 CHAID enables investigation of all variables, partitions continuous data effectively, and builds decision trees by using a forward stopping or pruning rule. 20 Moreover, CHAID can form more than two nodes at a single split. 19 Unlike other techniques, the significance level can be adjusted for the number of comparisons. CHAID decision tree analysis has been applied in the medical field21,22 and has been reported to be superior to logistic regression analysis. 23 Furthermore, another study noted the advantage that combinations of risks can be evaluated, rather than a single significant index among blood gas items. 24 Other cited advantages are that the results are easy to understand, little preprocessing is necessary, the method is versatile across data types, and it can handle both classification and regression.25,26 Considering all these advantages together, we chose CHAID as the method for analyzing multiple factors.
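The adjustment of the significance level for the number of comparisons can be sketched as follows. This is a hedged illustration that assumes the Bonferroni multiplier described for ordinal (monotonic) predictors in the original CHAID formulation, namely the number of ways to merge c ordered categories into r contiguous groups; the text above does not state which adjustment was applied in this study, and the function names are hypothetical.

```python
from math import comb

def bonferroni_multiplier_ordinal(c, r):
    """Number of ways to reduce c ordered categories to r contiguous groups."""
    return comb(c - 1, r - 1)

def adjusted_p_value(p, c, r):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * bonferroni_multiplier_ordinal(c, r))

# Illustrative values only: a predictor binned into 10 ordered categories
# that ends up split into 2 groups, with an unadjusted p-value of 0.001.
print(bonferroni_multiplier_ordinal(10, 2))
print(adjusted_p_value(0.001, 10, 2))
```

The adjustment makes a split harder to accept when many candidate cut points were examined to find it.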
In addition, prediction rules with the CHAID model are visually intuitive and easy to interpret in clinical settings. The minimum sizes of the mother (parent) and daughter (child) nodes were set to 50 and 25, respectively. The analysis was performed without cross-validation. Multiple 2 × 2 contingency tables between the dependent variable and each independent variable were created; the most significant independent variable in a chi-squared test was then selected to branch out the decision tree. The categories of each independent variable were merged if they were not significantly different with respect to the dependent variable, and the cutoff values were established automatically from the chi-squared test results.27–31 As explained in previous reports, the CHAID algorithm is a nonparametric procedure and therefore requires no assumptions about the underlying data. Also, CHAID treats all system- and user-missing values for each independent variable as a single category. For scale and ordinal independent variables, that category may or may not subsequently be merged with other categories of that independent variable, depending on the growing criteria.
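The category-merging step described above can be sketched as follows for an ordinal predictor. This is a simplified illustration, not the SPSS implementation used in this study: it only merges adjacent categories whose outcome distributions do not differ significantly, and it omits re-splitting, the significance adjustment, the missing-value category, and the node size limits. The function names, the merge threshold `alpha_merge`, and the counts are all hypothetical.

```python
import math

def pair_p_value(cat_a, cat_b):
    """Chi-squared p-value (1 df) comparing two categories,
    each given as (bacterial, non_bacterial) counts."""
    a, b = cat_a
    c, d = cat_b
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: treat the categories as not different
    stat = n * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(stat / 2))

def merge_ordinal_categories(bins, alpha_merge=0.05):
    """Repeatedly merge the most similar pair of adjacent categories
    until every adjacent pair differs significantly."""
    bins = [tuple(b) for b in bins]
    while len(bins) > 1:
        p_values = [pair_p_value(bins[i], bins[i + 1])
                    for i in range(len(bins) - 1)]
        i = max(range(len(p_values)), key=p_values.__getitem__)
        if p_values[i] <= alpha_merge:
            break  # all remaining neighbours differ significantly
        merged = (bins[i][0] + bins[i + 1][0], bins[i][1] + bins[i + 1][1])
        bins[i:i + 2] = [merged]
    return bins

# Hypothetical counts for five ordered bins of a continuous predictor.
bins = [(2, 48), (3, 47), (4, 46), (20, 30), (22, 28)]
merged = merge_ordinal_categories(bins)
print(merged)
```

In this made-up example the five bins collapse into two groups, and the boundary between them plays the role of an automatically determined cutoff value.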
Receiver operating characteristic (ROC) curve analysis was performed to relate the proportion of true-positive septic arthritis patients detected by the CHAID classification model to the proportion of false-positive patients. ROC curve analysis was used because it is an established method for evaluating how well a model classifies patient groups. All statistical analyses were performed using SPSS software, version 22.
Results
As shown in Table 1, a total of 460 patients (male/female = 229/231; mean ± standard deviation age, 70.26 ± 17.66 years) were included, and 68 patients (14.8%) had bacterial arthritis. Table 1 also shows patient characteristics in the groups of individuals with bacterial arthritis and non-bacterial arthritis, along with the results of bivariate analysis. WBC count (p = 0.01), percentage of neutrophils (p = 0.01), BUN levels (p < 0.01), glucose levels (p < 0.01), hemoglobin A1c levels (p < 0.01), and CRP levels (p < 0.001) were significantly higher among patients with bacterial arthritis than among those with non-bacterial arthritis. In contrast, albumin levels (p = 0.04) were significantly lower among patients with bacterial arthritis than among those with non-bacterial arthritis.
A CHAID decision tree analysis was performed with all candidate predictors.17–23 The algorithm for predicting septic arthritis driven by CHAID is shown in Figures 1 and 2. In the decision tree diagram in Figure 2, 68 (14.8%) of the patients included in the study had bacterial arthritis and 392 (85.2%) had non-bacterial arthritis. The CRP level was found to be the most effective predictor of bacterial arthritis (χ² = 45.467, p-value < 0.001; Figure 2). Accordingly, patients were divided into three groups (one of which was a missing-value group) according to the cutoff value determined for the CRP level. According to these findings, a CRP level above 21.09 mg/dL was associated with a significantly higher incidence of bacterial arthritis (48.7% versus 13.6%).

The algorithm for predicting bacterial arthritis driven by chi-squared automatic interaction detector (CHAID). Categories were defined based on bacterial arthritis incidence values as follows: low-risk (⩽5%), intermediate-risk (>5% to ⩽20%), and high-risk (>20%) categories.

The classification success of the CHAID decision tree in this study.
Platelet count and percentage of neutrophils were also included in the decision tree, and five terminal nodes were derived. The decision tree was thus constructed from three variables (CRP level, platelet count, and percentage of neutrophils), which keeps it interpretable, and we judged the cutoff values mentioned above to be clinically appropriate. Based on previous research using the CHAID decision tree analysis, the patients can be categorized into three risk groups: low-risk (⩽5%), intermediate-risk (>5% to ⩽20%), or high-risk (>20%) groups. 32 Based on the incidence, the nodes were sorted into low risk (incidence of septic arthritis: 4.3%), intermediate risk 1 (13.6%), intermediate risk 2 (19.7%), high risk 1 (48.7%), and high risk 2 (36.1%). Figure 3 shows the ROC curve derived from the CHAID decision tree analysis (sensitivity:1 − specificity pairs of 0.279:0.051, 0.471:0.110, 0.691:0.265, 0.897:0.492, and 0.985:0.832). Its AUC was 0.786 (95% CI: 0.730–0.843) and the accuracy rate was 85.2%.
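The risk categorization rule can be written out as a short sketch. The thresholds and node incidences are those reported above; the function name `risk_category` and the dictionary layout are illustrative choices only.

```python
def risk_category(incidence_percent):
    """Map a node's incidence of bacterial arthritis (%) to a risk group:
    low (<=5%), intermediate (>5% to <=20%), high (>20%)."""
    if incidence_percent <= 5:
        return "low"
    if incidence_percent <= 20:
        return "intermediate"
    return "high"

# Incidence (%) of bacterial arthritis in the five terminal nodes.
node_incidence = {
    "low risk": 4.3,
    "intermediate risk 1": 13.6,
    "intermediate risk 2": 19.7,
    "high risk 1": 48.7,
    "high risk 2": 36.1,
}

categories = {node: risk_category(pct) for node, pct in node_incidence.items()}
for node, category in categories.items():
    print(f"{node}: {node_incidence[node]}% -> {category}")
```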

Receiver operating characteristic curve of chi-squared automatic interaction detector (CHAID)-formulated decision tree for the positive risk factors for bacterial arthritis.
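As a rough check, the area under the ROC curve can be approximated from the five reported sensitivity and 1 − specificity pairs with the trapezoidal rule. This sketch assumes that the curve is anchored at (0, 0) and (1, 1) and is piecewise linear between the reported points; the AUC in this study was computed in SPSS, and the function name is hypothetical.

```python
# (1 - specificity, sensitivity) pairs as reported in the Results.
reported_points = [
    (0.051, 0.279),
    (0.110, 0.471),
    (0.265, 0.691),
    (0.492, 0.897),
    (0.832, 0.985),
]

def trapezoidal_auc(points):
    """Area under a piecewise-linear ROC curve through the given points,
    with the end points (0, 0) and (1, 1) added as an assumption."""
    pts = sorted([(0.0, 0.0)] + list(points) + [(1.0, 1.0)])
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

auc = trapezoidal_auc(reported_points)
print(f"AUC = {auc:.3f}")
```

Under these assumptions the approximation comes to about 0.786, in line with the reported AUC.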
Discussion
To our knowledge, this is the first report using CHAID decision tree analysis to establish which patients need joint puncture and immediate administration of antimicrobial agents. The patients with a CRP level >21.09 mg/dL (incidence of bacterial arthritis: 48.7%) and with a CRP level ⩽21.09 mg/dL plus 27.70 < platelet count ⩽ 30.70 × 10⁴/μL (incidence of bacterial arthritis: 36.1%) were categorized as high risk. These patients should receive prompt joint puncture and immediate administration of antimicrobial agents.
CRP is widely used as a marker of infection. It is an acute-phase serum protein that plays a central role in the immunological response and responds to inflammatory cytokines associated with activated monocytes or macrophages after infection. 33 CRP is induced primarily by the action of interleukin-6 (IL-6), which is responsible for CRP gene transcription. 34 In some cases, it can also activate the complement system, forming inflammatory cytokines, thereby further aggravating tissue damage. 35 Several studies have shown the usefulness of CRP in estimating the risk of bacterial infections.33–35 Our study results suggest that an elevated CRP level is a predictor of septic arthritis. The identification of a CRP cutoff value by CHAID analysis was the most useful result of this study. The results suggest that prompt joint puncture and antimicrobial agent administration should be performed in patients with a CRP level >21.09 mg/dL, because these patients represented a high-risk group based on CHAID analysis in the present study. Similarly, CRP has been shown to be an important biomarker in severe cases of COVID-19 compared to mild cases. 36 In addition, the use of CRP as a feature in artificial intelligence models that estimate the diagnosis and prognosis of COVID-19 with high accuracy shows the importance of this biomarker in the diagnosis of infectious diseases.37–42
Thrombocytosis is known to occur in patients with inflammatory diseases due to cytokine production. 43 Several studies have shown that chronic long-standing infections such as chronic osteomyelitis, pyogenic pulmonary infections, and active tuberculosis, collagen diseases such as rheumatoid arthritis, and malignancy can cause thrombocytosis.44,45 For example, thrombocytosis in rheumatoid arthritis has been widely reported.46,47 Many factors, such as IL-6, IL-11, stem cell factor, leukemia inhibitory factor, granulocyte colony stimulating factor, thrombopoietin, and the regulation of megakaryocytopoiesis during the inflammatory cascade, have been reported to be associated with thrombocytosis. 48 An increased platelet count could be suggestive of chronic inflammatory disease. The above information supports the categorization of intermediate risk 1 (CRP level ⩽21.09 mg/dL plus platelet count >30.70 × 10⁴/μL) found in our study.
Neutrophils are short-lived, and their production rate in the bone marrow is extremely high, with an additional reserve pool of cells available. 49 Typically, only mature polymorphonuclear forms are present in the peripheral blood. Neutrophils are the most abundant leukocytes and play a key role in the immune defense against bacterial infections. 50 Moreover, our preliminary research showed that neutrophil count is significantly related to bacteremia, and multivariable logistic regression analysis revealed that a neutrophil percentage >80% (OR = 3.61, 95% CI: 1.71–8.00, p = 0.001) was an independent risk factor for positive blood culture results. 51 This may explain the division of the intermediate-risk 2 and low-risk categories in this study. Furthermore, for bacterial arthritis, neutrophil levels at the affected site have been reported to be associated with prognosis. 52
Furthermore, based on our results, we want to emphasize that it is critical to completely exclude bacterial arthritis in the intermediate-risk 1, intermediate-risk 2, and low-risk categories. In such cases, we feel that joint puncture and antimicrobial agent administration could be delayed as compared with the high-risk group. However, patients in the low-risk and intermediate-risk groups should also have blood cultures taken. In our preliminary research, blood cultures were an important diagnostic clue for bacterial arthritis. 51 If a patient’s overall clinical status is poor, joint puncture and immediate antimicrobial agent administration should also be considered in these groups.
Many studies using CHAID have been published, which allows the classification accuracy rates of the CHAID algorithm to be compared across applications. In these articles, the classification accuracy rates were 81.6%, 53 87%, 54 81.0%, 55 and 79.0%. 56 The accuracy rate of this study was 85.2%. Moreover, the classification accuracy rate of this study is thought to be higher than the accuracy rates of other articles.54,57–61 Therefore, we consider the accuracy rate of this study to be high.
There were some limitations to this study. Our hospital is a university hospital; our medical staff typically examine a large number of patients referred from general practitioners within the Japanese medical system. Thus, our patient population may have included cases with more serious arthritis. Although the present study used a CHAID model, the risk could also be evaluated using other techniques. Additionally, we did not calculate the sample size because all patients who underwent joint puncture between January 1, 2010, and December 31, 2020, were included. In future studies, however, the sample size should be calculated beforehand.
The study population was limited to patients who underwent joint puncture, and there were no clear objective criteria for performing joint puncture. However, at least two physicians were involved in the diagnosis, and joint puncture was not performed on the judgment of the physicians alone (at least one orthopedist conducted each puncture). We therefore believe that, at the least, joint puncture was not performed in patients for whom it was clinically unnecessary. Furthermore, the blood tests ordered at the time of joint puncture were chosen by each physician and were not standardized; thus, the number of missing values differs among variables.
We suspect that the missing values (e.g. CRP for one patient in Table 1) may have affected our CHAID analysis. However, CHAID treats all system- and user-missing values for each independent variable as a single category. For scale and ordinal independent variables, that category may or may not subsequently be merged with other categories of that independent variable, depending on the growing criteria. Thus, we believe that the effect of missing values on our results is minimized. Furthermore, we also suspect that the missing data (e.g. albumin values) may have affected the univariate analysis, and these findings should be re-examined in a larger population (with fewer missing values) in the future.
The patient population enrolled in this study was limited to a single hospital. In addition, this was a retrospective study. As a next step, a multicenter prospective study should be conducted with a larger number of patients.
Conclusion
The patients with a CRP level >21.09 mg/dL (incidence of bacterial arthritis: 48.7%) and with a CRP level ⩽21.09 mg/dL plus 27.70 < platelet count ⩽ 30.70 × 10⁴/μL (incidence of bacterial arthritis: 36.1%) were categorized as high risk. Our results suggest that patients at high risk of bacterial arthritis can be identified in this way, so that appropriate treatment can be initiated as soon as possible.
Footnotes
Acknowledgements
We thank all doctors of the orthopedics department, Juntendo University Nerima Hospital, for their contribution to this study.
Author contributions
All the authors contributed to the study concept and design. SK and SF were involved in the acquisition of the participants and data. SK, SF, and DK were involved in the analysis and interpretation of the data. All the authors were involved in the preparation of the manuscript.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethics approval
Ethical approval for this study was obtained from the Ethics Committee of Juntendo University Nerima Hospital, Tokyo, Japan (approval number: 2020067).
Informed consent
The requirement for informed consent was waived by the ethics committee.
