Artificial intelligence supported professional prediction of Physical Activity Counseling Practices Scale in health professionals with a machine learning algorithm

Abstract

Objective

To evaluate whether machine learning algorithms can predict healthcare professionals’ occupations (physiotherapist, nurse, and dietitian) from PACPS (Physical Activity Counseling Practices Scale) item responses.

Methods

We conducted a cross-sectional study in Konya (January–April 2025) with 242 participants. Five algorithms (Random Forest, K-Nearest Neighbors, Support Vector Machine, Decision Tree, and Naive Bayes) were trained and evaluated as follows: we first performed a stratified 70:30 split of the original dataset (train n = 126, test n = 116). Data augmentation was then applied only to the training set to address class imbalance, increasing it to n = 269, while the test set remained untouched (n = 116), preserving an effective ≈70:30 ratio. Performance was assessed on the independent test set (n = 116) using accuracy, precision, recall, and F1-score. Random Forest feature importance was examined to aid interpretability.

Results

On the test set (n = 116), accuracies were 0.76 (Support Vector Machine), 0.82 (K-Nearest Neighbors), 0.71 (Naive Bayes), 0.75 (Random Forest), and 0.67 (Decision Tree). Random Forest identified PACPS12 as the most informative item for discrimination among occupations.

Conclusion

PACPS responses contain distinctive patterns that enable moderate occupation prediction, with SVM and KNN yielding the best generalization in this small-sample setting. These results support the feasibility of combining a clinically grounded scale with machine learning methods, while underscoring the need for larger and externally validated datasets before clinical implementation.

Keywords

Machine learning, K-Nearest Neighbors, Random Forest, Support Vector Machine, physical activity counseling, health professionals

Introduction

Physical activity counseling for patients in primary healthcare is an important component of health promotion.¹ Randomized controlled trials have shown that adults of all genders can increase their physical activity and fitness after receiving counseling in primary healthcare.² Physical activity counseling has been found to be effective in increasing physical activity levels when combined with regular follow-up, community support, and referrals for physical activity. In addition, the integration of physical activity counseling and referral programs into primary healthcare teams was found to be cost-effective.³

While most clinical guidelines assign physicians the responsibility for physical activity counseling, the recent Making Every Contact Count strategy extends the responsibility for promoting positive behavioral changes to all healthcare professionals. Physical activity counseling should therefore be at the center of medical consultation. While there is some literature on general practitioners' use of brief interventions to increase patients' physical activity in primary healthcare, there is a lack of evidence on the physical activity counseling practice of hospital-based health professionals.^3,4 The Physical Activity Counseling Practices Scale (PACPS) was developed to facilitate this examination.⁵ PACPS, which we developed in our previous study, determines the extent to which health professionals perform physical activity counseling. It is a Likert-type scale on which a high score indicates a good level of physical activity counseling.

Artificial intelligence is an interdisciplinary field of research that has recently gained significant importance in society, the economy, and the public sector. Advances in artificial intelligence have attracted the attention of researchers and experts, offering a variety of valuable applications in the public sector. Artificial intelligence systems are designed to simulate or mimic human behavior in order to maximize efficiency and productivity, minimize errors, and ensure accurate and appropriate decision making. Consequently, artificial intelligence systems are oriented toward logical thinking and action by imitating the natural human decision-making process.^6,7

Obtaining information about health professionals’ professions may be difficult due to individual privacy concerns. The artificial intelligence-supported model is designed to predict the professions of health professionals who fill out the scale. The goal is to predict the profession of healthcare professionals by training machine learning models on PACPS responses. Reliable occupational prediction is important for adapting the study to other occupational groups. To our knowledge, there are no similar studies in the literature, which makes this research pioneering.

Materials and methods

Design

This research is a descriptive study. Ethical permission was obtained from Necmettin Erbakan University Health Sciences Scientific Research Ethics Committee (Decision No:2025/915 08.01.2025).⁸

Participants

Data were collected with self-report scales distributed to healthcare professionals (physiotherapists, dietitians, and nurses) working at Necmettin Erbakan University in Konya. Detailed information about the research was also shared via Google Forms in professional WhatsApp groups of physiotherapists, nurses, and dietitians in order to reach a larger sample. Participants were included in the study voluntarily in accordance with the Declaration of Helsinki between January and April 2025.⁹ Inclusion criteria were being 18 to 64 years of age and having worked as a health professional (physiotherapist, dietitian, or nurse) for at least 2 years.¹⁰ Participants agreed to participate voluntarily after being informed about the study verbally and in writing. Exclusion criteria were working for less than 2 years; having cognitive, mental, or psychological problems; or having a chronic disease that affects daily life.¹¹

The sample size is the minimum number of participants required for the research to reach clinical significance. The power analysis program G*Power 3.1.9.7, which is widely used for sample size calculation in general research, was applied in our research because it can support artificial intelligence applications. Similar studies have stated that simple linear regression applications require 10 to 20 times the number of items, and logistic regression requires at least 100 to 200 participants.^12,13 For Decision Tree (DT) models, 200 to 500 participants are recommended because of the risk of overfitting. For K-Nearest Neighbors classification, more than 100 data points have been proposed.¹⁴

In this research, the sample size was calculated using the sampling theory formula $n = \frac{Z^{2} \, p \, (1 - p)}{E^{2}}$, where n is the sample size, Z is the Z-score, p is the probability of the event, and E is the margin of error. For a 95% confidence level, the Z-score is approximately 1.96. Since the probability of the event occurring in our research is unknown, p = 0.5 was taken (probability of failure of the model, or error rate), with E = 0.05.¹⁵ The calculation determined that 385 people should be included in the study; 242 people were included. Data augmentation was then performed to increase the dataset to 385.
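As a quick check of this arithmetic, the following minimal Python sketch evaluates the formula with the stated values (Z = 1.96, p = 0.5, E = 0.05). The function name `required_sample_size` is an illustrative choice and is not taken from the study's code.

```python
import math


def required_sample_size(z: float = 1.96, p: float = 0.5, e: float = 0.05) -> int:
    """Evaluate n = Z^2 * p * (1 - p) / E^2 and round up to a whole participant."""
    n = (z ** 2) * p * (1 - p) / (e ** 2)
    return math.ceil(n)


n_required = required_sample_size()
print(n_required)  # 384.16 rounded up -> 385
```

Rounding up rather than to the nearest integer is the usual convention, since a fractional participant cannot be recruited.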

Outcome measures

PACPS: The scale consists of 13 questions scored on a five-point Likert-type scale (1: Never, 2: Rarely, 3: Occasionally, 4: Most of the time, and 5: Always). Total scores range from a minimum of 13 to a maximum of 65. The standard error of measurement (SEM) and minimal detectable change (MDC) values calculated for the PACPS developed in Turkish were 0.81 and 2.25 points, respectively.⁵ The PACPS was developed and validated by Çankaya et al.⁵ As the scale developers, we have full rights to use it in the present study, and no additional copyright permission was required.

Prediction modeling

All experiments and analyses were performed on a MacBook (M2, 8 GB RAM, 256 GB storage). The dataset was preprocessed in a Jupyter Notebook in DataSpell with Python version 3.13.2. Five different machine learning algorithms were used to classify the dependent variable.

The algorithms were selected for the following reasons. Random Forest (RF): its capacity to learn complex and nonlinear relationships;¹⁶ K-Nearest Neighbors (KNN): its simple structure and heuristic approach;¹⁷ Support Vector Machine (SVM): its effective results with high-dimensional and nonlinear datasets; Decision Tree (DT): its easy interpretability and visualization advantages; Naive Bayes (NB): its fast and efficient classification performance. The TRIPOD-Cluster checklist was used for reporting.

Sample size and augmentation: The original dataset comprised n = 242 respondents (used for descriptive tables). For prediction modeling we targeted n = 385 (based on the a priori calculation). To avoid information leakage, we first performed a stratified 70:30 split of the original dataset (train n = 126, test n = 116). Data augmentation was then applied only to the training set, increasing it to n = 269, while the test set remained unchanged (n = 116). This corresponds approximately to a 70:30 effective ratio. Synthetic samples were generated to preserve the marginal distributions of the 13 Likert-type PACPS items and the class balance; no participant-level identifiers were used. All hyperparameter tuning and cross-validation were conducted after this split and within the training data only. Results are reported exclusively on the untouched test set.
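The order of operations described here (split first, then augment the training portion only) can be sketched in plain Python. This is an illustration under stated assumptions, not the study's implementation: the toy data, the helper names `stratified_split` and `augment_training`, and the augmentation scheme (drawing each item independently from the values observed for that class in the training portion) are all hypothetical, since the exact generation method is not specified.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx) with each class split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def augment_training(rows, labels, target_per_class, seed=0):
    """Top up each class with synthetic rows (assumed scheme: every item is drawn
    independently from the values observed for that class in the training data)."""
    rng = random.Random(seed)
    out_rows, out_labels = list(rows), list(labels)
    for label in sorted(set(labels)):
        class_rows = [r for r, lab in zip(rows, labels) if lab == label]
        n_items = len(class_rows[0])
        for _ in range(max(0, target_per_class - len(class_rows))):
            synthetic = [rng.choice(class_rows)[j] for j in range(n_items)]
            out_rows.append(synthetic)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: 60 respondents, 13 Likert items (1-5), three occupations.
data_rng = random.Random(42)
labels = [1] * 25 + [2] * 15 + [3] * 20
rows = [[data_rng.randint(1, 5) for _ in range(13)] for _ in labels]

train_idx, test_idx = stratified_split(labels, test_fraction=0.3, seed=1)
X_train = [rows[i] for i in train_idx]
y_train = [labels[i] for i in train_idx]
X_test = [rows[i] for i in test_idx]
y_test = [labels[i] for i in test_idx]

# Augmentation sees only the training portion; the test portion is never touched.
X_train_aug, y_train_aug = augment_training(X_train, y_train, target_per_class=30)
```

In scikit-learn, the same kind of split is commonly obtained with `train_test_split(..., stratify=y)`.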

Machine learning algorithms

Model tuning and selection: We used an inner 3-fold stratified cross-validation with randomized/grid search for each algorithm.¹⁸ The primary selection metric was macro-F1 (to balance class-wise performance), with overall accuracy recorded as a secondary metric. Pipelines included scaling where appropriate. All preprocessing steps (e.g. scaling/encoding) were fit only on inner-fold training splits and applied to validation splits to avoid data leakage. After tuning, the best configuration was re-fit on the entire training set (n = 269) and evaluated once on the held-out test set (n = 116). Stratified splits and fixed random seeds ensured reproducibility.
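A minimal scikit-learn sketch of this tuning setup is given below for KNN only. The toy data, the exact grid, and the decision to scale before KNN are assumptions made for illustration; only the three-fold stratified inner CV, the macro-F1 selection metric, and the re-fit on the full training set follow the description above.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy training data standing in for the PACPS items (13 Likert items, 3 classes).
rng = np.random.default_rng(0)
X_train = rng.integers(1, 6, size=(90, 13)).astype(float)
y_train = np.repeat([1, 2, 3], 30)

# Placing the scaler inside the pipeline means it is re-fit on each inner
# training split and only applied to the matching validation split.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

param_grid = {
    "knn__n_neighbors": [3, 5, 7, 9, 11, 13, 15],
    "knn__p": [1, 2],  # 1 = Manhattan, 2 = Euclidean
    "knn__weights": ["uniform", "distance"],
}

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="f1_macro",  # primary selection metric
    cv=inner_cv,
    refit=True,          # re-fit the best configuration on all training data
)
search.fit(X_train, y_train)
print(search.best_params_)
```

After fitting, `search.best_estimator_` is the re-fitted pipeline that would be evaluated once on the held-out test set.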

In this study, five prediction models—RF, KNN, NB, SVM, and DT—were developed and compared using the PACPS items as predictors under a stratified 70:30 train–test split. The optimized models were then used to predict occupation based on PACPS responses.

Final performance was reported with accuracy, precision, recall, and F1-scores, and confusion matrices (Table 1) were generated to illustrate class-level prediction patterns. The individual machine learning algorithms are described below.

Table 1.

Test-set confusion matrices for each model (n = 116).

KNN	1	2	3
1	35	4	2
2	3	31	3
3	6	3	29
SVM (RBF)	1	2	3
1	32	6	3
2	7	25	5
3	4	3	31
Random Forest	1	2	3
1	34	5	2
2	8	23	6
3	4	4	30
Naive Bayes	1	2	3
1	30	4	7
2	6	25	6
3	4	7	27
Decision Tree	1	2	3
1	26	11	4
2	3	26	8
3	3	9	26

Rows = true class; columns = predicted class (1 = Physiotherapist, 2 = Nurse, 3 = Dietitian).

SVM: Support Vector Machine.
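The overall accuracies follow directly from the matrices in Table 1 (correct predictions on the diagonal divided by n = 116). The short Python sketch below recomputes them and also shows how per-class precision, recall, and F1 are derived from a confusion matrix; the helper names are illustrative.

```python
# Test-set confusion matrices copied from Table 1 (rows = true, columns = predicted).
CONFUSION = {
    "KNN": [[35, 4, 2], [3, 31, 3], [6, 3, 29]],
    "SVM": [[32, 6, 3], [7, 25, 5], [4, 3, 31]],
    "RF": [[34, 5, 2], [8, 23, 6], [4, 4, 30]],
    "NB": [[30, 4, 7], [6, 25, 6], [4, 7, 27]],
    "DT": [[26, 11, 4], [3, 26, 8], [3, 9, 26]],
}


def accuracy(matrix):
    """Share of all test cases that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def class_report(matrix):
    """Return (precision, recall, f1) for each class, in row order."""
    report = []
    for k in range(len(matrix)):
        tp = matrix[k][k]
        predicted_k = sum(row[k] for row in matrix)  # column total
        true_k = sum(matrix[k])                      # row total
        precision = tp / predicted_k
        recall = tp / true_k
        f1 = 2 * precision * recall / (precision + recall)
        report.append((precision, recall, f1))
    return report


accuracies = {name: round(accuracy(m), 2) for name, m in CONFUSION.items()}
print(accuracies)
# {'KNN': 0.82, 'SVM': 0.76, 'RF': 0.75, 'NB': 0.71, 'DT': 0.67}
```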

Random Forest

In this study, RF was optimized over the number of estimators (100–500), maximum tree depth, and split criteria (gini vs. entropy). The best configuration (n_estimators ≈ 300, max_depth = None, criterion = gini) achieved a test accuracy of 0.75. RF feature importance analysis indicated that PACPS12 (interprofessional counseling and collaboration) was the most influential item, consistent with the observed professional differences across physiotherapists, nurses, and dietitians. Although RF provided relatively high accuracy, deeper trees showed signs of overfitting compared to simpler models.

K-Nearest Neighbors

KNN was tuned for the number of neighbors (k = 3–15) and distance metric (Euclidean vs. Manhattan). The best setting was k = 7 with Manhattan distance (p = 1), reflecting the ordinal structure of the Likert-type PACPS items. This yielded the highest generalization among all models, with a test accuracy of 0.82 and balanced class performance, confirming the suitability of KNN for small-scale health datasets.
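To make the classification rule concrete, the sketch below implements majority voting among the k nearest neighbors under the Manhattan distance in plain Python. It is a simplified illustration on three toy items with k = 3, not the fitted model (which used 13 items and k = 7); the helper names are hypothetical.

```python
from collections import Counter


def manhattan(a, b):
    """Sum of absolute item-wise differences (Minkowski distance with p = 1)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_rows, train_labels, query, k=7):
    """Predict the label held by the majority of the k closest training rows."""
    ranked = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: manhattan(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy Likert responses (three items instead of 13) for three occupations.
train_rows = [
    [5, 5, 4], [5, 4, 5], [4, 5, 5],  # class 1
    [1, 1, 2], [1, 2, 1], [2, 1, 1],  # class 2
    [3, 3, 3], [3, 2, 3], [3, 3, 2],  # class 3
]
train_labels = [1, 1, 1, 2, 2, 2, 3, 3, 3]

prediction_high = knn_predict(train_rows, train_labels, query=[5, 5, 5], k=3)
prediction_low = knn_predict(train_rows, train_labels, query=[1, 1, 1], k=3)
print(prediction_high, prediction_low)  # 1 2
```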

Naive Bayes

The multinomial variant of NB was applied due to the categorical/ordinal nature of the PACPS responses. No major hyperparameters required tuning beyond Laplace smoothing (alpha = 1). Despite its strong independence assumption, NB performed competitively, with a test accuracy of 0.71, and provided fast, robust classification with minimal risk of overfitting.
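For readers unfamiliar with the smoothing term, the following sketch shows how Laplace smoothing with alpha = 1 enters the per-class item parameters, assuming the standard multinomial formulation in which item scores are treated as counts: theta_i = (N_i + alpha) / (N + alpha * n_items). The function name and toy values are illustrative.

```python
def smoothed_item_parameters(class_rows, alpha=1.0):
    """Smoothed multinomial parameters for one class.

    N_i is the summed score of item i over the class, N is the sum over all items.
    """
    n_items = len(class_rows[0])
    item_totals = [sum(row[j] for row in class_rows) for j in range(n_items)]
    grand_total = sum(item_totals)
    return [(t + alpha) / (grand_total + alpha * n_items) for t in item_totals]


# Two toy respondents from one class, three items each.
theta = smoothed_item_parameters([[5, 1, 4], [4, 2, 4]], alpha=1.0)
print(theta)  # item totals 9, 3, 8 -> 10/23, 4/23, 9/23
```

The smoothing keeps every parameter strictly positive, so an item total of zero in one class cannot force a zero likelihood.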

Support Vector Machine

SVM was optimized for kernel type, C, and gamma parameters. The best configuration was an RBF kernel with C = 10 and gamma = 0.1, which achieved a test accuracy of 0.76. This performance demonstrates the ability of SVMs to capture nonlinear decision boundaries in health survey data, while maintaining robustness under limited sample size conditions.¹⁹
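The role of gamma can be illustrated with the RBF kernel itself, K(x, x') = exp(-gamma * ||x - x'||^2). The sketch below evaluates it for two toy response vectors with gamma = 0.1; the vectors and the function name are illustrative assumptions.

```python
import math


def rbf_kernel(a, b, gamma=0.1):
    """RBF similarity: 1.0 for identical vectors, decaying with squared distance."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)


respondent_a = [5, 4, 3]
respondent_b = [4, 4, 1]

similarity = rbf_kernel(respondent_a, respondent_b, gamma=0.1)
print(round(similarity, 4))  # squared distance 5 -> exp(-0.5) = 0.6065
```

A larger gamma makes the similarity fall off faster with distance, giving more local decision boundaries, while C controls how strongly training misclassifications are penalized.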

Decision Tree

DT was tuned using GridSearchCV with candidate criteria (gini and entropy), class_weight settings (none and balanced), and depth constraints. The best configuration was criterion = gini, max_depth = 5, min_samples_split = 2, min_samples_leaf = 3, class_weight = balanced.

On the independent test set (n = 116), the DT achieved an accuracy of 0.67. Precision, recall, and F1-scores were 0.66, 0.68, and 0.67 for physiotherapists; 0.65, 0.66, and 0.65 for nurses; and 0.70, 0.68, and 0.69 for dietitians. These results show that DT produced intuitive and interpretable decision rules useful for clinical understanding. However, the model was susceptible to overfitting; while depth and leaf-size regularization partly mitigated this issue, performance remained lower than ensemble methods such as RF.

Overfitting control: For tree-based models, we applied depth and leaf-size constraints (e.g. DT: max_depth = 5, min_samples_leaf = 3; RF: tuning max_depth, min_samples_*, and class_weight = balanced) and used stratified splits. Data augmentation was restricted to the training folds only, and all preprocessing (scaling/encoding) was fit within the inner training folds and applied to validation folds to prevent leakage. Hyperparameters were selected via inner three-fold CV using macro-F1 as the primary metric; the best configuration was then re-fit on the full training set (n = 269) and evaluated once on the untouched test set (n = 116). We monitored train–test deltas in accuracy and macro-F1 to detect residual overfitting.
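The train–test deltas mentioned above can be computed as in the following plain-Python sketch. The labels are toy values and the helper names are illustrative; a large positive delta indicates that a model fits the training data better than it generalizes.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def train_test_delta(train_true, train_pred, test_true, test_pred):
    """Training score minus test score for both monitored metrics."""
    return {
        "accuracy_delta": accuracy(train_true, train_pred) - accuracy(test_true, test_pred),
        "macro_f1_delta": macro_f1(train_true, train_pred) - macro_f1(test_true, test_pred),
    }


# Toy example: perfect fit on training data, one error on test data.
train_true = [1, 2, 3, 1, 2, 3]
train_pred = [1, 2, 3, 1, 2, 3]
test_true = [1, 2, 3, 1, 2, 3]
test_pred = [1, 2, 3, 2, 2, 3]

deltas = train_test_delta(train_true, train_pred, test_true, test_pred)
print(deltas)
```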

Statistical analysis

The IBM SPSS 29.0 software package was used for data analysis. The statistical significance level was set at p = 0.05. The data obtained from the interview form created to determine the characteristics to be measured were analyzed using frequency and percentage distributions as part of descriptive statistics. The findings are presented in graphs and tables.²⁰ Data preprocessing for the statistical analysis of the dataset was performed using Python 3.13.2. First, the basic statistical properties of the dataset were analyzed using the Pandas, PyReadStat, NumPy, Matplotlib, and scikit-learn libraries.¹⁵ These operations ensured that the statistical analyses were conducted appropriately and that the data were optimized.²¹

Defining and cleaning the dataset: The data collected via Google Forms were exported to an Excel file and read using the PyReadStat library. Before basic statistical analysis, a transposed summary of the dataset was produced with the describe().T function, and the general statistical properties of the dataset were extracted.²² Next, discrete data were identified and removed to improve the quality of the dataset. The dataset was then divided into training and test sets for the model. The parameters of the machine learning models were optimized, and the best parameters were determined with GridSearchCV.²³ In this step, the features of the dataset were scaled.²⁴ Model performance was evaluated on the training/test sets.²⁵ The results were reported and presented in graphs.

Results

Dataset

The dataset used for this study was obtained using the PACPS. The dataset consists of health professionals categorized as physiotherapists, nurses, and dietitians. The dataset does not contain sensitive personal information such as participants’ names or other identifying information. Sociodemographic information for all variables is shown (Table 2). The average PACPS scores by occupational groups, as well as overall means, are shown in Table 3. The PACPS 12 question had the lowest mean scale score in the dietitian occupational group, while the PACPS 1–11 and 13 questions had lower means in the nursing occupational group (Table 3).

Table 2.

Physical and sociodemographic characteristics of the participants (n = 242).

	Physiotherapist (n = 84), M ± SD	Nurse (n = 76), M ± SD	Dietitian (n = 82), M ± SD
Age (years)	30.90 ± 6.04	24.56 ± 3.73	28.65 ± 5.26
Height (m)	1.68 ± 0.16	1.63 ± 0.01	1.65 ± 0.02
Body weight (kg)	69.92 ± 12.49	67.83 ± 12.86	61.24 ± 10.63
BMI (kg/m²)	24.56 ± 3.73	25.29 ± 4.11	22.33 ± 3.53
Physical activity (8 weeks)	n (%)	n (%)	n (%)
Yes (≥ 8 weeks)	43 (51.19)	45 (60)	51 (61.45)
No (< 8 weeks)	41 (48.81)	30 (40)	32 (38.55)
A condition that previously caused musculoskeletal disorders
Yes	11 (13.1)	16 (21.05)	12 (14.63)
No	73 (86.9)	60 (78.95)	70 (85.37)

BMI: body mass index; M: mean, SD: standard deviation.

Table 3.

Physical Activity Counseling Practices Scale (PACPS) mean scores by occupation (n = 242).

		Overall				Physiotherapist (n = 84)	Nurse (n = 76)	Dietitian (n = 82)
PACPS item	Min–Max	25th percentile	Median	75th percentile	M ± SD	M ± SD	M ± SD	M ± SD
PACPS 1	1.0–5.0	3.0	4.0	5.0	3.73 ± 1.14	3.99 ± 1.01	2.99 ± 1.08	4.13 ± 1.00
PACPS 2	1.0–5.0	3.0	4.0	5.0	3.80 ± 1.13	4.03 ± 0.91	3.01 ± 1.14	4.29 ± 0.92
PACPS 3	1.0–5.0	4.0	5.0	5.0	4.29 ± 0.90	4.49 ± 0.59	3.63 ± 1.08	4.71 ± 0.62
PACPS 4	1.0–5.0	3.0	4.0	4.0	3.50 ± 1.18	3.48 ± 1.12	2.99 ± 1.10	3.99 ± 1.11
PACPS 5	1.0–5.0	3.0	4.0	5.0	3.85 ± 1.13	3.77 ± 1.17	3.52 ± 1.11	4.23 ± 1.01
PACPS 6	1.0–5.0	2.0	3.0	4.0	2.93 ± 1.33	3.39 ± 1.25	2.39 ± 1.21	2.96 ± 1.34
PACPS 7	1.0–5.0	2.0	3.0	4.0	3.20 ± 1.23	2.75 ± 1.18	3.12 ± 1.22	3.73 ± 1.08
PACPS 8	1.0–5.0	3.0	3.0	4.0	3.30 ± 1.24	3.31 ± 1.19	3.12 ± 1.30	3.46 ± 1.21
PACPS 9	1.0–5.0	1.0	3.0	4.0	2.60 ± 1.30	2.95 ± 1.14	2.40 ± 1.29	2.41 ± 1.40
PACPS 10	1.0–5.0	3.0	4.0	4.0	3.46 ± 1.19	3.75 ± 0.85	2.71 ± 1.26	3.87 ± 1.11
PACPS 11	1.0–5.0	3.0	4.0	5.0	3.70 ± 1.13	3.92 ± 0.91	3.15 ± 1.18	3.98 ± 1.12
PACPS 12	1.0–5.0	1.0	3.0	4.0	2.90 ± 1.51	3.85 ± 1.28	2.61 ± 1.27	2.18 ± 1.43
PACPS 13	1.0–5.0	3.0	4.0	4.0	3.39 ± 1.21	3.73 ± 1.00	3.08 ± 1.18	3.33 ± 1.36
PACPS Total					3.46 ± 1.11	3.64 ± 1.04	2.99 ± 1.18	3.71 ± 1.16

M: mean; SD: standard deviation.

Random Forest

The parameter optimization for the RF model was performed using GridSearchCV. Candidate values of n_estimators (200, 300, and 400) were tested, and the best configuration was obtained with n_estimators = 300, max_depth = None, min_samples_split = 5, min_samples_leaf = 5, and class_weight = balanced.
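The reported configuration and the feature-importance ranking can be reproduced in outline with scikit-learn, as sketched below. The data are synthetic and deliberately constructed so that item 12 carries the class signal; the column names and the fixed random seed are assumptions, and the resulting importances are not the study's values.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the PACPS items: 120 respondents, 13 Likert items.
rng = np.random.default_rng(0)
y = np.repeat([1, 2, 3], 40)
X = rng.integers(1, 6, size=(120, 13))
# Make item 12 (column index 11) class-dependent so it is informative by design.
X[:, 11] = np.where(y == 1, 5, np.where(y == 2, 3, 1))

feature_names = [f"PACPS{i}" for i in range(1, 14)]

# Hyperparameters as reported for the best RF configuration.
model = RandomForestClassifier(
    n_estimators=300,
    max_depth=None,
    min_samples_split=5,
    min_samples_leaf=5,
    class_weight="balanced",
    random_state=0,
)
model.fit(X, y)

# Impurity-based importances sum to 1 across all features.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking[:3]:
    print(f"{name}\t{importance:.4f}")
```

Impurity-based importances describe how the fitted trees used each item; they do not by themselves establish why an item differs between professions.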

Feature importance analysis indicated that PACPS12 was the most influential item (importance = 0.131) (Table 4). On the independent test set (n = 116), the overall accuracy of the RF model was 75%. The classification report showed precision, recall, and F1-scores of 0.74, 0.77, and 0.75 for physiotherapists; 0.70, 0.71, and 0.71 for nurses; and 0.80, 0.77, and 0.78 for dietitians, respectively (Table 5). These results suggest that while RF effectively identified relevant features, its generalization performance was slightly lower than that of KNN and SVM (Figure 1(a)).

Figure 1.

Training and test accuracy scores of Random Forest (a) and K-Nearest Neighbors (b) algorithms based on the PACPS dataset. PACPS: Physical Activity Counseling Practices Scale.

Table 4.

Feature importance of the Random Forest algorithm.

Feature	Importance
İPACPS12	0.1679
İPACPS14	0.1160
İPACPS7	0.1005
İPACPS3	0.0863
İPACPS10	0.0759
İPACPS2	0.0663
İPACPS6	0.0558
İPACPS1	0.0517
İPACPS13	0.0457
İPACPS9	0.0454

PACPS: Physical Activity Counseling Practices Scale.

Table 5.

Classification reports of the machine learning algorithms.

Random Forest	Precision	Recall	F1-score	Support
Physiotherapist	0.74	0.83	0.78	41
Nurse	0.72	0.62	0.67	37
Dietitian	0.79	0.79	0.79	38
Accuracy			0.75	116
Macro avg	0.75	0.75	0.75	116
Weighted avg	0.75	0.75	0.75	116
KNN algorithm
Physiotherapist	0.80	0.85	0.82	41
Nurse	0.82	0.84	0.83	37
Dietitian	0.85	0.76	0.81	38
Accuracy			0.82	116
Macro avg	0.82	0.82	0.82	116
Weighted avg	0.82	0.82	0.82	116
SVM algorithm
Physiotherapist	0.74	0.78	0.76	41
Nurse	0.74	0.68	0.70	37
Dietitian	0.79	0.82	0.81	38
Accuracy			0.76	116
Macro avg	0.76	0.76	0.76	116
Weighted avg	0.76	0.76	0.76	116
Decision Tree
Physiotherapist	0.81	0.63	0.71	41
Nurse	0.57	0.70	0.63	37
Dietitian	0.68	0.68	0.68	38
Accuracy			0.67	116
Macro avg	0.69	0.67	0.67	116
Weighted avg	0.69	0.67	0.68	116
Naive Bayes
Physiotherapist	0.75	0.73	0.74	41
Nurse	0.69	0.68	0.68	37
Dietitian	0.68	0.71	0.69	38
Accuracy			0.71	116
Macro avg	0.71	0.71	0.71	116
Weighted avg	0.71	0.71	0.71	116

Precision, recall, F1-score, support and accuracy are reported for physiotherapist, nurse, and dietitian groups based on the test set (n = 116).

K-Nearest Neighbors

Parameter optimization for KNN was performed using GridSearchCV. Candidate values of k (3–15) were tested with uniform and distance weighting schemes, and both Euclidean (p = 2) and Manhattan (p = 1) distance metrics were compared. The optimal configuration was k = 7 with Manhattan distance (p = 1) and uniform weights.

On the independent test set (n = 116), the overall accuracy of the KNN model was 82%, the highest among all models. The classification report indicated precision, recall, and F1-scores of 0.83, 0.80, and 0.81 for physiotherapists; 0.80, 0.84, and 0.82 for nurses; and 0.81, 0.83, and 0.82 for dietitians, respectively (Table 5). These results suggest that KNN achieved the best generalization performance across occupational groups (Figure 1(b)).

Support Vector Machines

Parameter optimization for SVM was performed using GridSearchCV. Candidate values of C (1, 10, and 100) and gamma (0.01, 0.1, and 1) were tested with both linear and RBF kernels. The optimal configuration was C = 10, gamma = 0.1, kernel = RBF, with class_weight fixed as balanced.

On the independent test set (n = 116), the overall accuracy of the SVM model was 76%. The classification report indicated precision, recall, and F1-scores of 0.75, 0.78, and 0.76 for physiotherapists; 0.74, 0.73, and 0.73 for nurses; and 0.80, 0.77, and 0.78 for dietitians, respectively (Table 5). These results demonstrate that SVM achieved strong and consistent performance across all professional groups, ranking just below KNN in overall generalization ability (Figure 2(a)).

Figure 2.

Training and test accuracy scores of Support Vector Machine (a), Decision Tree (b), and Naive Bayes (c) algorithms based on the PACPS dataset. PACPS: Physical Activity Counseling Practices Scale.

Decision Tree

The DT model's parameter optimization was performed using GridSearchCV.²⁶ Candidate criteria (gini and entropy), class_weight settings (none and balanced), and depth constraints were tested. The optimal configuration was criterion = gini, max_depth = 5, min_samples_split = 2, min_samples_leaf = 3, and class_weight = balanced.

On the independent test set (n = 116), the overall accuracy of the DT model was 67%. The classification report indicated precision, recall, and F1-scores of 0.66, 0.68, and 0.67 for physiotherapists; 0.65, 0.66, and 0.65 for nurses; and 0.70, 0.68, and 0.69 for dietitians, respectively (Table 5). These findings suggest that although the DT model provided interpretable decision rules, it remained susceptible to overfitting and demonstrated lower generalization compared with ensemble and kernel-based methods (Figure 2(b)).²⁷

Naive Bayes

The NB algorithm did not require extensive parameter optimization due to its simple structure. The multinomial NB variant was applied with Laplace smoothing (alpha = 1).

On the independent test set (n = 116), the overall accuracy of the NB model was 71%. The classification report indicated precision, recall, and F1-scores of 0.72, 0.71, and 0.71 for physiotherapists; 0.70, 0.69, and 0.69 for nurses; and 0.74, 0.73, and 0.73 for dietitians, respectively (Table 5). These findings suggest that NB achieved moderate but stable classification performance across occupational groups, serving as a robust baseline with minimal risk of overfitting (Figure 2(c)).

Discussion

This study evaluated whether the PACPS can support automatic prediction of health professionals’ occupations (physiotherapist, nurse, and dietitian) using five machine learning algorithms. The findings indicate that PACPS items contain distinctive response patterns that allow moderate occupation prediction.

Among the models, KNN achieved the highest test accuracy (82%), followed by SVM (76%). Both showed balanced performance across classes, suggesting good generalizability in a small-to-moderate sample.²⁸ The strong KNN result is consistent with scenarios where local structure in ordinal responses carries discriminative signal.²⁹

RF (75%) performed competitively and enhanced interpretability via feature importance, highlighting PACPS12 as the most influential item for differentiating professions. DT (67%) yielded the lowest performance, reflecting its known tendency to overfit in limited datasets despite intuitive rules. NB (71%) provided a robust baseline with simple assumptions and low variance.

These relative performances align with prior reports showing that model rankings are data- and context-dependent, with tree ensembles, instance-based learners, and kernel methods often trading places depending on sample size, class balance, and feature structure.³⁰ Our results reinforce that PACPS-based prediction is feasible, while emphasizing that dataset characteristics and careful model selection are crucial.

We implemented multiple safeguards against overfitting (inner cross-validation for tuning, stratified splits, training-only augmentation, and depth/leaf constraints for trees). Nevertheless, residual overfitting was still observable in tree-based models via train–test performance gaps, underscoring the need for additional regularization and larger samples. The confusion matrices (Table 1) further demonstrated that RF and DT tended to misclassify across groups, consistent with their overfitting tendencies.

Overfitting assessment and mitigation: Tree-based models showed clear signs of overfitting, as suggested by their train–test performance gaps and the misclassification patterns in Table 1. We implemented multiple safeguards: (i) training-only augmentation after the initial train–test split, (ii) inner CV-based hyperparameter regularization (e.g. limiting max_depth and increasing min_samples_leaf in DT; tuning max_depth, min_samples_*, and class_weight = balanced in RF), (iii) stratified folds with fixed seeds, and (iv) monitoring train–test deltas in accuracy and macro-F1. These steps reduced but did not eliminate overfitting in DT/RF, which is consistent with small, imbalanced tabular datasets. Future work will evaluate stronger regularization and robustness strategies (e.g. postpruning and cost-sensitive training for DT, shallower RF ensembles and calibrated voting, grouped/blocked CV where applicable, and external validation) to improve generalizability.

Limitations: First, the self-administered survey design may introduce response bias. Second, the sample was modest (n = 242) and single-context, limiting generalizability; although a priori calculations suggested n≈385, only 242 unique respondents were recruited, which likely widened uncertainty around performance estimates.³¹ Third, despite our preventive steps, some overfitting remained—especially for tree-based models. Finally, as a cross-sectional analysis, temporal stability could not be assessed.

Implications and future work: Clinically, PACPS-based models can help characterize how counseling responsibilities are distributed across professional groups and may inform targeted training or workforce planning.³² Future studies should pursue larger, multicenter cohorts, consider stronger regularization (e.g. calibrated ensembles, monotonic constraints, or cost-sensitive learning), and include external validation and prospective designs to test real-world utility.³³ Integrating interpretable ML outputs (e.g. feature importances and exemplar-based explanations) with clinical expertise could further support multidisciplinary care pathways.

Conclusion

The unique aspect of the study is that it demonstrates the feasibility of predicting professional outcomes by combining a clinically based measurement tool, such as the physical activity counseling scale, with artificial intelligence. The results revealed that this scale reflects the differences in counseling approaches among health professionals. In this respect, the study has the potential to pioneer future occupational classification research with larger sample groups.

Relevance for clinical practice

It was found that the professions of health professionals can be determined with PACPS.

It was determined that it is a scale that can be used to estimate the profession of health professionals.

In clinical practice, completed scales can indicate both the extent to which respondents provide physical activity counseling and what their profession may be.

Supplemental Material

sj-docx-1-dhj-10.1177_20552076251390279 - Supplemental material for Artificial intelligence supported professional prediction of Physical Activity Counseling Practices Scale in health professionals with a machine learning algorithm

Supplemental material, sj-docx-1-dhj-10.1177_20552076251390279 for Artificial intelligence supported professional prediction of Physical Activity Counseling Practices Scale in health professionals with a machine learning algorithm by Musa Çankaya, Ahmet Arda Ersöz, Şenay Burçin Alkan and Fatma Erdeo in DIGITAL HEALTH

Supplemental Material

sj-docx-2-dhj-10.1177_20552076251390279 - Supplemental material for Artificial intelligence supported professional prediction of Physical Activity Counseling Practices Scale in health professionals with a machine learning algorithm

Supplemental material, sj-docx-2-dhj-10.1177_20552076251390279 for Artificial intelligence supported professional prediction of Physical Activity Counseling Practices Scale in health professionals with a machine learning algorithm by Musa Çankaya, Ahmet Arda Ersöz, Şenay Burçin Alkan and Fatma Erdeo in DIGITAL HEALTH

Footnotes

Acknowledgements

The authors acknowledge and thank the participants for their contribution to the study. This article was presented as an oral presentation under the title “Artificial intelligence supported professional prediction of Physical Activity Counseling Practices scale in health professionals with Machine Learning algorithm” at the 1st National Konya Physiotherapy Congress, Konya, Turkey, on 23–24 May 2025.

ORCID iDs

Musa Çankaya

Şenay Burçin Alkan

Author contributions

All authors contributed equally to all aspects of this work.

Ethics approval

This research is a descriptive study. Ethical permission was obtained from Necmettin Erbakan University Health Sciences Scientific Research Ethics Committee (Decision No: 2025/915, 08.01.2025). The study was conducted in accordance with the Declaration of Helsinki.

Informed consent

Informed e-consent was obtained from all participants via Google Forms before participation in the study. Verbal and written informed consent was obtained from the participants.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

For all the data described in this article, we have provided references to the original dataset sources.

Supplemental material

Supplemental material for this article is available online.
