Abstract
Background:
Training primary care doctors (PCDs) to manage common psychiatric conditions is seen as a key strategy to reduce the treatment gap, but its effects on their diagnostic and treatment skills remain largely unstudied.
Methods:
A hybrid cluster randomized controlled trial compared two PCD training methods: training as usual (TAU; two days of in-person lectures, control group [CG]) versus TAU plus online mental health training (OMHT; 18 hours of interactive modules, study group [SG]). Primary outcomes (diagnostic, treatment, and combined concordance) between PCDs and psychiatrists were evaluated using Cohen’s kappa (κ), with group comparisons made via paired t-test.
Results:
For identifying anxiety disorders, SG (κ = 0.41) performed better than CG (κ = 0.06; p = .02). Likewise, for somatization disorders, SG had fair concordance (κ = 0.35), while CG had poor concordance (κ = 0.05; p < .01). For depressive disorders, SG had poor concordance (κ = 0.12), while CG showed moderate concordance (κ = 0.58; p < .01). For antidepressant prescription, SG (κ = 0.35) performed better than CG (κ = 0.15; p = .03). Combined concordance for common mental disorders (CMDs) as a domain showed that SG (κ = 0.35) fared better than CG (κ = 0.15; p = .03). Conversely, for severe mental disorders (SMDs), CG (κ = 0.83) performed better than SG (κ = 0.35; p = .03).
Conclusions:
Add-on OMHT enhances PCDs’ diagnostic and treatment skills for select conditions, particularly anxiety and somatization, highlighting its potential as a scalable model.
Trial Registration:
The trial was registered with the Clinical Trial Registry of India (Registration No. CTRI/2024/02/062906).
Keywords
Key Messages:
Add-on online mental health training (OMHT) improves the skills of primary care doctors (PCDs) to identify and manage common mental disorders (CMDs) in primary care settings. Real-time patient assessments of diagnostic and treatment agreement between psychiatrists and PCDs provide a strong reflection of how training can translate into the PCDs’ clinical practice.
Strengthening primary care is a vital strategy for mitigating the treatment gap for mental health disorders. This gap, defined as the discrepancy between the number of individuals with mental health disorders and those receiving appropriate care, is particularly pronounced in low- and middle-income countries (LMICs) where resources are scarce.1,2 One workable way to close the gap is to incorporate mental health services into primary health care (PHC) systems, which enable patients to get treatment in a more accessible and familiar setting. 3
Moreover, evidence suggests that many patients prefer to obtain mental health services in primary care settings instead of being referred to a specialist. Studies have shown that a significant portion of patients, including older adults, often seek mental health treatment from primary care providers due to various barriers such as stigma, lack of time, and logistical challenges associated with accessing specialty services. 4
In the realm of healthcare capacity building, particularly for primary care professionals, the need for effective training programs is paramount. Primary care doctors (PCDs) often face challenges in identifying and managing psychiatric conditions due to limited exposure to psychiatric training and resources. To address this gap, online training programs have emerged as a feasible solution, offering accessible, flexible learning opportunities for busy healthcare providers. 5 However, evaluating the effectiveness of such programs requires robust methodologies to assess their impact on clinical practice.
The effectiveness of mental health training programs has been previously analyzed in various studies in India and abroad. Most of these studies used parameters such as knowledge, attitudes, and practices (KAP) questionnaires,6,7 case vignettes, 8 and surveys 9 assessing knowledge acquisition and self-reported confidence. While these methods offer valuable insights into the trainees’ understanding, they have limitations in capturing the actual translation of knowledge into clinical practice. Randomized controlled trials (RCTs) are widely regarded as the most rigorous method for evaluating the effectiveness of interventions in healthcare research. 10 However, the use of RCTs in this domain has been relatively scarce. Most existing RCTs have concentrated on knowledge transfer, evaluating the impact of training programs on theoretical understanding, and fall short in measuring the practical application of the acquired knowledge in real-world settings.6–12
This study addresses this gap by incorporating a more robust method of evaluating effectiveness. Rather than relying solely on self-reported data, it uses objective clinical observations to assess the trainees’ ability to apply their learning in clinical scenarios. Furthermore, it measures the concordance between the trainees’ diagnoses and treatment choices and those of psychiatrists to quantify the level of agreement.
Methods
This study focuses on the primary outcomes of an RCT, done as part of the impact and outcome evaluation of a “Pan India Digitally Driven Capacity Building Program to strengthen Primary Mental Health Care” in India. The study was conducted in the Tumkur district of Karnataka state, utilizing a hybrid cluster randomized design. The study was approved by the Institutional Ethics Committee (Approval No. NIMHANS/43rd IEC (BEH.SC.DIV) 2023, dated 8th December 2023) and registered with the Clinical Trial Registry of India (Registration No. CTRI/2024/02/062906). A detailed account of the methodology is discussed in the article titled “An Effectiveness–Implementation Hybrid Cluster Randomized Controlled Trial to Evaluate Add-On Online Mental Health Training (OMHT) for PCDs in Influencing Their Management of Commonly Prevalent Psychiatric Disorders: Description of the Methodology” in this supplement, and only the essential points are summarized below.
The RCT included PCDs from PHCs in all but one of the Taluks in Tumkur, using cluster randomization so that all doctors within a Taluk (cluster) were assigned to either the study group (SG) or the control group (CG). Eligible participants were full-time PHC employees providing direct patient care who gave informed consent; those with specialist psychiatry training were excluded. Written informed consent was taken from both PCDs and patients/caregivers.
Following randomization, an online training link, which delivered a structured mental health curriculum, was shared with all 91 PCDs of SG. Add-on OMHT was conducted over 18.5 hours (12 hours initially and an additional 6.5 hours of revision classes to improve the attendance of the PCDs). Topics included depression, anxiety, psychosis, somatization, alcohol use disorder, and tobacco addiction. Supportive handholding was done using collaborative video consultations (CVCs), designed to facilitate practical application in primary care settings. CVC support was continued for three months.
Both SG and CG received training as usual (TAU) as a standard practice. Typically, it involves two days of in-person training, predominantly didactic lectures, conducted often in district headquarters. These annual sessions cover all PCDs in the district. CG PCDs received OMHT and supportive handholding through CVC after data collection was completed.
Following the training, a team of research psychiatrists visited the selected PHCs to assess outcomes. Among the 91 SG PCDs who received the link, 13 doctors who attended a minimum of four sessions were chosen for analysis. Similarly, 13 out of the 49 CG PCDs were randomly selected for analysis.
The PCDs screened patients for psychiatric conditions, providing initial diagnoses and treatment plans. These same patients were subsequently evaluated by an assessor (a psychiatrist) blinded to the training status, who also provided diagnostic and treatment recommendations (Figure 1). The final study sample included 577 patients across both arms. The baseline assessments were done in May–June 2024. Concordance between the PCD and psychiatrist was measured for both diagnosis and treatment using Cohen’s kappa (κ). The resulting κ values were then compared between SG and CG to evaluate the effectiveness of the training in improving diagnostic and treatment accuracy. To provide a more stringent and clinically meaningful estimate of agreement, a “combined concordance” metric was also calculated, which reflected scenarios where both the diagnosis and the treatment decisions (put together) made by the PCD matched those of the psychiatrist.
Figure 1. CONSORT Flow Diagram.
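The agreement calculation described above can be illustrated with a minimal Python sketch. It is not the study's analysis code (the authors used SPSS); the function name `cohen_kappa`, the example ratings, and the choice to treat each patient's (diagnosis, treatment) pair as one label for "combined concordance" are all assumptions made for illustration.

```python
from collections import Counter
from math import nan


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must rate the same non-empty set of patients.")
    n = len(rater_a)
    # Observed agreement: share of patients given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return nan  # kappa is undefined when both raters use one identical label
    return (observed - expected) / (1 - expected)


# Hypothetical ratings for 10 patients (1 = disorder present, 0 = absent).
pcd_dx = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
psy_dx = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
print(round(cohen_kappa(pcd_dx, psy_dx), 2))  # 0.58

# Assumed operationalization of combined concordance: agreement is counted
# only when diagnosis AND treatment both match, so each pair is one label.
pcd_combined = list(zip([1, 1, 0, 0], [1, 0, 0, 0]))  # (diagnosis, treatment)
psy_combined = list(zip([1, 1, 0, 0], [1, 1, 0, 0]))
print(round(cohen_kappa(pcd_combined, psy_combined), 2))  # 0.6
```

Treating the pair as a single label makes the combined metric stricter than either component, because a patient counts as agreed only when both decisions match.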
Diagnostic concordance was measured across three domains: Common mental disorders (CMDs), severe mental disorders (SMDs), and substance use disorders (SUDs). Domains were chosen based on the Clinical Schedule for Primary Care Psychiatry (CSP): Version 2.4. 13 The CMD domain encompasses diagnoses of depressive disorder, anxiety disorder, somatization disorder, and mixed disorder. SMDs include all psychotic disorders. Alcohol use disorders and tobacco addiction are grouped as SUDs. CSP was chosen because it helps PCDs establish psychiatric caseness and achieve a broader diagnosis, which is considered pragmatic in primary care settings. It is validated and has relatively high sensitivity with reasonably high specificity. 14
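The domain grouping described above can be sketched as a simple lookup that collapses disorder-level diagnoses into domains before domain-wise agreement is computed. The label strings, the `DOMAIN_OF` table, and the "other" fallback are hypothetical; the CSP itself does not define this code.

```python
# Hypothetical mapping from disorder-level labels to the three domains
# named in the text; the exact label strings are assumptions.
DOMAIN_OF = {
    "depressive disorder": "CMD",
    "anxiety disorder": "CMD",
    "somatization disorder": "CMD",
    "mixed disorder": "CMD",
    "psychotic disorder": "SMD",
    "alcohol use disorder": "SUD",
    "tobacco addiction": "SUD",
}


def to_domain(diagnosis: str) -> str:
    """Return the domain for a diagnosis label, or 'other' if unmapped."""
    return DOMAIN_OF.get(diagnosis.strip().lower(), "other")


diagnoses = ["Anxiety disorder", "Tobacco addiction", "Psychotic disorder"]
print([to_domain(d) for d in diagnoses])  # ['CMD', 'SUD', 'SMD']
```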
Statistical Analysis
Statistical analyses were done using Statistical Package for Social Sciences (SPSS) licensed version 29. The agreement between PCDs and psychiatrists was measured using κ. The κ value was interpreted as follows: < 0.20, poor; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, good; 0.81–1.00, very good (Altman, 1991). 15 The κ values of both groups were compared using a paired t-test (Figure 2).
Figure 2. Forest plots comparing diagnostic, treatment, and combined concordance between the Study Group (SG) and Control Group (CG). (a) Comparison of the Diagnostic Concordance of the Study Group with the Control Group. (b) Comparison of the Treatment Concordance of the Study Group with the Control Group. (c) Comparison of the Combined Concordance of the Study Group with the Control Group. P values for between-group comparisons are presented here.
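The Altman (1991) bands quoted in the Statistical Analysis section can be written as a small helper, shown here as a sketch. The function name `altman_category` is hypothetical, and assigning a value of exactly 0.20 to "poor" is an assumption, since the quoted bands leave that boundary open.

```python
def altman_category(kappa: float) -> str:
    """Map a kappa value to the Altman (1991) agreement band."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa <= 0.20:  # assumption: the boundary value 0.20 counts as poor
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Values taken from the Results of this article.
for k in (0.06, 0.35, 0.58, 0.76, 0.83):
    print(k, altman_category(k))
```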
Results
Demographic details of PCDs are presented in Table 1. SG and CG were similar across baseline variables. There were no significant differences in age (38.77 vs. 41.85 years; p = .41), sex (more females in SG; p = .11), years since obtaining medical degree (MBBS; 15.23 vs. 18.08; p = .42), or work experience (p = .52). Both groups had comparable exposure to psychiatry during MBBS and internship (p = .59), with similar duration of training. Recent psychiatric training and TAU completion rates were also similar. The SG attended a mean of 9.38 (±3.20) training sessions, which was not applicable for the CG.
Table 1. Socio-demographic Profile of PCDs in the Study and Control Groups
*Includes any psychiatric training in the past one year, such as DMHP training or online modules related to psychiatry.
#DMHP: District Mental Health Program
Figure 1 gives the flowchart for patient recruitment. Table 2 contains demographic and clinical details of patients with any psychiatric illness. Socio-demographic and clinical characteristics were assessed for all psychiatric patients included in the study (N = 235; SG = 118; CG = 117). Both groups were comparable across key variables, including mean age (55.19 vs. 51.86 years, p = .09) and gender distribution (females: 51.7% vs. 48.7%, p = .65). The majority in both groups were from rural areas and below the poverty line. Past treatment for mental health issues was reported by 7.6% in the SG and 5.1% in the CG (p = .43). Family history of psychiatric illness was present in roughly one-fourth of both groups, most commonly related to substance use. Medical comorbidities were also similar. Median duration of disease and untreated illness for CMD, SMD, and SUD showed no significant differences.
Table 2. Sociodemographic and Clinical Profile of Patients
#CMD – common mental disorders @SMD – severe mental disorders
$SUD – substance use disorders *HTN – Hypertension
^DM – Diabetes Mellitus ^^CVA – Cerebrovascular Accident
Table 3 summarizes domain-wise diagnostic concordance. For any psychiatric diagnosis, agreement was fair in SG (κ = 0.40, CI: 0.29–0.51) and moderate in CG (κ = 0.47, CI: 0.37–0.58). For CMDs, both showed fair agreement (SG: 0.31, CI: 0.19–0.43; CG: 0.33, CI: 0.19–0.46). In SMDs, CG had a higher κ (SG: 0.52, CI: 0.21–0.83; CG: 0.76, CI: 0.50–1), indicating moderate and good agreement, respectively. For SUDs, CG again had a higher κ than SG (CG: 0.61, CI: 0.49–0.73; SG: 0.48, CI: 0.33–0.63), with good and moderate agreement, respectively.
Table 3. Domain-wise Diagnostic Concordance
*PCD – Primary Care Doctor @SMD – severe mental disorders
#CMD – common mental disorders $SUD – substance use disorders
Table 4 presents diagnostic concordance by disorder. For alcohol use disorders, both SG and CG had moderate agreement, with κ values of 0.60 (CI: 0.40–0.81) and 0.59 (CI: 0.32–0.85), respectively. Similar results were observed for tobacco addiction: 0.46 (CI: 0.30–0.63) for SG and 0.59 (CI: 0.46–0.71) for CG. In depressive disorders, SG had a κ of 0.12 (CI: −0.05–0.31), while CG had a κ of 0.58 (CI: 0.38–0.79). For anxiety disorders, SG’s κ was 0.41 (CI: 0.15–0.67) and CG’s was 0.06 (CI: −0.07–0.19). Somatization disorder showed κ values of 0.35 (CI: 0.20–0.50) for SG and 0.05 (CI: −0.1–0.2) for CG. Mixed disorders had κ values of 0.27 (CI: 0.04–0.51) in SG and 0.07 (CI: −0.08–0.23) in CG. For psychosis, CG’s κ was 0.76 (CI: 0.50–1.0) and SG’s was 0.52 (CI: 0.21–0.83). Other psychiatric diagnoses showed κ values of 0.18 (CI: −0.06–0.43) for SG and 0.19 (CI: −0.06–0.44) for CG.
Table 4. Disorder-wise Diagnostic Concordance
Between-group statistical comparisons (p values) are presented in
Table 5 presents an overview of treatment concordance. Both groups demonstrated fair agreement regarding the use of ‘any’ psychotropics (SG: κ = 0.35 [CI: 0.24, 0.46]; CG: κ = 0.34 [CI: 0.24, 0.43]). For antipsychotic prescriptions, CG showed very good agreement (κ = 0.83 [CI: 0.6, 1]), surpassing SG, which exhibited moderate agreement (κ = 0.49 [CI: 0.14, 0.84]). Conversely, with antidepressant prescriptions, SG achieved fair agreement (κ = 0.35 [CI: 0.22, 0.48]), performing better than CG, which showed poor agreement (κ = 0.15 [CI: 0.02, 0.27]). Both groups had comparable levels of agreement for benzodiazepine prescriptions overall; however, in cases involving SUDs, the SG demonstrated moderate agreement (κ = 0.43 [CI: 0.14, 0.72]), whereas the CG showed fair agreement (κ = 0.28 [CI: −0.15, 0.72]).
Table 5. Comparison of Treatment Concordance
Between-group statistical comparisons (p values) are presented in
Table 6 gives details on the combined concordance. For CMDs, SG (κ = 0.35; CI: 0.22–0.48; fair agreement) fared better than CG (κ = 0.15; CI: 0.03–0.27; poor agreement). For SMDs, CG (κ = 0.83; CI: 0.60–1.00; very good agreement) performed better than SG (κ = 0.35; CI: −0.02–0.72; fair agreement). For SUDs, both groups showed fair agreement (SG: κ = 0.34 [CI: 0.19–0.49]; CG: κ = 0.26 [CI: 0.13–0.39]).
Table 6. Comparison of Combined Concordance (Diagnosis and Treatment Put Together)
*PCD: Primary Care Doctor @SMDs: severe mental disorders
#CMDs: common mental disorders $SUDs: substance use disorders
Figure 2 shows a comparison of κ values for diagnostic, treatment, and combined concordance. For diagnosis, CG performed significantly better for depressive disorders (p < .01), whereas SG performed significantly better for anxiety (p = .02) and somatization disorders (p < .01). For treatment, SG showed significantly better concordance for antidepressant prescriptions (p = .03). For combined concordance, SG performed significantly better for CMDs (p = .03), while CG performed significantly better for SMDs (p = .03).
Discussion
The study found that add-on OMHT improved PCDs’ ability to identify anxiety and somatization disorders (Figure 2a) and improved concordance for antidepressant prescriptions in the SG. However, OMHT showed no advantage in diagnosing or treating psychoses or SUDs. These results emphasize the potential of sustained support and structured training interventions while highlighting areas for further refinement.
Diagnostic concordance was moderate for most psychiatric conditions in both groups, reflecting existing challenges in diagnosing mental illnesses at the primary care level. For anxiety and somatization disorders, SG performed significantly better, but for depressive disorders, CG scored higher. This divergence likely reflects the inherent diagnostic complexity of CMDs in primary care, where patients frequently present with overlapping symptom clusters. Depressive disorders in such settings are often identified through prominent somatic or mixed presentations. PCDs in SG, having undergone OMHT, were trained to consider differential diagnoses systematically. This may have led them to recognize and classify a portion of these presentations as anxiety or somatization, thereby improving concordance in those domains but lowering it for depression. Importantly, SG demonstrated significantly better concordance in antidepressant prescriptions, suggesting that treatment decisions for depression were more closely aligned with those of psychiatrists. Moreover, when diagnosis and treatment were considered together (combined concordance), SG outperformed CG for CMDs as a domain.
Measures such as continued handholding, as in the case of CVCs and case-based learning, can improve recognition and management of CMDs, especially depression and anxiety. CVCs are a method of tele-mentoring, wherein a PCD can instantly connect to a telepsychiatrist for expert guidance and advice to resolve diagnostic challenges or refine treatment plans. They offer real-time second opinions from board-certified psychiatrists. Of the several studies done on CVCs,16–19 one 16 measured the diagnostic concordance between the PCD and the telepsychiatrist. An 83% concordance was noted between them (κ = 0.78), indicating high agreement and the model’s effectiveness. Therefore, incorporating CVCs into training programs may improve the diagnostic accuracy for CMDs.
The article titled “Redefining Access to Mental Health Care through Sustained Tele-mentoring: A Report of the Instant CVCs of a Telepsychiatrist with PCDs,” in this supplement, also discusses the domain-wise concordance for PCDs trained under this project.
For psychoses, though CG achieved good agreement (κ = 0.76) compared to moderate agreement in SG (κ = 0.52), the difference was not statistically significant. The κ estimates for SMDs should be interpreted with caution because the absolute number of SMD cases in our sample was small (Table 6). κ is highly sensitive to low prevalence, so the classification of one or two cases—for example, patients who are on antipsychotics and clinically maintaining well or those who present with early or atypical psychosis—can produce large swings in κ and create apparent differences between groups.
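The sensitivity of κ to low prevalence noted above can be shown with a short worked sketch. The 2×2 counts below are hypothetical and are not the study's data; they only illustrate how reclassifying two patients in a sample with few true cases shifts κ while raw agreement barely changes.

```python
def kappa_2x2(both_yes, pcd_only, psy_only, both_no):
    """Cohen's kappa from the four cells of a 2x2 agreement table."""
    n = both_yes + pcd_only + psy_only + both_no
    observed = (both_yes + both_no) / n
    # Chance agreement from the row and column totals.
    pcd_yes, psy_yes = both_yes + pcd_only, both_yes + psy_only
    expected = (pcd_yes * psy_yes + (n - pcd_yes) * (n - psy_yes)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical sample: 120 patients, of whom the psychiatrist diagnoses 5 with an SMD.
# Scenario A: the PCD identifies 4 of the 5 cases.
print(round(kappa_2x2(4, 0, 1, 115), 2))  # 0.88, raw agreement 119/120
# Scenario B: the PCD identifies 2 of the 5 cases (two patients reclassified).
print(round(kappa_2x2(2, 0, 3, 115), 2))  # 0.56, raw agreement 117/120
```

In this illustration, raw agreement falls by under two percentage points, yet κ moves from the "very good" band to the "moderate" band.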
In the domain of SUDs, although diagnostic concordance for alcohol and tobacco use was moderate-to-good in both groups, treatment and combined concordance for SUDs were only fair. One practical explanation is availability and implementation constraints at the PHC level. For example, nicotine replacement therapy (NRT) is not stocked in most PHCs, so PCDs who recognize tobacco dependence may still not prescribe NRT, lowering measured treatment concordance. Table 5 shows low κ for NRT and modest κ for counseling and benzodiazepine use in SUD contexts, consistent with these implementation gaps. As a result, combined concordance for SUDs (which requires agreement on both diagnosis and treatment) is modest and does not differ significantly between groups (Table 6). Another recent study evaluating the impact of training on PCDs for SUD management revealed that although training led to improved knowledge and attitudes, factors such as stigma, time constraints, and competing clinical priorities limited the consistent use of screening tools and the regular diagnosis of SUDs. These findings suggest that while training enhances skills and competence, systemic and logistical factors may limit its practical impact.19,20
Comparison with Other Studies Assessing the Effectiveness of a Training Program
A similar study 21 compared hybrid (online and in-person) training with fully digital training for PCDs, finding that blended training led to better identification of patients with mental illness over an eight-month follow-up period. The study analyzed self-reported case numbers and demonstrated a significant increase in patients identified and treated, particularly for SMDs and SUDs. Another study 22 which compared blended training with fully digital training reported that the PCDs were able to identify more CMDs.
While hybrid training may be better in enhancing skill transfer, it often faces challenges in terms of implementation and scalability, particularly in resource-constrained settings. OMHT, however, has shown its utility in empowering PCDs and should have its own space in the larger scheme of things. Going forward, onsite orientation and sustained online mentoring could be the answer to training the public healthcare workforce.
Another cluster RCT 23 aimed at improving knowledge of CMDs following a single-session training program consisting of didactic teaching and case-based discussion, showed significant improvements. However, our study’s focus on diagnostic and treatment concordance offers a more comprehensive assessment of training effectiveness by including practical applications of the knowledge gained.
Two more RCTs7,12 demonstrated that physicians in the intervention groups experienced a significantly larger increase in knowledge compared to those in the CGs. One of them 7 used the Diagnostic Knowledge Inventory, a reliable scale consisting of case vignettes that are each 18 paragraphs long, followed by a list of diagnostic response options and potential treatment suggestions. While case vignette-based assessments offer a robust way to measure knowledge gain post-training, our study employed real-time patient observation, which provides a significant advantage in evaluating practical skills and applying theoretical knowledge in clinical settings.
Another RCT 12 measured the improvement in knowledge of depression and behaviors toward depressed patients in primary care physicians following a “Depression Education Program.” It showed that physicians in the intervention group were more likely than those in the CG to ask about at least five criteria for major depression, discuss the possibility of depression, schedule a follow-up visit within two weeks, and achieve higher scores on a patient satisfaction scale. Outcomes included a knowledge test 2–6 weeks post-intervention and office visits from two unanticipated individuals posing as standardized major depressive disorder patients. These patients assessed physicians using checklists and scales. Logistic and linear regression accounted for variables such as sex, specialty, and any suspicion of the patients being standardized. Although this study used a simulated clinical setting for assessing the translation of knowledge in practice through standardized patients, our study focused on real-time patient observations, offering a more naturalistic approach.
Strengths
Methodological
Cohen’s kappa (κ) provides a robust measure of agreement, surpassing the limitations of self-reported data. Furthermore, the hybrid cluster randomized design enhances the generalizability of the findings, making them applicable to other resource-limited settings.
Relevance to LMIC Settings
The findings highlight a feasible approach for strengthening primary-level mental health services in resource-constrained settings. Training PCDs can enhance early identification and basic management of common psychiatric disorders where specialist access is limited. However, such task-sharing works best when embedded within a collaborative framework that includes sustained mentoring, periodic retraining, and strong referral linkages, ensuring safe and effective care delivery.
Domain-specific Insights
The study provides granular data on diagnostic and treatment concordance across multiple psychiatric domains, offering actionable insights for refining future training programs.
Limitations
Focus on Short-term Outcomes
The study measured concordance shortly after training, which might not reflect long-term knowledge retention or skill application. A follow-up evaluation to assess sustained impact would add valuable insights.
Possibility of Selection Bias
Although the inclusion criterion of attending ≥4 sessions (in SG) and random selection (in CG) was intended to ensure adequate exposure and methodological balance, this approach may introduce selection bias, as attenders may differ from non-attenders.
Lack of Cost-effectiveness Analysis
As the broader trial is intended to inform scalable models of capacity building in primary care, an accompanying economic evaluation would have provided more policy-relevant insights into the feasibility and sustainability of implementing such training at scale.
Conclusions
Add-on OMHT may significantly enhance the diagnostic and treatment competencies of PCDs, particularly for anxiety and somatization disorders. Further research is needed to explore the long-term impact of such interventions and refine strategies for conditions with lower concordance, such as depressive disorders.
Supplemental Material
Supplemental material for this article is available online.
Footnotes
Acknowledgements
Same as the “Introduction” article of this issue (Indian J Psychol Med. 2026;48(1 suppl)).
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Author Channaveerachari Naveen Kumar is the Principal Investigator of this project and supplemental issue. The author did not take part in the peer review or decision-making process for this submission and has no further conflicts to declare.
Declaration Regarding the Use of Generative AI
The authors utilized ChatGPT for only occasional writing assistance. After employing this tool, the authors carefully reviewed and edited the content as necessary and take full responsibility for the final publication.
Ethical Approval
Name: Institutional Ethics Committee (IEC) of NIMHANS.
Approval number with date: NIMHANS/43rd IEC (BEH.SC.DIV) 2023, dated 8 December 2023, NIMHANS/EC(BH.SC.DIV.) MEETING/2024 dated 25 October 2024 and NIMHANS/EC(BEH.SC.DIV.) MEETING/2025 dated 1 July 2025. Appropriate permissions from the concerned authorities were taken.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This article, “Diagnostic and Treatment Concordance Among PCDs Delivering Primary Mental Health Care: Results from an Effectiveness–Implementation Hybrid Cluster Randomized Controlled Trial to Compare Two Methods of Training,” under the research project “A Pan India Digitally Driven Capacity Building Program to Strengthen Primary Mental Health Care,” was funded by the CSR initiative of a multinational company.
References
