Abstract
Mixed methods research provides a valuable opportunity to deepen our understanding of phenomena. However, practical guidance on using the method to develop methodological models in the clinical practice guidelines (CPGs) field is limited. This article illustrates the use of an exploratory sequential mixed methods design to develop a methodological model that standardizes the formation of good practice recommendation (GPR) within CPGs when evidence is weak or absent. It demonstrates how qualitative and quantitative data can be systematically integrated throughout model construction and validation. The GPR methodological model includes three main themes: connotation, procedure and methods, and reporting guideline. It assists CPG developers in GPR formulation, aligns researchers’ methodological understanding of GPR, and informs future GPR methodological development.
Keywords
Introduction
Mixed methods research offers the unique ability to integrate quantitative and qualitative approaches, providing a more comprehensive understanding of complex phenomena than either method alone can achieve (Creswell & Plano Clark, 2017). Methodological models are essential in clinical practice guidelines (CPGs) development, providing a structured framework to ensure the scientific validity of recommendations. Traditionally, methodological models in the CPGs field, such as Grading of Recommendations Assessment, Development and Evaluation (Guyatt et al., 2008) and Appraisal of Guidelines for Research and Evaluation (Brouwers et al., 2010), have been created through literature reviews and expert consultation. Practical guidance on employing mixed methods research for developing methodological models in the CPGs field remains scarce.
There are three basic designs in mixed methods research: the convergent design, the explanatory sequential design, and the exploratory sequential design (Creswell & Plano Clark, 2017). In the exploratory sequential design, the qualitative phase precedes the quantitative phase (Curry & Nunez-Smith, 2015). It is useful when researchers aim to integrate qualitative and quantitative data, combining theoretical deduction with empirical analysis to thoroughly explore and develop new models (Brown et al., 2020).
This paper presents how to develop a methodological model for formulating good practice recommendation (GPR) in CPGs using an exploratory sequential mixed methods design. It illustrates how mixed methods can be applied to create such a methodological model within the field of CPGs.
Study Background
As a promising tool, CPGs are widely used in clinical practice. At present, the formulation of evidence-based recommendations within CPGs hinges predominantly on evidence obtained from systematic reviews. However, the needs raised by clinicians are not always reflected in the research conducted (Verwoerd et al., 2021). Guidance is still needed for important topics or questions (such as public health emergencies), even if the evidence is weak or nonexistent, yet a formal rating of certainty is inappropriate for such guidance (Guyatt et al., 2015). Evidence, clinical practice experience, and patient willingness are the three main elements of evidence-based medicine (Sackett et al., 1996), a formulation that underscores the importance of clinical practical experience (Djulbegovic & Guyatt, 2017; Sackett et al., 1996). In the process of developing CPGs, expert clinical experience plays an important role in forming recommendations based on existing evidence. When the evidence is weak or nonexistent, expert clinical experience is an important supplement and the main basis for forming guidance (Schünemann et al., 2019).
In this study, when evidence is weak or nonexistent, the recommendation provided is collectively referred to as GPR. The methods used to form GPR have garnered considerable attention. Many different methods have been proposed by researchers and can be divided into two main types: those based on evidence grading system and those not based on evidence grading system (Additional File 1). The earliest method can be traced back to 1979 (Spitzer, 1979). This method included expert opinion as the lowest level of evidence in the evidence hierarchy. It was based on evidence grading system. Other similar methods emerged one after another (Bigos et al., 1996; Eccles et al., 1996; Gross et al., 1994; Harbour & Miller, 2001; Oxford Centre for Evidence-Based Medicine, 2000). It was not until 2004 that the Grading of Recommendations Assessment, Development, and Evaluation system was formally introduced (Atkins et al., 2004), and subsequently became widely used in CPGs’ development. Subsequently, researchers began to propose related methods that were not based on an evidence grading system (Institute of Medicine [IOM], 2011; Oxman et al., 2006), such as expert opinions and good practice point. Meanwhile, methods that were based on an evidence grading system continued to be proposed (IOM, 2011). The updated definition of CPG emphasizes the importance of systematic evidence and scientific recommendations (IOM, 2011). Regarding the lack of evidence in CPGs development, many different methods that were not based on evidence grading system were proposed, for example, good practice statement (Guyatt et al., 2015, 2016), consensus-based recommendation (Scottish Intercollegiate Guidelines Network [SIGN], 2019), consensus recommendation (American Academy of Orthopaedic Surgeons [AAOS], 2022; National Institute for Health and Care Excellence [NICE], 2020), and so on. 
At the same time, methods based on an evidence grading system also continued to be proposed (American Psychiatric Association [APA], 2020; European Society of Cardiology, 2022; European Society for Medical Oncology [ESMO], 2022; U.S. Preventive Services Task Force, 2021; Halperin et al., 2016; Mustafa et al., 2021). The inclusion of expert opinion in the hierarchy of evidence is the main difference between these two types of methods. Expert opinion is not a type of study design and should not be used as evidence (Oxman et al., 2006). Methods that are not based on an evidence grading system are therefore better suited for developing recommendations in CPGs when the evidence is weak or nonexistent. This study uses the methods that are not based on an evidence grading system (hereinafter collectively referred to as GPR methods).
GPR methods provide methodological guidance for the development of GPR and have a certain degree of applicability in the formulation of CPGs. However, these methods have some degree of limitations. Taking the commonly used good practice statement and good practice point as an example, the literature suggests they are applied when evidence is insufficient, but does not provide clear definitions (Guyatt et al., 2015; Guyatt et al., 2016; SIGN, 2019). Additionally, indirect evidence is considered in the formation of good practice statement, but not explicitly in good practice point (Guyatt et al., 2015; Guyatt et al., 2016; SIGN, 2019). At the application level, good practice point lacks specific guidance, while good practice statement offers conditions for its use, though it provides little detail on its concrete operational implementation (Guyatt et al., 2015; Guyatt et al., 2016; SIGN, 2019).
These limitations have partly contributed to confusion in the application of GPR methods, potentially affecting the consistency of GPRs. For example, different CPGs provide inconsistent GPRs as to whether bariatric surgery should be performed in Asian obese populations. One CPG (NICE, 2023) considers obese people of Asian origin eligible for bariatric surgery with a BMI of 27.5 kg/m² or higher, or between 22.5 and 37.4 kg/m² with significant health issues, whereas another (Mechanick et al., 2019) suggests adjusting BMI criteria for bariatric surgery in Asian populations but does not specify thresholds. For clinicians and health policy-makers, inconsistent GPRs on the same important question in similar CPGs could cause problems in decision-making (De Leo et al., 2023). Consequently, GPR may not play the guiding role in clinical practice that it should. By analyzing how these GPRs were developed, we found that the inconsistency mainly stemmed from the developers of the two CPGs relying on different sources of information when formulating their respective GPRs: one relied on expert experience (NICE, 2023), and the other on indirect evidence and a World Health Organization document (Mechanick et al., 2019). The lack of a harmonized and standardized methodological system for the formation of recommendations in CPGs could affect the consistency and reliability of the recommendations (IOM, 2011; Milojevic et al., 2024). A systematic approach to standardizing GPR formation would help to reduce such inconsistencies. Therefore, it is necessary to establish a comprehensive and widely applicable GPR methodological model (GPR-MM), providing methodological support for the scientific formation of GPR.
This study adopts an exploratory sequential mixed methods design to develop the GPR-MM. The specific objectives are as follows: (1) synthesize the GPR methods and the related information about GPR in CPGs using best fit framework synthesis to determine the structure, definitions and relationships of the model, and (2) validate the soundness of the model’s structure using confirmatory factor analysis.
Methods
Study Design
The design selected for this study was an exploratory sequential mixed methods design, involving a qualitative exploration stage (steps 1 to 3 in Figure 1), followed by a quantitative evaluation stage (steps 4 to 5 in Figure 1), and concluding with a qualitative and quantitative integration (step 6 in Figure 1). This design facilitated the integration of qualitative and quantitative stages. Through building, qualitative findings were transformed into the data extraction framework for the quantitative stage. As for the qualitative and quantitative integration, it was done by integrating the results of the two stages to optimize the model. We followed the Good Reporting of A Mixed Methods Study (GRAMMS) checklist for reporting mixed methods research (O’Cathain et al., 2008).
Figure 1. The exploratory sequential design used to develop good practice recommendation methodological model. Note. CPGs: Clinical practice guidelines; GPR: Good practice recommendation. The layout format was adapted from Moubarac et al. (2012).
Study Sample
The study sample comprised GPR methods and CPGs containing GPR. To identify the relevant study sample, systematic searching and screening were conducted. The specific retrieval strategy and eligibility criteria are provided in Additional File 2. The citation lists of the included sample were also retrieved and searched. One author screened the study sample, and another reviewed the screening results. The included CPGs were randomized into two groups (group 1 and group 2) using SPSS 17.0 (SPSS Inc., Chicago, IL, USA). All identified GPR methods and the CPGs of group 1 were used to construct the initial GPR-MM in the qualitative stage. All included CPGs were used for validation in the quantitative stage. The use of the study sample at different stages is shown in Figure 2.
Figure 2. Distribution of sample use at different stages of the study. Note. CPGs: Clinical practice guidelines; GPR: Good practice recommendation.
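The study performed the random allocation of CPGs in SPSS. Purely as an illustration, a comparable split can be sketched in Python. The function name `split_into_groups`, the placeholder identifiers, the fixed seed, and the group size passed in are assumptions for this example and are not details reported for the original procedure.

```python
import random


def split_into_groups(cpg_ids, group1_size, seed=0):
    """Randomly allocate CPG identifiers to two disjoint groups.

    The seed and the group size are illustrative assumptions; the study
    itself performed the allocation in SPSS.
    """
    if not 0 <= group1_size <= len(cpg_ids):
        raise ValueError("group1_size must be between 0 and len(cpg_ids)")
    rng = random.Random(seed)   # local generator keeps the split reproducible
    shuffled = list(cpg_ids)    # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled[:group1_size], shuffled[group1_size:]


# Hypothetical identifiers standing in for the included CPGs.
cpg_ids = [f"CPG-{i:03d}" for i in range(1, 211)]
group1, group2 = split_into_groups(cpg_ids, group1_size=106)
```

Using a local `random.Random` instance rather than the module-level generator means the allocation can be reproduced from the seed without affecting other random draws in the same program.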
Methods of Qualitative Stage
The qualitative stage (steps 1 to 3 in Figure 1) was exploratory and had two objectives: 1) to form the initial GPR-MM; and 2) to assess the plausibility, theme saturation, and robustness of the initial GPR-MM.
Qualitative Data Collection
The text of all identified GPR methods and the CPGs of group 1 was imported into NVivo software (Version 12, QSR International) as qualitative data for the creation of the initial GPR-MM.
Qualitative Data Analysis
The best fit framework synthesis is a pragmatic and flexible approach to synthesizing theories or methods with findings from practice (Carroll et al., 2011, 2013). It was employed to develop the themes, subthemes (hereafter referred to as (sub) themes) and categories of the initial GPR-MM by generating the a priori framework and coding data from the included CPGs against the framework. First, the related concepts in the GPR methods were identified using thematic analysis (Braun & Clarke, 2006) and embedded in the coding frame to create the a priori framework (the coding frame was formed based on the logical framework analysis, as detailed in Additional File 3). Then, the CPGs of group 1 were coded using the a priori framework. New (sub) themes or categories were created through secondary thematic analysis of information that could not be coded using the framework.
Model Development and Test
The initial GPR-MM was developed by combining both a priori and newly identified (sub)themes and categories. The information underpinning the model was revisited, and the relationships between the (sub) themes and categories were established. The model was sent to the other authors for review. To test the plausibility, theme saturation, and robustness of the initial model, three approaches were implemented. First, the differences between the a priori framework and the initial model (Carroll et al., 2013) were explored. Second, a thematic saturation table (Constantinou et al., 2017) was used to explore the saturation. When no new (sub) themes or categories appeared, it was considered that saturation had been reached. Finally, sensitivity analysis (Carroll et al., 2013) was performed by testing whether the qualitative synthesis was affected by the omission of CPGs developed by individuals.
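The saturation check described above amounts to recording, for each CPG in coding order, which categories appear for the first time. The following is a minimal sketch of that bookkeeping, not the authors' procedure (which used a thematic saturation table built from NVivo coding). The function names and the toy data are hypothetical.

```python
def saturation_table(coded_cpgs, a_priori):
    """Return, for each CPG in coding order, the categories first seen in it.

    coded_cpgs: list of (cpg_id, set_of_categories) in coding order.
    a_priori:   categories already present in the a priori framework.
    """
    seen = set(a_priori)
    rows = []
    for cpg_id, categories in coded_cpgs:
        new = sorted(set(categories) - seen)  # categories not seen so far
        seen.update(new)
        rows.append((cpg_id, new))
    return rows


def last_new_category_position(rows):
    """1-based position of the last CPG that added a category (0 if none)."""
    last = 0
    for position, (_, new) in enumerate(rows, start=1):
        if new:
            last = position
    return last


# Toy example with hypothetical codes.
a_priori = {"evidence condition", "question condition"}
coded = [
    ("CPG-A", {"evidence condition"}),
    ("CPG-B", {"question condition", "remarks"}),
    ("CPG-C", {"remarks"}),
]
rows = saturation_table(coded, a_priori)
```

In this toy run only the second CPG contributes a new category, so every CPG coded after it adds nothing new, which is the pattern the study treats as saturation.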
Methods of Quantitative Stage
The quantitative stage (steps 4 to 5 in Figure 1) was confirmatory. Confirmatory factor analysis was used to assess the construct validity of the initial GPR-MM.
Quantitative Data Collection
The initial model was used as a data extraction framework to collect quantitative data. Before data collection, the CPGs of group 2 were coded against the initial model, and the coding results from the CPGs of groups 1 and 2 were combined. In NVivo software, one piece of coded content in a CPG is called a reference. The higher the number of references for a (sub)theme or category, the greater the emphasis it receives in CPGs. This emphasis also indicates the importance and attention that guideline developers attach to the (sub)theme or category. According to the structure of the data extraction framework, the number of references for each (sub)theme or category in each CPG was extracted from the combined results as the quantitative data for analysis.
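To make the extraction step concrete, the sketch below turns a list of coded references into a CPG-by-category table of counts. It assumes the coding has been exported as `(cpg_id, category)` pairs, one per reference; that export format, the function name, and the sample values are assumptions for illustration, since the study extracted the counts from NVivo.

```python
from collections import Counter


def reference_count_matrix(references, cpg_ids, categories):
    """Build a CPG-by-category table of reference counts.

    references: iterable of (cpg_id, category) pairs, one per coded reference.
    Returns one row per CPG, with one count per category in the given order.
    """
    counts = Counter(references)  # missing pairs count as 0
    return [[counts[(cpg, cat)] for cat in categories] for cpg in cpg_ids]


# Hypothetical coded references for two CPGs and three categories.
references = [
    ("CPG-001", "expert clinical experience and opinion"),
    ("CPG-001", "expert clinical experience and opinion"),
    ("CPG-001", "indirect evidence"),
    ("CPG-002", "qualitative materials"),
]
cpg_ids = ["CPG-001", "CPG-002"]
categories = [
    "expert clinical experience and opinion",
    "qualitative materials",
    "indirect evidence",
]
matrix = reference_count_matrix(references, cpg_ids, categories)
```

Each row of `matrix` then serves as one observation, and each column as one observed variable, in the subsequent factor analysis.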
Quantitative Data Analysis
Given the structure of the initial GPR-MM, a second-order confirmatory factor analysis model was used to assess its construct validity. A ratio of sample size to number of parameters of 5:1 or 10:1 was considered appropriate (Bentler & Chou, 1987). Sampling adequacy was assessed using Bartlett’s test of sphericity and the Kaiser–Meyer–Olkin measure. Model fit was considered acceptable if χ²/df was less than 3 (Kline, 2005), the standardized root mean square residual was below 0.10 (Yan et al., 2023), and either the goodness-of-fit index exceeded 0.90 or the adjusted goodness-of-fit index exceeded 0.80 (Hair, 2006; Marsh et al., 1988). Values above 0.60 for composite reliability and above 0.50 for average variance extracted were considered indicative of acceptable reliability and convergent validity (Nunnally & Bernstein, 1994; Zinbarg et al., 2005). The analysis was conducted using SPSS 17.0 and IBM® SPSS® Amos™ 21.0.
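The acceptance criteria above can be written down as simple checks. The sketch below is illustrative only: the analysis itself was run in SPSS and Amos, and the formulas for composite reliability and average variance extracted are the commonly used ones based on standardized loadings with uncorrelated errors, which is an assumption because the text does not state which formulas were applied. All function names are hypothetical.

```python
def minimum_sample_size(n_parameters, ratio=5):
    """Smallest sample size for a given sample-to-parameter ratio."""
    return n_parameters * ratio


def fit_is_acceptable(chi2, df, srmr, gfi, agfi):
    """Apply the fit thresholds stated in the text."""
    return chi2 / df < 3 and srmr < 0.10 and (gfi > 0.90 or agfi > 0.80)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings (assumed formula)."""
    total = sum(loadings)
    error = sum(1 - value ** 2 for value in loadings)  # residual variances
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """Mean squared standardized loading (assumed formula)."""
    return sum(value ** 2 for value in loadings) / len(loadings)


def reliability_is_acceptable(loadings):
    """Thresholds from the text: CR above 0.60 and AVE above 0.50."""
    return (composite_reliability(loadings) > 0.60
            and average_variance_extracted(loadings) > 0.50)
```

For example, three indicators that each load at 0.7 give an average variance extracted of 0.49, just under the 0.50 threshold, even though their composite reliability of about 0.74 passes; this shows how a construct can meet one criterion and miss the other.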
Qualitative and Quantitative Integration
The qualitative and quantitative integration (step 6 in Figure 1) aimed to adjust and optimize the initial GPR-MM. Based on the shortcomings of the quantitative findings, we revisited the qualitative results to review the initial model’s definitions, contextual factors and interrelationships of the (sub) themes and categories, adjusted and optimized the initial model accordingly, and validated the adjusted model using the confirmatory factor analysis to ultimately develop the final model.
Results
Included Sample
A total of ten GPR methods from nineteen articles (Agoritsas et al., 2017; Alexander et al., 2016; Alhazzani et al., 2019; Dewidar et al., 2023; Flemming et al., 2019; Guyatt et al., 2015; Guyatt et al., 2016; IOM, 2011; Klugar et al., 2022; Knaapen, 2013; Lewin et al., 2015; Lewin & Glenton, 2018; Loblaw et al., 2012; Oxman et al., 2006; Tugwell & Knottnerus, 2015; Vermeulen et al., 2019; Wang et al., 2020; Weiss et al., 2018; Wiercioch et al., 2020) and twelve manuals (AAOS, 2022; American Society of Clinical Oncology [ASCO], 2022; British Thoracic Society [BTS], 2022; Diekemper et al., 2018; European Society of Human Reproduction and Embryology [ESHRE], 2019a, 2019b; Lewis et al., 2014; Murad et al., 2011; NICE, 2020; SIGN, 2019; World Health Organization [WHO], 2014a, 2014b) were included. The PRISMA flow diagram (Page et al., 2021) is presented in Additional File 4. More detailed information about the articles and manuals is summarized in Additional File 5. The definitions or descriptions of GPR methods are presented in Additional File 6. A total of 210 CPGs were identified. The PRISMA flow diagram is presented in Additional File 7. Characteristics of the included CPGs are summarized in Additional File 8.
Qualitative Findings
The a priori framework was generated based on the GPR methods. It contained 3 themes, 10 subthemes, and 33 categories. The a priori framework is presented in Additional File 9. The 106 CPGs of group 1 were coded using the a priori framework. Two new categories (clinical question and remarks) were added. The a priori framework and two new categories constituted the initial model (Figure 3). It included 3 themes, 10 subthemes, and 35 categories. The relationships among the (sub) themes and categories in the initial GPR-MM are shown in Figure 3.
Figure 3. The initial good practice recommendation methodological model. Note. CPGs: Clinical practice guidelines; GPR: Good practice recommendation; GPP: Good practice point; GPS: Good practice statement.
The Structure and Definitions of (Sub) Themes
Connotation (A.1)
Five subthemes were identified as essential attributes of the theme connotation: main types (A.1.1), purpose and role (A.1.2), formulation condition (A.1.3), scope of application (A.1.4), and supporting information (A.1.5).
Main Types (A.1.1)
The subtheme of main types describes the categories of methods synthesized in the model, including good practice point, good practice statement, and consensus-based recommendation. Good practice point is intended to assist guideline users by providing recommendations that may not be evidence-based but are considered essential to good clinical practice (BTS, 2022; ESHRE, 2019b; IOM, 2011; SIGN, 2019). Good practice statement represents a recommendation that guideline panels consider important, but not suitable for formal evidence grading. It applies when panels have high confidence that indirect evidence undoubtedly supports the net benefit and when collecting evidence would be an onerous and unproductive exercise and a poor use of the panels’ limited resources (Agoritsas et al., 2017; Alexander et al., 2016; Alhazzani et al., 2019; ASCO, 2022; Dewidar et al., 2023; Guyatt et al., 2015; Guyatt et al., 2016; Klugar et al., 2022; Tugwell & Knottnerus, 2015; Weiss et al., 2018; WHO, 2014a; Wiercioch et al., 2020). Consensus-based recommendation refers to a recommendation formed through consensus methods. Other methods for forming recommendations based on consensus when evidence is weak or nonexistent are collectively referred to as consensus-based recommendation.
Purpose and Role (A.1.2)
The subtheme of purpose and role describes the importance and significance of GPR. The formation of GPRs complements evidence-based recommendations by helping to avoid inappropriate strong recommendations and by providing guidance for future researchers (ESHRE, 2019b; Vermeulen et al., 2019; Weiss et al., 2018). Developing GPR for specific clinical questions can reduce uncertainty in practice and improve the quality of care (Agoritsas et al., 2017; BTS, 2022; ESHRE, 2019a; Murad et al., 2011; SIGN, 2019; Vermeulen et al., 2019). Developing GPR for specific non-clinical questions could facilitate the adoption of evidence-based recommendations (ESHRE, 2019b; SIGN, 2019; Wang et al., 2020).
Formulation Condition (A.1.3)
The subtheme of formulation condition describes the conditions that need to be met to develop GPR, including evidence condition and question condition. Evidence condition is subdivided into two subcategories: if no direct evidence is available, GPR can be formed; or, if low-quality evidence contradicts the guideline panels’ perception of clinical practice, GPR can be formed (AAOS, 2022; Agoritsas et al., 2017; Alexander et al., 2016; Alhazzani et al., 2019; ASCO, 2022; BTS, 2022; Dewidar et al., 2023; Diekemper et al., 2018; ESHRE, 2019a, 2019b; Guyatt et al., 2015; Guyatt et al., 2016; IOM, 2011; Klugar et al., 2022; Knaapen, 2013; Lewis et al., 2014; Loblaw et al., 2012; NICE, 2020; SIGN, 2019; Tugwell & Knottnerus, 2015; Vermeulen et al., 2019; Weiss et al., 2018; WHO, 2014a). Question condition is that the question for which GPR is to be formed should be clear, important, and come from areas of significant uncertainty (AAOS, 2022; BTS, 2022; Dewidar et al., 2023; ESHRE, 2019a; Guyatt et al., 2015; Guyatt et al., 2016; IOM, 2011; Knaapen, 2013; Lewis et al., 2014; Loblaw et al., 2012; NICE, 2020; Vermeulen et al., 2019; Weiss et al., 2018). In addition, the question should be practically oriented and need to be addressed (Alhazzani et al., 2019; BTS, 2022; Dewidar et al., 2023; ESHRE, 2019a; Guyatt et al., 2015; Guyatt et al., 2016; Knaapen, 2013; Lewis et al., 2014; Loblaw et al., 2012; Vermeulen et al., 2019; Weiss et al., 2018; Wiercioch et al., 2020).
Scope of Application (A.1.4)
The subtheme of scope of application describes the main areas where GPR can be developed, including clinical practice areas and non-clinical practice areas. The category of clinical practice areas mainly involves diagnosis methods and treatment therapeutics (Dewidar et al., 2023; ESHRE, 2019a). The category of non-clinical practice areas mainly involves the ethical, social, legal (Guyatt et al., 2015; Weiss et al., 2018) and implementation aspects (Vermeulen et al., 2019; Weiss et al., 2018).
Supporting Information (A.1.5)
The subtheme of supporting information describes what information can be used to support GPR formation. This includes expert clinical experience and opinion, qualitative materials, and indirect evidence. If there is weak or no evidence to answer the question, experts provide their personal opinion or view, which can be used to support the specific GPR (AAOS, 2022; ASCO, 2022; BTS, 2022; ESHRE, 2019a; Knaapen, 2013; Murad et al., 2011; NICE, 2020; Oxman et al., 2006; Vermeulen et al., 2019; Weiss et al., 2018). The category of qualitative materials mainly includes qualitative research literature, legal precedence, government-related documents, existing practice standards, and ethical principles (ESHRE, 2019a; Flemming et al., 2019; IOM, 2011; Knaapen, 2013; Wang et al., 2020; Weiss et al., 2018; WHO, 2014b). Indirect evidence refers to the evidence that does not directly prove the effect of what is presented in the recommendation, but can be linked to other evidence to jointly prove the GPR’s validity (Dewidar et al., 2023; Guyatt et al., 2015; Guyatt et al., 2016; Murad et al., 2011; WHO, 2014a).
Procedure and Methods (A.2)
Three subthemes were identified in the theme procedure and methods: development procedure (A.2.1), development methods (A.2.2), and quality assessment methods (A.2.3).
Development Procedure (A.2.1)
The subtheme of development procedure describes the process of GPR development in a sequential and chronological manner, including defining the topic and scope, constructing groups, formulating the questions, determining to develop GPR, collecting information, integrating the collected information, preparing the draft, reaching consensus, completing the final draft, consulting stakeholders, getting approval, publishing and disseminating, and assessing the need for update.
Defining the topic and scope involves determining the main contents to be covered in CPGs (ESHRE, 2019a; Vermeulen et al., 2019). Based on development requirements of the CPG, guideline groups should be constructed, including steering committee, consensus group, working group, and external review group. Meanwhile, conflicts of interest of the group members should be considered. Formulating the questions of interest involves identifying specific elements, such as population, intervention, comparator, and outcome, and incorporating them into the prioritization process (Dewidar et al., 2023; ESHRE, 2019a). Determining to develop GPR requires careful consideration of both the formulation condition and the scope of application (Dewidar et al., 2023; Lewis et al., 2014; Loblaw et al., 2012; Vermeulen et al., 2019). Collecting information primarily involves gathering supporting information for forming GPRs (Diekemper et al., 2018; ESHRE, 2019a; Lewis et al., 2014; Loblaw et al., 2012; Murad et al., 2011). Integrating the collected information involves using appropriate methods to integrate it and provide reference material for subsequent consensus process (Dewidar et al., 2023; ESHRE, 2019a). Preparing the draft involves forming the initial draft of GPRs using the integrated information and following the reporting guideline (ESHRE, 2019a; Vermeulen et al., 2019). Reaching consensus involves inviting experts to reach consensus based on the prepared draft (ESHRE, 2019a; Lewis et al., 2014; Loblaw et al., 2012; Murad et al., 2011; Vermeulen et al., 2019). Completing the final draft involves refining the initial draft based on consensus results and following reporting guideline to produce the final draft (ESHRE, 2019a; Loblaw et al., 2012; Vermeulen et al., 2019). Consulting stakeholders refers to inviting stakeholders to review and refine the final draft (ESHRE, 2019a; Lewis et al., 2014; Vermeulen et al., 2019). 
Getting approval refers to submitting the finalized draft to the relevant organization or institution for review and approval (ESHRE, 2019a; Vermeulen et al., 2019). Publishing and disseminating refers to releasing the approved documents through available channels (ESHRE, 2019a; Vermeulen et al., 2019). Assessing the need for updates involves conducting regular reviews and making evidence-based recommendations as new evidence emerges (ESHRE, 2019a).
Development Methods (A.2.2)
The subtheme of development methods describes the technical methods involved in GPR development, including methods for collecting information, methods for integrating the collected information, and methods for going from integrated information to GPR. The category methods for collecting information refers to the techniques employed to gather supporting information, such as surveys and systematic information retrieval (ESHRE, 2019a; Lewis et al., 2014; Loblaw et al., 2012; Murad et al., 2011; Wang et al., 2020; WHO, 2014b). The category methods for integrating the collected information refers to approaches such as qualitative research, qualitative evidence synthesis, and linked evidence to effectively combine the gathered information (Dewidar et al., 2023; ESHRE, 2019a; Flemming et al., 2019; Guyatt et al., 2015; Guyatt et al., 2016; NICE, 2020; WHO, 2014a, 2014b). The category methods for going from integrated information to GPR includes consensus approaches, criteria for achieving consensus, and the consideration of multiple factors during the consensus process, such as significant and clear net benefits, values and preferences, acceptability, cost, equity, and feasibility (AAOS, 2022; Agoritsas et al., 2017; Alexander et al., 2016; Alhazzani et al., 2019; ASCO, 2022; Dewidar et al., 2023; Diekemper et al., 2018; ESHRE, 2019a; Guyatt et al., 2015; Guyatt et al., 2016; IOM, 2011; Klugar et al., 2022; Knaapen, 2013; Lewis et al., 2014; Loblaw et al., 2012; NICE, 2020; Tugwell & Knottnerus, 2015; Weiss et al., 2018; WHO, 2014a).
Quality Assessment Methods (A.2.3)
The subtheme of quality assessment methods describes the methods for evaluating the quality of supporting information, including quality assessment of qualitative research and quality assessment of qualitative evidence synthesis. Quality assessment of qualitative research refers to the use of appropriate methods to evaluate its methodological rigor and reliability (NICE, 2020; Wang et al., 2020; WHO, 2014b). Quality assessment of qualitative evidence synthesis involves using appropriate methods to evaluate the methodological rigor and reliability of the synthesized qualitative evidence (Flemming et al., 2019; Lewin et al., 2015; Lewin & Glenton, 2018; NICE, 2020; Wang et al., 2020; WHO, 2014b).
Reporting Guideline (A.3)
Two subthemes were identified as the essential attributes of theme reporting guideline: reporting content (A.3.1) and reporting format (A.3.2).
Reporting Content (A.3.1)
The subtheme of reporting content describes the required items to be reported, including the specific GPR, identifiers to distinguish from evidence-based recommendations, the rationale for the GPR, the clinical question, and remarks. The specific GPR is used to provide specific guidance for clinical practice (Dewidar et al., 2023). Identifiers are used to label GPR and distinguish them from evidence-based recommendations (Dewidar et al., 2023; ESHRE, 2019a; Guyatt et al., 2016). Rationale for the GPR includes the integrated supporting information and the results of the consensus (Alhazzani et al., 2019; BTS, 2022; Dewidar et al., 2023; Guyatt et al., 2016; Klugar et al., 2022; Weiss et al., 2018). Clinical question refers to the description of the specific clinical query or issue addressed by the corresponding GPR. The category of remarks refers to considerations relevant to the application of a specific GPR in clinical practice.
Reporting Format (A.3.2)
The subtheme of reporting format describes the requirements for presentation, including two categories: clear and short, and facilitate identification. The category clear and short means that the specific GPR is concise and easy to understand (Alhazzani et al., 2019; ASCO, 2022; BTS, 2022; Dewidar et al., 2023; ESHRE, 2019a; Guyatt et al., 2015; Guyatt et al., 2016; SIGN, 2019; Weiss et al., 2018). Facilitate identification refers to the use of identifiers and specific presentation formats that help to quickly locate the GPR within a CPG (ASCO, 2022; Dewidar et al., 2023; Diekemper et al., 2018; ESHRE, 2019a; Guyatt et al., 2015; Guyatt et al., 2016; Klugar et al., 2022; Murad et al., 2011; WHO, 2014a).
The Relationships Between (Sub) Themes and Categories
GPR is part of CPGs. The steps for its development are generally consistent with those for developing CPGs (NICE, 2020; SIGN, 2019; WHO, 2014b). Before determining to develop GPR, the steps outlined in the development procedure are consistent with those used in the development of CPGs. The need to form GPR is confirmed during the determining to develop GPR step. Subsequently, the development of GPR proceeds through the following steps: collecting information, integrating the collected information, preparing the draft, reaching consensus, and completing the final draft. The remaining steps are in line with the process of developing CPGs (Figure 3).
In relation 1, formulation condition and scope of application are the main factors to consider in the determining to develop GPR step (Figure 3). Relation 2 represents the following: in the collecting information step, the methods for collecting information could be used to collect the information described in the supporting information subtheme. In the integrating the collected information step, the methods for integrating the collected information could be used to integrate the collected supporting information, while the quality assessment methods could be used to assess the quality of the collected supporting information (Figure 3). In relation 3, the category methods for going from integrated information to GPR provides methodological support for the reaching consensus step (Figure 3). Relation 4 states that the GPR could be reported according to the reporting guideline during the steps preparing the draft and completing the final draft (Figure 3).
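One way to make the step order and the four relations explicit is to encode them as plain data. The sketch below is only an illustrative representation of what the text describes; the variable and function names are hypothetical and are not part of the GPR-MM itself.

```python
# The thirteen steps of the development procedure, in order.
DEVELOPMENT_STEPS = [
    "defining the topic and scope",
    "constructing groups",
    "formulating the questions",
    "determining to develop GPR",
    "collecting information",
    "integrating the collected information",
    "preparing the draft",
    "reaching consensus",
    "completing the final draft",
    "consulting stakeholders",
    "getting approval",
    "publishing and disseminating",
    "assessing the need for update",
]

# Model elements that support each step, keyed by step (relations 1 to 4).
SUPPORTING_ELEMENTS = {
    "determining to develop GPR": [          # relation 1
        "formulation condition",
        "scope of application",
    ],
    "collecting information": [              # relation 2
        "methods for collecting information",
        "supporting information",
    ],
    "integrating the collected information": [  # relation 2
        "methods for integrating the collected information",
        "quality assessment methods",
    ],
    "reaching consensus": [                  # relation 3
        "methods for going from integrated information to GPR",
    ],
    "preparing the draft": ["reporting guideline"],         # relation 4
    "completing the final draft": ["reporting guideline"],  # relation 4
}


def gpr_specific_steps():
    """Steps that follow the decision to develop GPR, up to the final draft."""
    start = DEVELOPMENT_STEPS.index("collecting information")
    end = DEVELOPMENT_STEPS.index("completing the final draft")
    return DEVELOPMENT_STEPS[start:end + 1]


def supporting_elements(step):
    """Model elements linked to a step; empty for steps shared with CPGs."""
    if step not in DEVELOPMENT_STEPS:
        raise ValueError(f"unknown step: {step}")
    return SUPPORTING_ELEMENTS.get(step, [])
```

Steps with no entry in `SUPPORTING_ELEMENTS` are those that the text describes as following the general CPG development process.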
Test Results
Compared with the a priori framework, two categories were added. After coding 106 CPGs against the a priori framework, no new (sub) themes emerged (Additional File 10). Coding of the third and fourteenth CPGs each generated a new category: remarks and clinical question, respectively. Following the coding of the fortieth CPG, the category constructing groups was further enriched with the addition of an external review group. No new categories emerged during the subsequent coding, and saturation was reached. In the sensitivity analysis, excluding six CPGs (presented in bold in Additional File 8) did not affect the presence of any of the (sub) themes and categories in the initial GPR-MM, nor their complexity and relationships.
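The saturation check described above can be illustrated with a minimal sketch. It assumes that each coded CPG is represented as a set of category labels, and it reports the position of the last CPG that contributed a category not already present in the framework or in earlier CPGs. The function name `last_new_category_index` and the toy data are hypothetical and are not taken from the study.

```python
def last_new_category_index(framework, coded_cpgs):
    """Return the 1-based position of the last CPG that introduced a new
    category, or 0 if no CPG added anything to the framework."""
    seen = set(framework)
    last_new = 0
    for position, categories in enumerate(coded_cpgs, start=1):
        new = set(categories) - seen
        if new:
            last_new = position
            seen |= new
    return last_new


# Hypothetical toy data: five CPGs coded against a two-category framework.
framework = {"formulation condition", "scope of application"}
coded_cpgs = [
    {"formulation condition"},
    {"scope of application"},
    {"formulation condition", "remarks"},  # a new category appears at CPG 3
    {"remarks"},
    {"scope of application"},
]

last_new = last_new_category_index(framework, coded_cpgs)
remaining = len(coded_cpgs) - last_new
print(f"Last new category at CPG {last_new}; "
      f"{remaining} CPGs coded afterwards without new categories")
```

The longer the run of CPGs coded after the last new category, the stronger the case that saturation has been reached; the length of run that counts as sufficient is a judgment the sketch does not make.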
Quantitative Findings
The 104 CPGs in group 2 were coded against the initial GPR-MM. The numbers of coded references in the CPGs of groups 1 and 2 (210 CPGs in total) were extracted as quantitative data for analysis. The list of CPGs is presented in Additional File 8. One confirmatory factor analysis model was built for each of the three themes in the initial model. Each analysis model included up to 18 categories, requiring a minimum of 90 samples, and the quantitative data met this sample size requirement. The Kaiser–Meyer–Olkin values for the three themes were 0.66, 0.77, and 0.76, respectively. The p-values of Bartlett’s test of sphericity were all less than 0.001. The data therefore met the requirements for confirmatory factor analysis. In the analysis, the fit indices of the second-order models for connotation and reporting guideline were generally acceptable. For procedure and methods, although the fit indices were acceptable, the composite reliability and average variance extracted were below the ideal thresholds, suggesting that not all subthemes consistently reflected their underlying constructs. In addition, several categories within connotation, procedure and methods, and reporting guideline had factor loadings below 0.40. Detailed quantitative results are presented in Additional File 11. These findings indicate that adjustment and refinement of the initial GPR-MM were necessary.
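The sample size reasoning above can be made explicit with a short sketch. The ratio of five cases per observed variable is an assumption inferred from the figures reported here (18 categories, 90 samples); this section does not state the rule itself. The function name `minimum_sample_size` is hypothetical.

```python
def minimum_sample_size(n_variables, cases_per_variable=5):
    """Minimum number of cases under a cases-per-variable rule of thumb.

    The default ratio of 5 is an assumption inferred from the text
    (18 categories -> 90 samples), not a value stated in this section.
    """
    return n_variables * cases_per_variable


max_categories = 18   # largest number of categories in one analysis model
available_cpgs = 210  # CPGs in groups 1 and 2

required = minimum_sample_size(max_categories)
print(f"required: {required}, available: {available_cpgs}, "
      f"sufficient: {available_cpgs >= required}")
```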
Results of the Qualitative and Quantitative Integration
Adjustments to the Initial GPR-MM
Based on the results of the confirmatory factor analysis, we revisited the qualitative stage, identified instances of over-segmentation in the categories, and revised the initial GPR-MM through discussion. We made the following adjustments to the categories: (1) In the procedure and methods theme, the categories collecting information and integrating the collected information were consolidated into collecting and integrating the information. The category completing the final draft was merged into reaching consensus. The category consulting stakeholders was merged into getting approval. The categories publishing and disseminating and assessing the need for update were consolidated into publishing, disseminating and updating. The categories methods for collecting information and methods for integrating the collected information were consolidated into methods for collecting and integrating the information. (2) In the reporting guideline theme, the category clinical question was merged into the specific GPR. (3) The category remarks in the reporting guideline theme was merged into the non-clinical practice areas in the connotation theme. The reasons for these adjustments are set out in Additional File 12. We also refined the names of the (sub) themes and categories in the model through discussion so that they express their meanings more accurately. The adjusted GPR-MM is shown in Figure 4.
Figure 4. The adjusted good practice recommendation methodological model. Note. CPGs: Clinical practice guidelines; GPR: Good practice recommendation; GPP: Good practice point; GPS: Good practice statement.
The Structural Model Fit of the Adjusted GPR-MM
The structural validity of the adjusted GPR-MM was assessed using second-order confirmatory factor analysis. The kurtosis coefficients of several variables within connotation and within procedure and methods exceeded 20, indicating a substantial deviation from normality in the data distribution. Asymptotically distribution-free estimation was therefore used for these two themes. For reporting guideline, maximum likelihood estimation was used.
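The choice between estimators can be sketched as a simple screening step. The example below assumes that the kurtosis coefficient is the moment-based excess kurtosis (which is 0 for a normal distribution) and applies the cut-off of 20 mentioned above; statistical packages may use a slightly different, bias-corrected formula. The function names and the toy data are hypothetical.

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2**2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined for a constant variable")
    return m4 / m2 ** 2 - 3.0


def choose_estimator(columns, kurtosis_cutoff=20.0):
    """Return "ADF" if any variable's kurtosis exceeds the cut-off, else "ML"."""
    for values in columns.values():
        if abs(excess_kurtosis(values)) > kurtosis_cutoff:
            return "ADF"  # asymptotically distribution-free estimation
    return "ML"           # maximum likelihood estimation


# Hypothetical reference counts: one category is almost never coded.
skewed = [0] * 99 + [50]
balanced = [1, 2, 3, 4, 5] * 20

print(choose_estimator({"rare category": skewed, "common category": balanced}))
print(choose_estimator({"common category": balanced}))
```

Frequency data of this kind are often dominated by zeros, which is one plausible reason for the very high kurtosis values reported for two of the themes.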
Regarding connotation, the second-order model showed satisfactory fit indices: the χ2/df was 2.53, the goodness-of-fit index was 0.87, the adjusted goodness-of-fit index was 0.802, and the root mean square error of approximation was 0.085, demonstrating construct validity. Factor loadings for the categories ranged from 0.47 to 0.98, and those for the first-order factors ranged from 0.62 to 0.98, all exceeding the 0.40 threshold. Correlation coefficients among categories, treated as free parameters, were all below 0.6, showing no strong correlations and aligning with the model’s theoretical assumptions (Figure 5). Composite reliability was 0.94 and average variance extracted was 0.76, indicating that all subthemes consistently measured their respective constructs.
Figure 5. The confirmatory factor analysis model of connotation (standardized parameter estimates). Note. GPP: Good practice point; GPS: Good practice statement.
Regarding procedure and methods, the fit indices of the second-order model were acceptable: the χ2/df was 2.97, the goodness-of-fit index was 0.87, the adjusted goodness-of-fit index was 0.82, and the root mean square error of approximation was 0.097, demonstrating that the construct was valid. The factor loadings of the categories ranged from 0.59 to 0.99, and those of the first-order factors ranged from 0.43 to 0.76, all exceeding the 0.40 threshold. Additionally, the correlation coefficients among categories, estimated as free parameters, were all below 0.8, indicating no strong correlations. These findings were consistent with the theoretical assumptions of the model (Figure 6). The composite reliability reached 0.68, whereas the average variance extracted was 0.42, falling below the recommended threshold of 0.50.
Figure 6. The confirmatory factor analysis model of procedure and methods (standardized parameter estimates). Note. GPR: Good practice recommendation.
Regarding reporting guideline, the fit indices of the second-order model were acceptable: the χ2/df was 2.74, the goodness-of-fit index was 0.98, the adjusted goodness-of-fit index was 0.931, and the root mean square error of approximation was 0.091. These results indicate that the construct was valid. Except for rationale for the GPR, the factor loadings of the categories ranged from 0.67 to 0.96, and those of the first-order factors were 0.98 and 0.99, all exceeding 0.40. The correlation coefficients among categories, specified as free parameters, were all below 0.4, indicating no strong correlations and supporting the theoretical assumptions of the model (Figure 7). The composite reliability was 0.99 and the average variance extracted was 0.78, indicating that all subthemes consistently measured their respective constructs.
Figure 7. The confirmatory factor analysis model of reporting guideline (standardized parameter estimates). Note. GPR: Good practice recommendation.
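The fit indices reported for the three themes can be compared against explicit cut-offs, as in the sketch below. The cut-off values in `ASSUMED_CUTOFFS` are assumptions chosen for illustration, since this section describes the indices as acceptable without stating the thresholds applied; stricter conventions exist and would lead to different conclusions. The function name `failed_indices` is hypothetical.

```python
# Assumed, lenient cut-offs for illustration only; they are NOT stated
# in this section of the article.
ASSUMED_CUTOFFS = {
    "chi2_df": ("max", 3.0),
    "gfi": ("min", 0.85),
    "agfi": ("min", 0.80),
    "rmsea": ("max", 0.10),
}

# Fit indices of the adjusted GPR-MM as reported in the text.
reported_fit = {
    "connotation": {"chi2_df": 2.53, "gfi": 0.87, "agfi": 0.802, "rmsea": 0.085},
    "procedure and methods": {"chi2_df": 2.97, "gfi": 0.87, "agfi": 0.82, "rmsea": 0.097},
    "reporting guideline": {"chi2_df": 2.74, "gfi": 0.98, "agfi": 0.931, "rmsea": 0.091},
}


def failed_indices(fit, cutoffs):
    """Return the names of the indices that miss their cut-off."""
    failed = []
    for name, (direction, limit) in cutoffs.items():
        value = fit[name]
        if direction == "max" and value > limit:
            failed.append(name)
        elif direction == "min" and value < limit:
            failed.append(name)
    return failed


for theme, fit in reported_fit.items():
    print(theme, failed_indices(fit, ASSUMED_CUTOFFS))
```

Keeping the cut-offs in one table makes it easy to rerun the check under a different convention and to see which indices drive the conclusion.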
Discussion
Main Findings
The GPR-MM was developed and validated through an exploratory sequential mixed methods design. It comprises 3 themes, 10 subthemes, and 28 categories. It explains what GPR is, how it is developed, and how it is reported. The relationships among various (sub) themes and categories provide guidance on applying the model specifically to the development of GPR.
Discussion of Qualitative Findings
Based on the best fit framework synthesis method, existing GPR methods were used to develop the a priori framework. Relevant CPGs were then mapped onto this framework to construct the initial GPR-MM. Best fit framework synthesis is a type of qualitative evidence synthesis and falls within the broader category of qualitative research. Saturation has become an important criterion for assessing the quality of qualitative research and justifying sample size (Guest et al., 2016). In the analysis, the coding of CPGs against the a priori framework reached saturation, indicating that the qualitative analysis was rigorous and methodologically robust.
The results of the sensitivity analysis showed that excluding CPGs developed by individuals did not affect the initial GPR-MM. Individually published CPGs may be less systematic and comprehensive than those published by organizations. However, their exclusion did not change the synthesis results, suggesting the robustness of the initial model (Carroll et al., 2013).
Discussion of Quantitative Findings
The validity results of the initial GPR-MM indicated that the model fit was acceptable across the three themes (connotation, procedure and methods, and reporting guideline). However, the structural validity of procedure and methods was suboptimal, and some category-level factor loadings across themes did not reach the expected thresholds. The quantitative analysis therefore did not fully support the structure of the initial model, indicating that further adjustment and optimization were needed. Even so, the quantitative findings provided valuable insights for adjusting and improving the model.
Discussion of the Qualitative and Quantitative Integration
Using both qualitative and quantitative data in a study without explicitly mixing the data derived from each is not enough to be a true mixed methods design (Creswell & Plano Clark, 2017). The first step in the qualitative and quantitative integration was to fully understand the shortcomings of the quantitative findings. Based on this, we revisited the qualitative findings and identified instances where the categories in the initial model had been over-segmented. Therefore, we integrated and adjusted the interrelated categories to make the model structure more concise and easier to understand (Figure 4). This reflected the integrative nature of mixed methods, where qualitative and quantitative research interacted and complemented each other.
The validation results of the adjusted GPR-MM showed acceptable model fit for connotation, procedure and methods, and reporting guideline. Except for the category rationale for the GPR within reporting guideline, all factor loadings of the subthemes and categories in the adjusted GPR-MM were above 0.4, consistent with the theoretical assumptions of the model (Ertz et al., 2016). The composite reliability values for connotation, procedure and methods, and reporting guideline were all greater than 0.6, indicating that all subthemes consistently measured their respective constructs. The average variance extracted values were greater than 0.5 for connotation and reporting guideline, whereas the value for procedure and methods was 0.42. Although this value was below 0.5, previous studies have suggested that the average variance extracted may be a conservative measure (Fornell & Larcker, 1981; Lam, 2012). Some studies have suggested that an average variance extracted value greater than 0.36 may indicate an acceptable level of convergent validity (Fornell & Larcker, 1981). Based on this cut-off value, the convergent validity of all three themes was deemed acceptable. Accordingly, the final GPR-MM was validated. Specific descriptions of the (sub) themes and categories, along with an elaboration of their interrelationships within the final GPR-MM, are provided in Additional File 13.
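For readers less familiar with these two statistics, the sketch below computes composite reliability and average variance extracted from standardized factor loadings, using the commonly cited formulas associated with Fornell and Larcker (1981). It assumes uncorrelated measurement errors, so that each error variance equals one minus the squared loading. The loadings are hypothetical and are not the study's estimates; the software used in the study may compute these values differently.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances),
    with error variance 1 - loading^2 for standardized loadings."""
    total = sum(loadings)
    error = sum(1.0 - loading ** 2 for loading in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE = mean of the squared standardized loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


# Hypothetical standardized loadings of four subthemes on one theme.
example_loadings = [0.76, 0.70, 0.55, 0.43]

cr = composite_reliability(example_loadings)
ave = average_variance_extracted(example_loadings)
print(f"CR = {cr:.2f}, AVE = {ave:.2f}")
print("CR above 0.6:", cr > 0.6)
print("AVE above 0.5:", ave > 0.5, "| AVE above 0.36:", ave > 0.36)
```

Because the average variance extracted averages squared loadings, a few moderate loadings pull it down quickly even when composite reliability remains above 0.6, which is the pattern reported for procedure and methods.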
Definition of GPR
Based on the connotation theme, the definition of GPR was developed. In the development of CPGs, when important and clearly formulated questions arise from areas that are characterized by substantial uncertainty, are practice-oriented, and urgently need to be addressed, and when no direct evidence is available or existing low-quality evidence contradicts the panel’s clinical understanding, a recommendation can be developed based on expert clinical opinion and experience, relevant qualitative data, or related indirect evidence. Such a recommendation is called a GPR.
Comparison of the GPR-MM with Other Relevant Methods
Unlike the existing GPR methods, GPR-MM clarifies the definition of GPR, the procedure and methods involved in the formation of GPR and how GPR can be standardized and presented in CPGs. It also establishes the relationships among different elements within the model, providing a systematic path to formulate GPR in a scientific and rational way. In addition, the GPR-MM provides a multi-criteria decision-making mechanism for forming GPR by consensus based on the integrated information, that is, to reach consensus on GPR by comprehensively considering the balance between pros and cons, values and preferences, acceptability, cost, equity, and feasibility to ensure the rationality of GPR.
The procedure and methods in the GPR-MM provide a standardized process for the formation of GPR. It includes constructing groups, reaching consensus, and getting approval, along with their associated methodologies, providing channels and methods for multiple organizations to participate in the formation of GPR in CPGs.
The GPR-MM is highly generalizable. It can be flexibly adapted regardless of the size of the organization. CPG developers across organizations can determine the number of clinical questions to address and adjust the number of participants in different groups based on their available resources. This flexibility supports efficient implementation of the GPR-MM in diverse organizational settings. Moreover, the GPR-MM does not conflict with existing GPR methods. Instead, it is based on the synthesis and optimization of these methods, thereby providing more systematic guidance for the establishment of GPR.
The GPR-MM is a generalized methodological model. It is not designed for a specific disease category. In its formation, the included GPR methods originated from different countries, such as Canada and the United Kingdom, and the included CPGs covered a variety of medical fields, including immune, cardiovascular, neurological, endocrine, infectious, and gastrointestinal diseases. In the future, interested professionals can collaborate with us to conduct disease-specific studies, leading to the development of a series of tailored models. To facilitate the application of the GPR-MM, interested parties or organizations are encouraged to contact us via email for additional methodological support.
Contribution to the Field of Mixed Methods
Although mixed methods have been applied across various fields, to the best of our knowledge, this study is the first to use them to construct a methodological model in the CPGs field. The use of an exploratory sequential mixed methods design in this study not only enhances the validity and applicability of the GPR-MM but also provides practical guidance for future research and practice in the development of other methodological models in the CPGs field. In addition, this study demonstrates how qualitative and quantitative data complement each other in GPR-MM construction and validation, providing an example for future data analysis in the integration stage of an exploratory sequential mixed methods design.
Strength and Limitations
This study adopted an exploratory sequential mixed methods design. Compared with a single quantitative or qualitative method, mixed methods allow both exploration and verification, thus enhancing the reliability of the model (Brown et al., 2020; Molina-Azorín, 2010).
In the quantitative data collection stage, the number of references per (sub)theme or category was used as quantitative data. This frequency-based quantification method is commonly used in mixed methods research (Sandelowski et al., 2009). While this approach highlights the importance and emphasis of each (sub)theme or category, it overlooks data richness, limiting interpretive depth. Future research will draw on the richness assessment tool proposed by Ames et al. (2024) and, in the qualitative data transformation process, combine frequency with richness assessment to gain a more comprehensive understanding of the importance and quality of information within each (sub)theme or category in CPGs, thereby further enhancing the depth and interpretive power of the quantitative analysis.
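The frequency-based transformation described above can be sketched as follows. The example assumes that qualitative coding produces one (CPG, category) pair per coded passage and converts these pairs into one row of counts per CPG, which is the kind of table a factor analysis would take as input. The identifiers, category labels, and the function name `reference_counts` are hypothetical.

```python
from collections import Counter


def reference_counts(coded_references, cpg_ids, categories):
    """Count coded references per CPG and category.

    coded_references: iterable of (cpg_id, category) pairs, one per coded passage.
    Returns a dict mapping each CPG id to a list of counts ordered like
    `categories`; CPGs or categories without references get 0.
    """
    counts = Counter(coded_references)
    return {
        cpg: [counts[(cpg, category)] for category in categories]
        for cpg in cpg_ids
    }


# Hypothetical coding output for three CPGs and three categories.
categories = ["specific GPR", "rationale for the GPR", "clear and short"]
cpg_ids = ["CPG-001", "CPG-002", "CPG-003"]
coded = [
    ("CPG-001", "specific GPR"),
    ("CPG-001", "specific GPR"),
    ("CPG-001", "clear and short"),
    ("CPG-002", "rationale for the GPR"),
]

matrix = reference_counts(coded, cpg_ids, categories)
for cpg, row in matrix.items():
    print(cpg, row)
```

A count records how often a category was coded but not how detailed each coded passage is, which is the limitation regarding data richness noted above.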
The specific application of GPR-MM is mainly reflected in CPGs, which also serve as a key source for enriching the model. For the collection of CPGs, we systematically searched databases and the guideline library of World Health Organization. Some relevant CPGs from other organizations may have been missed.
Conclusion
Employing an exploratory sequential mixed methods design, this study constructed and validated the GPR-MM using best fit framework synthesis and confirmatory factor analysis. The model standardizes the GPR methodology system in three respects: the nature of GPR, the development process and methods, and the reporting standard. It provides methodological support for CPG developers to formulate GPR scientifically when there is insufficient evidence. It also provides CPG methodology experts with a unified understanding of GPR and informs future research. However, we do not advocate the overuse of GPR. GPR should be developed only if it meets the requirements set out in the GPR-MM. With ongoing advances in medical research, once high-quality evidence becomes available for previously unsupported questions, recommendations should be developed based on an evidence-based approach.
Supplemental Material
Supplemental Material for Mixed Methods Research in the Field of Clinical Practice Guidelines: The Case of Developing a Methodological Model for Good Practice Recommendation by Yangyang Wang, Luan Zhang, Amin Sharifan, Myeong Soo Lee, Takeo Nakayama, Yaolong Chen, and Hui Li in Journal of Mixed Methods Research.
Footnotes
Acknowledgments
We would like to thank the members of Guidelines International Network (GIN) Traditional Medicine working group and GIN Asia for their ongoing contribution to the study.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Specific Research Fund for Chronic Disease Management of Guangdong Provincial Hospital of Chinese Medicine (YN2024MB016) and the Research Project of State Key Laboratory of Traditional Chinese Medicine Syndrome (QZ2023ZZ04). The funder had no role in the study design, collection, analysis or interpretation of the reports. The funder did not write the paper and had no role in the decision to submit the paper for publication.
Supplemental Material
Supplemental material for this article is available online.
References
