Abstract
A specific and rational index system is key to scientific research evaluation. Based on the characteristics and status of research-oriented hospitals in China, this study aimed to construct a comprehensive and methodical system for scientific research evaluation. Using bibliometric research, we sorted and refined indices from both domestic and international scientific research evaluation systems, established two dimensions of indices (input and output), and constructed the theoretical framework of evaluation after consulting experts. The Delphi method was adopted to determine the evaluation indices at all levels, and the analytic hierarchy process (AHP) was used to calculate their weights. Twenty experts from different medical fields took part in the two-round study. Altogether, 7 primary, 14 secondary, and 37 tertiary indices were included in the evaluation system. A judgment matrix was built to compute the maximum eigenvalue, the consistency index, and the consistency ratio for each expert in the survey, and the index weight coefficients were calculated accordingly. The model exhibited high consistency, and the credibility of the results was verified. The evaluation system that we established for research-oriented hospitals had high specificity, credibility, and rationality. It combines quantitative evaluation indicators that are weighted according to their importance to research-oriented hospitals. The evaluation index system provides a practical means of comparing the potential academic level and impact of research-oriented hospitals. Further verification, adjustment, and optimization of the system and indicators will be performed in follow-up empirical studies.
Question-And-Answer Highlights
(1) What do we already know about this topic?
A specific and rational index system is key to scientific research evaluation.
(2) How does your research contribute to the field?
This study aimed to construct a comprehensive and methodical system for scientific research evaluation.
(3) What are your research’s implications toward theory, practice, or policy?
An evaluation system for research-oriented hospitals.
Introduction
A research-oriented hospital is an institution that integrates medical services, talent training, and scientific research and that has abundant scientific research resources, such as complete specialties, a concentration of talent, sophisticated equipment, and more funds. 1 Compared to other hospitals, research-oriented hospitals command more of these resources, and therefore, efficiency evaluation is important for them.
As the main drivers of medical technology innovation, these hospitals are responsible for researching, disseminating, and applying innovative medical knowledge. In 2020, the COVID-19 pandemic drew widespread public attention to hospitals and the health sector and exposed problems in the global medical system, the medical industry, and medical research. 2 Evaluating hospitals’ research performance reveals where hospitals are inefficient, and it also supports policy makers’ decisions on enhancing research performance and promoting the application and transformation of research achievements within research-oriented hospitals. 3 Thus, evaluation of scientific research is a crucial part of medical research management and hospital management, and it first requires the establishment of a comprehensive evaluation system. 4 In particular, evaluation is valuable for enhancing the performance of scientific research and promoting the application and transformation of research outcomes within research-oriented hospitals.5,6
Researchers have established index systems for scientific research.7-11 Li and Hao 12 developed an evaluation index system for determining the academic impact of military medical scholars. Wu 13 created a quantitative medical technology evaluation system through a questionnaire survey within medical institutions to assess medical technologies. At present, unofficial third-party organizations in China publish the Science and Technology Evaluation Metrics (STEM) ranking. 14 However, in China, scientific research at hospitals has been evaluated through a single index characterized by untargeted and incomplete data, inaccuracy, and a tendency to rely on total output value.13,15,16 Thus, this study aimed to develop an evaluation index system through literature research based on the characteristics and status of research-oriented hospitals in China. 17 Subsequently, we sought to build a quantitative medical research evaluation system through a questionnaire survey using the Delphi method and to set the weight coefficients of the identified indices through the analytic hierarchy process (AHP). 18 Furthermore, to verify the feasibility and effectiveness of the evaluation system, we conducted a case analysis based on the technique for order of preference by similarity to ideal solution (TOPSIS) and data envelopment analysis (DEA) to evaluate the research efficiency of specialties in hospital H.
With the establishment of this evaluation system and its further verification in research practice, we hope that hospitals will be able to assess scientific research more effectively, promote its output, and provide policy support for the management of research-oriented hospitals.
Materials and Methods
Literature Research
We retrieved the literature on the scientific research evaluation of relevant medical colleges and universities available on Web of Science, PubMed, and Scopus and then counted how frequently each evaluation index appeared. 19 We referred to recognized evaluation ranking indices in the medical field and selected the most frequently used ones. Thereafter, to establish an index pool, we combined these with the scientific research status and characteristics of research-oriented hospitals. 20
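The frequency count behind the index pool can be sketched in a few lines of Python. This is only an illustration: the `papers` list and the index names in it are hypothetical placeholders, not data from the reviewed literature, and the function name `index_frequency` is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical extraction result: one list of index names per reviewed paper.
papers = [
    ["number of publications", "research funding", "patents"],
    ["number of publications", "citations", "research funding"],
    ["number of publications", "patents"],
]

def index_frequency(papers):
    """Count in how many papers each evaluation index appears."""
    counts = Counter()
    for indices in papers:
        counts.update(set(indices))  # set(): count an index once per paper
    return counts

freq = index_frequency(papers)
# Candidate index pool, ordered from most to least frequently used.
index_pool = [name for name, _ in freq.most_common()]
```

Counting each index once per paper keeps a single study that repeats an index from inflating its frequency.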
Constructing the Initial Framework
Evaluation should address performance rather than output alone. To form the initial framework, we interviewed a total of 10 experts through conference discussions or telephone surveys. Production theory (ie, obtaining the maximum output from the minimum input) served as the guideline, and the established index pool was incorporated. Using theoretical induction and deduction along with other normative research methods, we drafted an initial evaluation system with 2 dimensions: input and output. 21
Constructing an Evaluation Index System Using the Delphi Method
Selection of questionnaire survey experts
In a questionnaire survey, the accuracy and consistency of the results depend on the number of experts participating in the investigation. According to the principle of the Delphi method, the optimal number of experts for ensuring the credibility and authority of results is 15–50. 22 Thus, we selected 20 experts, who were required to be familiar with hospital or specialty management. The experts were selected mainly based on the following criteria: (1) affiliation with a research-oriented hospital ranked in the top 5 in STEM, (2) position as vice president or head of a department in charge of research management, and (3) position as clinical specialty director holding the rank of vice chairman or higher in the Chinese Medical Association or another national society. 23
Questionnaire survey
We designed an expert consultation questionnaire to collect independent expert opinions in line with the Delphi method. In the questionnaire, experts rated the necessity, importance, and operability of each index on a five-point Likert scale (5 = very much, 4 = more, 3 = generally, 2 = less, 1 = not at all). 24 Four factors influenced the experts’ judgment: theoretical analysis, practical experience, understanding of the provided data, and intuition. The degree of influence of each factor was rated on 3 levels (1 = small, 2 = medium, 3 = large influence). In addition, we recorded the experts’ suggestions for adjusting the structure of the evaluation system by adding, deleting, or merging indices.
Defining the evaluation system and indices
The positive coefficient for expert consultation was expressed by the questionnaire’s recovery rate. We used the formula RR = M/N (where RR is the expert positive coefficient, M is the number of experts participating in the evaluation, and N is the total number of experts), with higher positive coefficients indicating better results. A previous study proposed a 70% positive coefficient as a good result. 25 The degree of expert authority was expressed by the authority coefficient Cr, which depends on each expert’s basis for judgment and familiarity with each item. We calculated Cr as the arithmetic mean of the quantified basis for judgment and the quantified familiarity:
Cr = (Ca + Cs)/2 (where Ca is the basis for judgment and Cs is familiarity)
Matrix of Expert Familiarity Quantification.
That is, the greater the Cr, the higher the degree of authority. If Cr ≥ .70, the expert’s authority coefficient was considered sufficiently high, and the result of the inquiry was therefore regarded as scientific and representative. 26
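A minimal Python sketch of the two coefficients, RR = M/N and Cr = (Ca + Cs)/2, is given here for illustration. The (Ca, Cs) pairs are hypothetical values on an assumed 0 to 1 scale, and the function names are this sketch's own; only the two formulas and the .70 threshold come from the text.

```python
def positive_coefficient(returned, distributed):
    """RR = M / N: share of consulted experts who returned the questionnaire."""
    if distributed <= 0:
        raise ValueError("distributed must be positive")
    return returned / distributed

def authority_coefficient(ca, cs):
    """Cr = (Ca + Cs) / 2 for a single expert."""
    return (ca + cs) / 2

# Hypothetical (Ca, Cs) pairs for three experts; not data from the study.
experts = [(0.9, 0.8), (0.8, 0.6), (1.0, 0.8)]
cr_values = [authority_coefficient(ca, cs) for ca, cs in experts]
overall_cr = sum(cr_values) / len(cr_values)

rr = positive_coefficient(returned=20, distributed=20)
is_authoritative = overall_cr >= 0.70  # threshold stated in the text
```

Averaging the per-expert Cr values into one overall coefficient is an assumption of this sketch.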
The degree of coordination among expert opinions was important. We examined the differences among the experts’ evaluations of each item using 2 indicators: the coefficient of variation (CV) and the coordination coefficient (Kendall’s W, W). Here, W ranged from 0 to 1, with larger values indicating better coordination. We calculated W and applied its correction formula.
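For reference, the standard definition of Kendall's coefficient is $W = 12S / [m^2(n^3 - n)]$, where $m$ is the number of experts, $n$ is the number of indices, and $S$ is the sum of squared deviations of each index's rank sum from the mean rank sum $m(n+1)/2$. The usual correction for tied scores subtracts $m \sum_j T_j$ from the denominator, with $T_j = \sum (t^3 - t)$ taken over the groups of tied scores given by expert $j$. The Python sketch below implements that tie-corrected form, assuming this is the correction the study refers to. The symbols and function names are this sketch's own notation, and the study itself used R.

```python
def average_ranks(scores):
    """Rank scores ascending (1 = lowest); tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks

def tie_term(scores):
    """T = sum(t^3 - t) over the groups of tied scores from one expert."""
    counts = {}
    for s in scores:
        counts[s] = counts.get(s, 0) + 1
    return sum(t**3 - t for t in counts.values())

def kendalls_w(ratings):
    """ratings[j][i] is the score expert j gave index i; returns tie-corrected W."""
    m, n = len(ratings), len(ratings[0])
    rank_rows = [average_ranks(row) for row in ratings]
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((r - mean_sum) ** 2 for r in rank_sums)
    ties = sum(tie_term(row) for row in ratings)
    denom = m**2 * (n**3 - n) - m * ties
    if denom == 0:
        raise ValueError("W is undefined when every expert ties all indices")
    return 12 * s / denom
```

With Likert-type scores, ties are frequent, so the uncorrected form would understate agreement. The associated chi-square statistic is $\chi^2 = m(n-1)W$ with $n-1$ degrees of freedom.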
We compared the modified indices in pairs and subsequently constructed a comparison judgment matrix. Finally, according to the results of the statistical analysis, if the mean necessity score of an index was ≥ 4 and its CV was ≤ .15, expert recognition of the index was deemed high, and the index was retained. A mean of ≤ 3 or a CV of ≥ .15 implied that the index required further revision before being set as a final index. 27
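The retention rule can be sketched as a small Python screening function. Two points are assumptions of this sketch rather than statements from the study: the CV is computed with the sample standard deviation, and every index that fails the retention criteria is simply flagged for revision. The scores in the test cases are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(scores):
    """CV = sample standard deviation / mean (sample SD is an assumption)."""
    return stdev(scores) / mean(scores)

def screen_index(necessity_scores, min_mean=4.0, max_cv=0.15):
    """Return (mean, CV, decision) for one index's necessity scores."""
    m = mean(necessity_scores)
    cv = coefficient_of_variation(necessity_scores)
    decision = "retain" if m >= min_mean and cv <= max_cv else "revise"
    return m, cv, decision
```

For example, `screen_index([5, 5, 4, 5, 4])` yields a mean of 4.6 and a CV of about .12, so the index would be retained under these thresholds.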
Allocation of index weights using analytic hierarchy process (AHP)
We used R 3.5.1 software to complete the AHP and determine the index weights. In constructing the AHP model, we followed the suggestion of Saaty et al, 28 who used a scale of 1–9 to evaluate the relative importance of 2 indices. Thereafter, we calculated the weights at each level and checked for consistency. In the comparison matrix, $\lambda_{\max}$ is the maximum eigenvalue, and $n$ is the order of the comparison matrix. The eigenvector corresponding to $\lambda_{\max}$ of the judgment matrix gives the weight $W'$ of the importance of each factor at this level relative to a factor at the previous level. Finally, we calculated the normalized weight coefficient $W$ using the following formula:

$$W_i = \frac{W'_i}{\sum_{j=1}^{n} W'_j}$$
We calculated the combined weight coefficient $C_i$ of each index by continuous multiplication of the weights along its branch of the hierarchy and, from the combined weights, the comprehensive score index GI.
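One common reading of this step, offered as an assumption rather than as the authors' exact formula, is that the combined weight of a tertiary index is the product of the local weights on its path through the hierarchy, and that GI is the weighted sum of standardized index scores, $GI = \sum_i C_i P_i$, where $P_i$ is a hypothetical name for the score of index $i$. The Python sketch below uses index names from this study but entirely made-up weights.

```python
# Hypothetical hierarchy: name -> (local weight, children or None for a leaf).
# The weights are illustrative only and are NOT the study's results.
hierarchy = {
    "human input": (0.6, {
        "talent": (0.7, {
            "outstanding talents": (0.5, None),
            "graduate students": (0.5, None),
        }),
        "teams": (0.3, {"research teams": (1.0, None)}),
    }),
    "research output": (0.4, {
        "papers": (1.0, {
            "total citations": (0.6, None),
            "H index": (0.4, None),
        }),
    }),
}

def combined_weights(tree, parent_weight=1.0):
    """Multiply local weights down each branch; return {leaf: combined weight}."""
    result = {}
    for name, (weight, children) in tree.items():
        w = parent_weight * weight
        if children is None:
            result[name] = w
        else:
            result.update(combined_weights(children, w))
    return result

def comprehensive_score(weights, scores):
    """GI as a weighted sum of standardized scores (assumed form)."""
    return sum(weights[name] * scores[name] for name in weights)

c = combined_weights(hierarchy)
gi_full_marks = comprehensive_score(c, {name: 1.0 for name in c})
```

Because the local weights at every level sum to 1, the combined weights of the leaves also sum to 1, which gives a quick sanity check on any weight table.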
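After normalizing the weight coefficients, we calculated the consistency index (CI). The calculation formula for CI is expressed as follows:

$$CI = \frac{\lambda_{\max} - n}{n - 1}$$

The calculation formula for the consistency ratio (CR) is expressed as follows, where RI is the random index corresponding to the order $n$ of the matrix:

$$CR = \frac{CI}{RI}$$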
The value of CI reflects the consistency of the matrix; that is, the smaller the value, the higher the consistency. The CR test determines whether the pairwise judgments contain any logical confusion. A CR < .1 indicates no logical confusion and acceptable weights. 29
Corresponding RI Values to the 1–9 Orders of the Matrix.
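The weight calculation and consistency check can be sketched in Python as follows. The study used R, so this is an illustration, not the authors' code. It finds the principal eigenvector by power iteration, normalizes it, and then applies $CI = (\lambda_{\max} - n)/(n - 1)$ and $CR = CI/RI$. The RI values are the commonly cited ones for orders 1 to 9 and may differ from the study's table, and the judgment matrix is hypothetical.

```python
# Commonly cited random index (RI) values for orders 1-9 (an assumption here).
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def principal_eigen(matrix, iterations=100):
    """Return (normalized weights, lambda_max) by power iteration."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]  # normalize so the weights sum to 1
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max

def consistency(matrix):
    """Return (weights, CI, CR) for a pairwise comparison matrix of order <= 9."""
    n = len(matrix)
    weights, lambda_max = principal_eigen(matrix)
    if n < 3:
        # Reciprocal matrices of order 1 or 2 are always consistent.
        return weights, 0.0, 0.0
    ci = (lambda_max - n) / (n - 1)
    cr = ci / RI[n]
    return weights, ci, cr

# Hypothetical 1-9 scale judgments for three indices.
judgments = [
    [1, 3, 5],
    [1 / 3, 1, 2],
    [1 / 5, 1 / 2, 1],
]
weights, ci, cr = consistency(judgments)
acceptable = cr < 0.1  # threshold stated in the text
```

Power iteration is used only to keep the sketch free of external dependencies; an eigenvalue routine from a numerical library would serve the same purpose.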
Results
Constructing the Initial Evaluation Index System
Using “hospital,” “Medical College,” “Medical University,” “scientific research,” “evaluation,” and “assessment” as keywords, we searched CNKI, Wanfang Data, VIP, CBM, WoS, PubMed, Scopus, and other databases. After removing duplicates, 996 studies were obtained, and after excluding irrelevant literature, 141 studies were included. We sorted and extracted the research performance indicators used by domestic scholars in China according to how frequently each indicator appeared in the literature. Subsequently, we counted the most commonly used indicators, as shown in Figure 1. Based on available studies, practical experience in medical research, and the results of the expert consultation, we drafted the initial evaluation system, which contained 4 primary-, 14 secondary-, and 46 tertiary-level indices.
Figure 1. Description of indicator frequency of scientific research evaluation index in China.
Characteristics of Experts and Distribution of Questionnaires
Basic Information Description of Consulting Experts (n = 20).
Establishment of the Evaluation System and Indices
Expert positive coefficient
All 20 experts surveyed answered both rounds of questionnaires. The RR was therefore 100%, indicating extremely high levels of involvement and attention by the experts in the research project.
Expert authority coefficient
Distribution of expert authority coefficients.
Overall Authority Coefficient of Experts.
Consistency of expert consultation
Coordination Coefficient and Chi-Square Value for the two Rounds of Expert Consultation.
Concentration of expert opinions and coefficient of variation
Two Rounds of Expert Consultation on Assigning the Mean of Necessity and CV of Indices at All Levels.
The coordination coefficient from the second round of expert evaluation was higher than that in the first round. For the second round, the average value of necessity for all indices exceeded 4, whereas the CV was less than .15, except for five of the indices. These values indicate that the experts agreed on the index system. Therefore, the index system has good rationality and practicality.
Modifications to Some Indices
The revision of the indices was based on the statistical results combined with expert opinions. We deleted 2 indices with relatively low research quality: the number of municipal science and technology awards and the number of newly approved funds from enterprises. In the definition of indices, advanced specifications were recognized at the national level. To highlight research transformation, we revised “output” to “awards and patents.” Considering the significance of international exchange and cooperation for academic reputation, we added the following indices under the academic reputation index: “number of international academic conferences held,” “number of participants in international academic conferences,” and “number of international cooperation projects and funding.”
Establishing the hierarchical structure
To establish the goals, rules, and scheme layer relationships, a hierarchical structure, as shown in Figure 2, was developed and divided into 4 layers. The first layer was the target layer: the evaluation performance of scientific research. The second layer was the primary-level index layer ($C_i$) with 7 evaluation dimensions. The third layer included several aspects corresponding to each dimension and belonged to the secondary-level index layer ($A_i$). The fourth layer included specific measurement indices for each aspect that belonged to the tertiary-level index layer ($B_i$).
Figure 2. Hierarchical structure of the research evaluation index system for hospitals.
Constructing the judgment matrices at all levels
Primary-Level Index Judgment Matrix and Weight G.
Weight value of indices
Normalized Weights of Indices at All Levels.
W: Index weight.
Discussion
According to production theory, we constructed an evaluation index system with 2 dimensions: research input and output. To establish a quantitative evaluation index system, we conducted a literature review, a two-round questionnaire survey of experts, and statistical analyses, and we verified the credibility of the results through a consistency test. The final index system included 7 primary-, 14 secondary-, and 37 tertiary-level indices relevant to research-oriented hospitals.
The statistical results showed that among the primary-level indices, the weight of human input was the highest, followed by infrastructural input. Among the tertiary-level indices, the numbers of outstanding talents, graduate students, and post-doctoral staff carried high weights, in line with strategic deployment by the Chinese government, such as the formulation of plans for training outstanding talent, the employment of graduate students as the main research force, and the establishment of post-doctoral mechanisms in universities. Among the secondary-level indices under infrastructural input, the scientific research platform had the highest weight. The highest-weighted tertiary-level indices relevant to infrastructural input were laboratories and sample banks, which underscores the importance of laboratory support in research and the increasing importance given to sample banks for clinical research. At the project level, national projects ranked above provincial projects based on the calculated weights. The total number of citations and the H index were given the highest weights, which reflected the trend of emphasizing “quality” rather than “quantity.” In summary, the higher the level of an indicator, the higher its weight. Hospital health policy makers should realize that the evaluation of scientific research pays more attention to the level and quality of output. In addition, talent input is the key indicator, and talent level, as the most important factor, determines the performance of scientific research.
This system conforms to the status and development trend of scientific research in research-oriented hospitals and is consistent with related theories and practices. Therefore, it is scientific, practical, and reasonable. Subsequently, we will carry out a case study, adopt the optimized EDM model to evaluate the performance of multiple hospitals or hospital specialties with the index system, and identify existing problems in scientific research performance and ways to improve it.30,31
This study had some limitations. The index system was constructed using the Delphi method, which is inherently subjective because it relies primarily on expert opinion. In the selection of indices, we referred to foreign evaluation indices because some excluded indices have no clear domestic definition. Therefore, the method of constructing the evaluation index system and the selection of indices need further investigation. Subsequently, we will conduct follow-up research and annual evaluations and perform a retrospective study in 3–5 years to analyze correlations among the measures and to improve scientific research performance. These additional methods would verify the scientific and rational nature of the evaluation model.
The constructed evaluation index system for the scientific research performance of research-oriented hospitals is aligned with the development trend of science and technology in China and with relevant theories and practices. As a special type of institution, research-oriented hospitals in developing countries have experienced rapid growth in recent years. Thus, the results of this study can be applied to the scientific research evaluation of other hospitals and can guide scientific research investment and resource allocation to improve the quality of scientific research.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by Department of Science and Technology of Sichuan Province, 2021JDR0283.
