Abstract
This article focuses on the critical limitations in current undergraduate major evaluation systems within higher education, notably the scarcity of systematic evaluation at the major level and the inadequacy of singular methodologies in comprehensively reflecting complex quality dimensions. To overcome these challenges, the study constructs a multi-dimensional evaluation framework encompassing teaching resources, faculty caliber, student development, and societal impact, organized hierarchically across four first-level, nine second-level, and forty third-level indicators. The core methodological innovation lies in a combined weighting approach that integrates the Analytic Hierarchy Process (AHP) for subjective expert judgment, the Entropy Weight Method (EWM) for objective data-driven variability analysis, and the Mean-Variance Method (MVM) for statistical dispersion evaluation. This approach optimally synthesizes the strengths of these individual methods, effectively mitigating subjective bias, data noise sensitivity, and discrete deviation inherent in single-method evaluations. Empirical validation using data from a science- and technology-focused university demonstrates that the combined weighting results exhibit superior consistency with external authoritative accreditations (National First-Class Undergraduate Majors and Engineering Education Accreditation), as rigorously confirmed by significant Spearman's rank correlation analysis. The research provides a scientifically rigorous and practically viable solution for higher education institutions seeking to establish robust, multi-dimensional evaluation systems, optimize major structures, and enhance the quality and effectiveness of undergraduate talent cultivation within the context of national higher education reform initiatives.
Plain Language Summary
Purpose: This study developed a more balanced approach to evaluating undergraduate majors by combining subjective expert judgment with objective data analysis, addressing the limitations of single-method evaluations. Methods: We constructed an evaluation indicator system covering four first-level indices: Faculty Team Indicators, Student-Related Indicators, Major Construction Indicators, and Teaching Quality and Curriculum Indicators, which were further divided into nine second-level indicators and 40 third-level indicators. The weights of the indicators were determined by integrating expert scoring, data variability analysis, and statistical dispersion assessment. This combined weighting method was validated using data from 44 majors at a science- and technology-focused university in China. Conclusions: Results showed that majors receiving high comprehensive scores consistently aligned with those recognized as National First-Class Undergraduate Majors or accredited under Engineering Education Accreditation. This strong concordance with external authoritative accreditations confirms the method's reliability in identifying high-quality majors and diagnosing areas for improvement. Implications: This research provides universities with a practical tool to enhance major quality through evidence-based resource allocation, major structure optimization, and support for national higher education reform initiatives.
Introduction
The quality of higher education (HE) and the level of major development constitute core determinants influencing the efficacy of national innovation systems and international competitiveness. Within the policy context of the ongoing “Double First-Class” initiative and the comprehensive implementation of the “Double Ten-Thousand Plan,” China's higher education evaluation system is undergoing a paradigm shift from scale-driven expansion to quality-driven enhancement. Consequently, establishing a scientific, objective, and comprehensive evaluation framework for higher education is of paramount importance (Mok & Marginson, 2021; Xiong et al., 2022). However, traditional singular evaluation models often fall short of addressing contemporary governance demands. Critical research by Ajjawi et al. (2024) on “authentic assessment” theory highlights a widespread lack of innovation in current higher education evaluation methodologies and underscores the urgent need to establish a multidimensional, quantitative evaluation system, particularly at the major level. As the fundamental units of talent cultivation in higher education, majors require quality evaluations that integrate multidimensional data spanning teaching resources, faculty competence, student development, and societal impact. Yet, existing research predominantly focuses on the institutional or individual teacher levels (Bergsmann et al., 2015; X. Yang et al., 2022), creating a significant gap in systematic major-level evaluation. This gap is particularly critical to address, as major-level evaluation introduces distinct managerial challenges, such as guiding intra-institutional resource allocation, managing competition among majors, and balancing standardization with disciplinary uniqueness. The complexity of evaluating major development levels is manifested in three aspects. First, evaluations must encompass diverse elements such as student development, societal demands, and major distinctiveness. Second, a multidimensional framework is needed that concurrently considers inputs, processes, outputs, and impact; moreover, research shows systematic variations in educational attributes across majors, making uniform criteria problematic (Thanassoulis et al., 2017), which necessitates hierarchical indicator structures and flexible weighting mechanisms. Third, data heterogeneity requires integrating subjective qualitative evaluations with objective quantitative metrics. This inherent complexity renders any singular evaluation methodology inadequate for comprehensively and accurately reflecting a major's overall standing. More critically, it challenges prevailing mainstream evaluation paradigms on a global scale. For instance, international large-scale assessments (e.g., PISA, TIMSS), while providing comparable data, often reduce multidimensional educational quality to a single dimension through their ranking logic, attracting widespread critique of “assessment reductionism” and “data fetishism” (Addey et al., 2017; Appels et al., 2024).
At the national level, examples such as the selective compliance observed in Portugal when aligning with the European Standards and Guidelines (ESG) (Cardoso et al., 2015), or the strategic responses of disadvantaged community schools in Chile to ratings from the national educational quality assessment system (SIMCE) that are applied unfairly owing to contextual disparities (Contreras et al., 2024), reveal a profound disconnection between rigid evaluation frameworks and diverse local practices. This disconnection leads to evaluations that are either superficial or produce perverse incentives. Therefore, developing a new evaluation framework requires a core mission beyond methodological integration: to construct an evaluation mechanism capable of accommodating pluralistic values, adapting to different institutional contexts, and achieving a dynamic balance between standardization and contextualization.
Literature Review
Through systematic retrieval of the Web of Science and CNKI core journals using keywords such as “higher education quality” and “undergraduate major evaluation,” seminal works were identified to synthesize relevant research findings concerning the construction of evaluation indicator systems and the application of evaluation methodologies. A comprehensive summary of the 29 studies included in this literature review is presented in Table 1.
Table 1. Summary of Included Studies.
Research on Evaluation Indicator Systems
As synthesized in Table 1, research on evaluation indicator systems has evolved from single-dimensional to multi-dimensional approaches. Early studies primarily focused on singular dimensions, such as teaching effectiveness or student satisfaction (Carle, 2009; Milsom & Coughlin, 2015). In contrast, recent research tends to construct comprehensive indicator systems encompassing inputs, processes, and outputs (Jiang & Cao, 2021; Wang, 2022; Zhang et al., 2023). Meta-analytic research by Harrison et al. (2022) indicates that high-quality educational evaluation should encompass multiple facets, including teaching practices, learning outcomes, resource conditions, and societal impact. This perspective resonates with the “authentic assessment” framework proposed by Ajjawi et al. (2024).
Regarding the structure of indicator hierarchies, researchers commonly adopt a hierarchical design approach. Do et al. (2024), in their research on performance evaluation of university faculty in Vietnam, established a multi-layered evaluation framework encompassing main criteria and sub-criteria. This hierarchical methodology has also been applied by Dai et al. and Ding et al. in the evaluation of smart learning environments and higher education systems (Dai et al., 2021; Ding et al., 2023). Furthermore, the teacher professional ethics performance evaluation indicator system constructed by Xia and the professional competence training evaluation system for police colleges proposed by Wang both employed a three-level indicator structure (Wang, 2022; Xia, 2021), demonstrating the effectiveness of multi-level indicator systems in addressing complex evaluation problems. Nevertheless, existing indicator systems still exhibit shortcomings in major evaluation applications: (1) balancing comprehensiveness and practicality in the number of indicators remains difficult; (2) the integration of qualitative and quantitative indicators is often inadequate; and (3) correlations between indicators receive insufficient consideration.
Research on the Application of Evaluation Methods
As summarized in Table 1, evaluation methodologies have progressed from singular approaches to integrated frameworks. Numerous researchers have explored the applicability of single methodologies in evaluating higher education. The Analytic Hierarchy Process (AHP) method has gained widespread adoption due to its explicit hierarchical structure. For instance, Deng et al. utilized AHP to determine the weights for dimensions encompassing Moral, Intellectual, Physical, Aesthetic, and Labor Education (“Five Educations in Parallel”) within their constructed postgraduate evaluation system (Deng et al., 2024). However, Melón et al. (2008) found that the pure AHP approach, when applied in group decision-making contexts, can be susceptible to instability stemming from divergences in expert opinions (Melón et al., 2008). Additionally, the Entropy Weight Method (EWM) has been introduced into higher education major evaluation for its characteristic objectivity (Zhang, 2021). However, Bai and Wan (2025) observed that this method, when applied to innovation and entrepreneurship education evaluation, exhibits an excessive reliance on the degree of data dispersion (Bai & Wan, 2025). While each singular method offers distinct advantages, their significant limitations have motivated a shift toward integrated methodological approaches.
To overcome the limitations of single-method models, researchers are increasingly developing integrated evaluation frameworks that combine multiple technical approaches. For example, Cao et al. innovatively combined Principal Component Analysis (PCA) with the EWM to construct a quantitative model assessing higher education quality and sustainability (Cao et al., 2023). This approach achieved dimensionality reduction of indicators while preserving core informational content. Similarly, Qi et al. employed a hybrid method combining the AHP with the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) to conduct stratified evaluations of higher education institutions within the Yangtze River Delta region (Qi et al., 2022): indicator weights were determined using the AHP, followed by multi-dimensional comparative ranking via TOPSIS. The paradigm innovation within this line of research is further exemplified by the integrative application of methodologies. For instance, Gu fused Backpropagation (BP) Neural Networks with a stochastic matrix algorithm to develop a dynamic evaluation model for teaching quality (Gu, 2022); the model's prediction accuracy of 94.9% validated the superiority of the hybrid algorithmic approach. Furthermore, Xin et al. (2022) established a big data closed-loop evaluation system integrating teaching evaluation, learning evaluation, and effectiveness evaluation. This system effectively addressed the issue of data silos inherent in traditional singular evaluation methods through multi-source data fusion.
Recent research advances demonstrate that integrated subjective-objective weighting methods exhibit significant advantages in multi-faceted evaluation scenarios. Within the engineering and technical domain, Y. Yang et al. (2024) developed an integrated AHP-TOPSIS-RSR (rank-sum ratio) model for evaluating distributed photovoltaic grid integration. This approach incorporated subjective weighting to reflect expert judgments regarding grid stability, while objective weighting captured the inherent characteristics of photovoltaic output data. The research empirically validated that this methodology effectively handles multi-dimensional indicators within complex systems. Analogously, Chen et al. integrated PCA and the AHP for power distribution equipment evaluation (Chen et al., 2019). By optimizing weight assignment using the relative entropy principle, they successfully resolved the typical issues of “excessive subjective bias” or “loss of data information” prevalent in traditional evaluations. These cross-disciplinary cases illustrate that the combination weighting method can systematically integrate domain-specific expertise with data-driven features.
Within the domain of higher education evaluation, this innovative model demonstrates unique value. Wu et al., addressing the inherent ambiguity in learning evaluation, employed an AHP-Fuzzy Comprehensive Evaluation (AHP-FCE) method to construct a multi-level indicator system (Wu et al., 2022). Their hierarchical weighting mechanism significantly enhanced the precision of undergraduate student learning outcome evaluation. Similarly, Xiao et al. (2024) integrated the Plan-Do-Check-Act (PDCA) cycle framework with combined subjective-objective weighting for teaching quality evaluation in vocational colleges. Their model incorporated a dynamic calibration mechanism, offering a novel solution for handling time-sensitive indicators in educational evaluation. Together, these applications demonstrate the promise of integrated weighting methods in higher education. Such approaches help overcome the limitations of traditional models, which often rely on fragmented indicators and uniform standards. Importantly, they preserve expert knowledge from subjective evaluation while leveraging the objectivity of data-driven algorithms, effectively addressing the complexity of multi-dimensional indicators. This dual capability provides robust methodological support for building multi-level major evaluation systems in higher education. It is especially suitable for integrating and analyzing the diverse data—such as teaching inputs, outputs, and efficacy outcomes—that are central to undergraduate major evaluation.
Research Content of This Study
As shown in Figure 1, the technology route of this research focuses on undergraduate major evaluation within the framework of higher education evaluation. To address the identified limitations in extant research, namely the scarcity of systematic evaluation at the undergraduate major level and the inadequacy of singular evaluation methods in comprehensively and accurately reflecting the overall quality of undergraduate majors, we construct a multi-level evaluation system comprising 4 first-level indices, 9 second-level indices, and 40 third-level indices. These indicators encompass multidimensional facets, including teaching resources, faculty quality, student development, and societal impact, and are designed to address the inherent complexity characterized by multifaceted evaluation elements, diverse dimensions, and heterogeneous data types. The research methodology involves calculating weights for indicators at each level using the AHP, the EWM, and the Mean-Variance Method (MVM), respectively. This three-method fusion is proposed to overcome the observed limitations of existing hybrid models. For instance, while the AHP-EWM combination integrates subjectivity and objectivity, its EWM component remains sensitive to data noise; meanwhile, the prevalent AHP-TOPSIS model relies heavily on the initial AHP weights, potentially amplifying subjective bias. Our AHP-EWM-MVM model introduces MVM's statistical dispersion perspective to establish a more robust triangular validation mechanism. Subsequently, a combination weighting method synthesizing the three approaches is applied to assign integrated weights to each indicator. Utilizing empirical data, comprehensive scores for individual majors are computed and subjected to comparative analysis, enabling a systematic comparison of the weight distributions derived from different weighting methods and the resultant variations in major composite scores. Simultaneously, we verify whether majors with higher composite scores have obtained the status of National First-Class Undergraduate Majors and passed Engineering Education Accreditation, thereby examining the correlation between evaluation outcomes and the effectiveness of major development. This research proposes determining indicator weights for higher education major evaluation through a combination weighting method that integrates domain expertise with data-driven features. The core proposition is formalized as the following testable hypothesis: H1: The evaluation system constructed based on the combined weighting method demonstrates high concordance with external authoritative accreditations of undergraduate major quality. This approach mitigates subjective bias in traditional single-method evaluations and addresses data information loss, thereby providing methodological references and practical pathways for optimizing undergraduate major structures and establishing a scientific, comprehensive evaluation system.

Figure 1. Technology route of this research.
Research Framework and Methodology
Construction of the Evaluation Indicator System for Undergraduate Majors
This research integrates relevant educational theories and practical requirements to construct a specialized evaluation indicator system for undergraduate majors. The overall objective is the evaluation of university undergraduate majors. The criteria level comprises four primary indicators: Faculty Team Indicators, Student-Related Indicators, Major Construction Indicators, and Teaching Quality and Curriculum Indicators. These primary indicators are further subdivided into nine secondary indicators. For instance, Faculty Team Indicators include “Talent and Professional Title-related” and “Teaching and Research Capabilities”, while Student-Related Indicators include “Admissions and Academic Performance” and “Comprehensive Competency and Employment.” Each secondary indicator is then divided into specific tertiary indicators, totaling 40 measurable elements. This number was determined through systematic literature analysis, expert consultation, and feasibility evaluation to balance comprehensiveness with practical data accessibility and operational cost. Examples include “National and Provincial-level Talents” and “Proportion of associate-professor or higher titles” under “Talent and Professional Title-related,” as detailed in Table 2. Data sources encompass the university's teaching management systems, professional development reports, student performance databases, and relevant educational statistics, ensuring the comprehensiveness, accuracy, and reliability of the data.
Table 2. Undergraduate Major Evaluation Indicator System.
Note. Operational definitions for representative third-level indicators:
- National and provincial-level talents: Quantitative count of full-time faculty recognized by specific high-level talent programs (e.g., academicians, National Science Fund for Distinguished Young Scholars recipients, Chang Jiang Scholars, provincial talent program participants).
- Average value of experimental equipment per student: Total net value of program-dedicated experimental equipment (CNY) divided by full-time undergraduate enrollment in the program. Data sourced from institutional asset management and student information systems.
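To make the hierarchy concrete, the sketch below encodes an illustrative subset of the Table 2 structure as a nested mapping. This is a minimal sketch: the subset selection and the Python names are ours for illustration, not part of the published indicator system.

```python
# Illustrative subset of the hierarchical indicator system (Table 2);
# the full system has 4 first-level, 9 second-level, and 40 third-level indicators.
INDICATOR_SYSTEM = {
    "Faculty Team Indicators": {
        "Talent and Professional Title-related": [
            "National and provincial-level talents",
            "Proportion of associate-professor or higher titles",
        ],
        "Teaching and Research Capabilities": [
            "Average number of textbooks compiled per person",
            "Proportion of teaching projects led by teachers",
        ],
    },
    "Student-Related Indicators": {
        "Admissions and Academic Performance": [
            "First-choice enrollment rate",
            "Rate of major transfer",
        ],
        "Comprehensive Competency and Employment": [
            "Employment rate",
            "Student satisfaction",
        ],
    },
}

# Flatten the nesting to the third-level indicator list over which weights are computed.
third_level = [ind for subs in INDICATOR_SYSTEM.values()
               for inds in subs.values() for ind in inds]
```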
Dataset and Evaluation Methodology
This research selected 44 majors from a science- and technology-focused university in China as the research subjects, establishing a comprehensive evaluation indicator system and conducting a systematic major evaluation. The data for this research were sourced from the university’s “2019 to 2022 University Undergraduate Teaching Quality Report” and its “Undergraduate Major Evaluation Center Database.” Subsequently, the aforementioned data were collated and standardized using the following min-max formula (shown for benefit-type indicators; the complementary form is used for cost-type indicators):

$$x'_{ij} = \frac{x_{ij} - \min_{i} x_{ij}}{\max_{i} x_{ij} - \min_{i} x_{ij}}$$

where $x_{ij}$ denotes the raw value of the $j$-th indicator for the $i$-th major and $x'_{ij}$ its standardized value. Following the determination of indicator weights, the composite score $S_i$ of the $i$-th major is computed as the weighted sum of its standardized indicator values:

$$S_i = \sum_{j=1}^{m} w_j x'_{ij}$$

where $w_j$ is the combined weight of the $j$-th indicator and $m = 40$ is the number of third-level indicators.
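As a minimal illustration of the two formulas above, the following Python sketch (assuming numpy; function and variable names are ours) standardizes a raw data matrix and computes weighted composite scores:

```python
import numpy as np

def min_max_standardize(X, positive):
    """Min-max standardization per indicator (column).

    X        : (n_majors, m_indicators) raw data matrix.
    positive : boolean sequence, True for benefit-type indicators,
               False for cost-type indicators (reversed so larger is better).
    """
    X = np.asarray(X, dtype=float)
    pos = np.asarray(positive, dtype=bool)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    Z = (X - lo) / span
    Z[:, ~pos] = 1.0 - Z[:, ~pos]
    return Z

def composite_scores(Z, w):
    """Composite score S_i = sum_j w_j * x'_ij (weighted sum per major)."""
    return Z @ np.asarray(w, dtype=float)
```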
Determination of Subjective Indicator Weights Based on AHP
Principle of the Analytic Hierarchy Process
The Analytic Hierarchy Process (AHP), proposed by the American operational researcher T. L. Saaty, determines the subjective weights of indicators by decomposing a complex problem into multiple hierarchical levels, such as the objective level, criteria level, and indicator level. It involves constructing judgment matrices, calculating the relative importance weights of elements at each level, and conducting a consistency check to ensure the rationality of the weights. The formula for the judgment matrix consistency check is as follows:
$$CI = \frac{\lambda_{\max} - n}{n - 1}, \qquad CR = \frac{CI}{IR}$$

where $\lambda_{\max}$ is the largest eigenvalue of the judgment matrix, $n$ is its order, $CI$ is the consistency index, and $IR$ is the random index, whose values are listed in Table 3. A judgment matrix is considered acceptably consistent when $CR < 0.1$.
Table 3. The Values of IR.
Construction of Judgment Matrices
This research constructed a total of 14 judgment matrices. Specifically, one judgment matrix was formulated for the four subsystems (first-level indices), and one judgment matrix was developed for each set of indicator factors beneath each of these four subsystems. The expert panel was meticulously selected based on three criteria: (1) over ten years of experience in higher education evaluation, (2) familiarity with the characteristics of science and engineering majors, and (3) substantial experience in major development and management. For the indicators at each hierarchical level, these experts were invited to pairwise evaluate the relative importance of the indicator elements using Saaty’s 1 to 9 scale method, thereby constructing the judgment matrices. Taking the four subsystems (first-level indices) as an example, experts compared and scored the relative importance of the Faculty Team Indicators, Student-Related Indicators, Major Construction Indicators, and Teaching Quality and Curriculum Indicators. This process yielded a comprehensive judgment matrix for these first-level indices.
Consistency Test
Consistency tests were performed on each judgment matrix using the consistency test formula to ensure that the matrices met the required consistency standards. The computational results demonstrate that the $CR$ values of all 14 judgment matrices are below 0.1, indicating that every matrix satisfies the consistency requirement.
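The AHP computation described above can be sketched compactly in Python, using the principal eigenvector for the weights and the $CR < 0.1$ rule. The example judgment matrix below is hypothetical, not one of the paper's 14 matrices:

```python
import numpy as np

# Saaty's random index IR for judgment matrices of order 1..9.
IR = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(A):
    """Return (weights, CR) for a pairwise judgment matrix A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)              # principal eigenvalue
    lam_max = eigvals.real[k]
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                          # normalized priority weights
    CI = (lam_max - n) / (n - 1) if n > 1 else 0.0
    CR = CI / IR[n] if IR[n] > 0 else 0.0
    return w, CR

# Hypothetical 4x4 judgment matrix for the four first-level indices.
A = np.array([[1,   1,   1/2, 2],
              [1,   1,   1/2, 2],
              [2,   2,   1,   3],
              [1/2, 1/2, 1/3, 1]])
w, CR = ahp_weights(A)
assert CR < 0.1, "judgment matrix fails the consistency test"
```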
Determination of Subjective Weights
Following the aforementioned procedures, the subjective weights for the criteria level and the various indicators within the undergraduate major evaluation system were determined.
Determination of Objective Indicator Weights Based on the Entropy Weight Method
Principle of the Entropy Weight Method
The Entropy Weight Method is grounded in information entropy theory, which measures the degree of disorder or uncertainty in a system. In the context of undergraduate major evaluation, an indicator with high variability across majors possesses strong discriminatory power, corresponding to lower information entropy and a higher assigned weight. Conversely, an indicator with low variability carries less useful information, resulting in higher entropy and a lower weight.
Calculation Procedure
1) Data Normalization
Owing to the differing dimensions and scales among the various evaluation indicators, data normalization is essential to ensure comparability. We apply the following normalization formula:
$$x'_{ij} = \frac{x_{ij} - \min_{i} x_{ij}}{\max_{i} x_{ij} - \min_{i} x_{ij}} \;\; \text{(benefit-type indicators)}, \qquad x'_{ij} = \frac{\max_{i} x_{ij} - x_{ij}}{\max_{i} x_{ij} - \min_{i} x_{ij}} \;\; \text{(cost-type indicators)}$$

where $x_{ij}$ denotes the raw value of the $j$-th indicator for the $i$-th major ($i = 1, \ldots, n$; $j = 1, \ldots, m$) and $x'_{ij}$ denotes its normalized value.
2) Calculate the Proportion of Each Major Under Each Indicator
After normalization, the proportion $p_{ij}$ of the $i$-th major under the $j$-th indicator is calculated as:

$$p_{ij} = \frac{x'_{ij}}{\sum_{i=1}^{n} x'_{ij}}$$
3) Calculate the Entropy Value of Each Indicator
According to information entropy theory, information entropy measures the degree of disorder or uncertainty in information. In multi-indicator evaluation, the information entropy $e_j$ of the $j$-th indicator is calculated as:

$$e_j = -\frac{1}{\ln n} \sum_{i=1}^{n} p_{ij} \ln p_{ij}$$

where $n$ is the number of majors and the term $p_{ij} \ln p_{ij}$ is taken to be 0 when $p_{ij} = 0$. A lower entropy indicates greater variability of the indicator across majors and thus stronger discriminatory power.
4) Calculate the Divergence Coefficient of Each Indicator
To more intuitively reflect the relative importance of each indicator, the divergence coefficient $d_j$ of the $j$-th indicator is defined as:

$$d_j = 1 - e_j$$

A larger $d_j$ corresponds to a more informative indicator.
5) Calculate the Weight of Each Indicator
Based on the divergence coefficients, the weight $w_j^{\mathrm{EWM}}$ of the $j$-th indicator is calculated as:

$$w_j^{\mathrm{EWM}} = \frac{d_j}{\sum_{j=1}^{m} d_j}$$
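The five EWM steps condense into a short routine. This is a sketch under the formulas above (numpy assumed; the small eps guard for zero proportions is our implementation choice, not part of the method's definition):

```python
import numpy as np

def entropy_weights(Z, eps=1e-12):
    """Entropy Weight Method on normalized data Z (n majors x m indicators).

    Follows the text: proportion p_ij -> entropy e_j -> divergence d_j -> weight w_j.
    """
    Z = np.asarray(Z, dtype=float)
    n = Z.shape[0]
    P = Z / (Z.sum(axis=0, keepdims=True) + eps)         # p_ij per indicator
    E = -(P * np.log(P + eps)).sum(axis=0) / np.log(n)   # e_j in [0, 1]
    d = 1.0 - E                                          # divergence coefficient d_j
    return d / d.sum()                                   # w_j^EWM
```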
Determination of Objective Indicator Weights Based on the Mean-Variance Method
Principle of the Mean-Variance Method
The fundamental concept of the Mean-Variance Method (MVM) stems from statistical variance theory. Variance acts as a metric to measure the extent of differences among data values. In the evaluation of undergraduate majors, when the data of an indicator vary widely among different majors, it implies that this indicator can effectively distinguish the strengths and weaknesses of various majors. As a result, such an indicator with a higher variance is deemed more informative and is assigned a greater weight in the evaluation system. On the contrary, if the data for an indicator shows minimal variation, it contributes little to differentiating majors, leading to a lower weight assignment. By calculating and normalizing these variances, MVM produces an objective, scientifically grounded weighting system. This process minimizes subjective influence, supporting more accurate and reliable evaluation outcomes.
Calculation Procedure
1) Calculate the Indicator Mean: For the $j$-th indicator, its mean value across the $n$ majors is calculated as:

$$\bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x'_{ij}$$
2) Calculate the Indicator Standard Deviation: The standard deviation $s_j$ of the $j$-th indicator is calculated as:

$$s_j = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( x'_{ij} - \bar{x}_j \right)^2}$$
3) Calculate Indicator Weights: Based on the standard deviations, the weight $w_j^{\mathrm{MVM}}$ of the $j$-th indicator is calculated as:

$$w_j^{\mathrm{MVM}} = \frac{s_j}{\sum_{j=1}^{m} s_j}$$
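A corresponding MVM sketch, weighting each indicator by its share of the total standard deviation (population form, matching the formula above):

```python
import numpy as np

def mean_variance_weights(Z):
    """Mean-Variance Method: w_j = s_j / sum_j s_j on normalized data Z."""
    s = np.std(np.asarray(Z, dtype=float), axis=0)  # population std per indicator
    return s / s.sum()
```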
Calculation of Combined Weights
In undergraduate major evaluation, the scientific nature of the weighting system directly influences the reliability of the evaluation results. However, the exclusive use of subjective weighting methods, such as the AHP, while capable of incorporating expert experience and capturing logical relationships between indicators, is susceptible to subjective human biases, resulting in weights lacking objectivity and proving inflexible across differing national policy contexts. Conversely, objective weighting methods like the EWM and the MVM, while reliant on data and capable of determining weights based on indicator variability, may overlook the practical significance of indicators and their relevance to professional evaluation needs, leading to weights misaligned with talent development objectives. Therefore, this research integrates subjective weighting (AHP) and objective weighting methods (EWM, MVM) to assign combined weights to each evaluation indicator. This combined approach inherently provides a conflict-resolution mechanism. When significant discrepancies arise between subjective and objective weights for an indicator, the minimum entropy principle does not simply prioritize one source. Instead, it seeks an optimal compromise by minimizing the total information loss relative to both weight sets. This process effectively mitigates the influence of potential subjective bias or extreme data noise, resulting in a more balanced and robust weighting system.
Its adaptive logic operates at two levels. First, at the indicator level, the hierarchical system functions as an open architecture adaptable to context. For instance, within Germany's strong competency-based education tradition, tertiary indicators under Teaching Process Quality could be enhanced accordingly (Zlatkin-Troitschanskaia, 2021). In Nordic countries, where educational equity is a pronounced priority, observation points related to outcome disparities among student groups could be incorporated (Corral-Granados et al., 2025). Second, at the weighting level, the AHP-EWM-MVM mechanism serves as an adaptive engine. The AHP module internalizes locally relevant values by translating regional priorities, such as the emphasis on research impact versus teaching innovation, into weighted judgments through structured expert consultation. The EWM and MVM modules respond to the local data ecology by automatically reflecting the actual distribution and variation of institutional performance data within the region. The final composite weights, derived via the minimum entropy method, thus represent a context-sensitive optimization that reconciles normative priorities with empirical evidence. Consequently, the same methodological framework could yield a weight structure emphasizing student satisfaction and employment outcomes in one national context, and curricular rigor and faculty research achievement in another, enabling a form of contextually intelligent evaluation.
Calculation Method
After obtaining the subjective weights ($w_j^{\mathrm{AHP}}$) and the objective weights ($w_j^{\mathrm{EWM}}$, $w_j^{\mathrm{MVM}}$), the combined weight $w_j$ is determined by the minimum entropy principle, which seeks the weight vector with the smallest total relative information entropy with respect to the three source weight sets:

$$\min F = \sum_{j=1}^{m} w_j \left( 3 \ln w_j - \ln w_j^{\mathrm{AHP}} - \ln w_j^{\mathrm{EWM}} - \ln w_j^{\mathrm{MVM}} \right), \qquad \text{s.t.} \;\; \sum_{j=1}^{m} w_j = 1, \; w_j > 0$$

Solving this optimization with a Lagrange multiplier yields the normalized geometric mean of the three weight sets:

$$w_j = \frac{\left( w_j^{\mathrm{AHP}} \, w_j^{\mathrm{EWM}} \, w_j^{\mathrm{MVM}} \right)^{1/3}}{\sum_{j=1}^{m} \left( w_j^{\mathrm{AHP}} \, w_j^{\mathrm{EWM}} \, w_j^{\mathrm{MVM}} \right)^{1/3}}$$
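The geometric-mean solution can be implemented directly. A sketch, assuming the three weight vectors come from the routines in the preceding sections:

```python
import numpy as np

def combine_min_entropy(*weight_sets):
    """Minimum relative-information-entropy combination of several weight
    vectors; the Lagrangian solution is a normalized geometric mean."""
    W = np.vstack(weight_sets)                   # (k methods, m indicators)
    g = np.exp(np.log(W + 1e-12).mean(axis=0))   # geometric mean per indicator
    return g / g.sum()

# Usage with the earlier sketches (w_ahp, w_ewm, w_mvm are m-vectors):
# w_combined = combine_min_entropy(w_ahp, w_ewm, w_mvm)
```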
Results Analysis
Evaluation Indicator Weight Analysis
Based on the aforementioned evaluation indicator system, weights were calculated using three methods: the Analytic Hierarchy Process (AHP), the entropy weight method (EWM), and the Mean-Variance Method (MVM). Combined weights were then computed employing the minimum entropy method. The resultant weights for each indicator are presented in Tables 4 to 6 and Figures 2 to 4.
Table 4. Weight Values of the First-Level Indices of the Major Evaluation Under Different Methods.
Table 5. Weight Values of the Second-Level Indices of the Major Evaluation Under Different Methods.
Table 6. Weight Values of the Third-Level Indices of the Major Evaluation Under Different Methods.

Figure 2. The weight distributions of the first-level indices of the major evaluation.

Figure 3. The weight distributions of the second-level indices of the major evaluation.

Figure 4. The weight distributions of the third-level indices of the major evaluation.
The analysis of the computational results indicates that each of the three individual weighting methods exhibits inherent limitations, such as subjective bias, data noise, and discrete deviation. The AHP, reliant on expert subjective judgment, assigned a notably low weight (0.2229) to the “Major Construction Indicators,” posing a risk of deviation in the valuation of key indicators due to subjective evaluation discrepancies. The EWM calculates weights based on the degree of data variability, which is susceptible to interference from data noise and may thus fail to truly reflect the importance of indicators; this issue is manifested in the obvious anomaly of the weight of “Average number of textbooks compiled per person” (0.0895) compared with the results calculated by the other methods. The MVM, constrained by data discrete deviation, may lead to imbalanced weight allocation, as is particularly pronounced in the substantial disparity between the weights of the third-level indicators “Engineering education accreditation” (0.1192) and “Proportion of teachers involved in guiding student projects” (0.0029). In contrast, the combination weighting method serves as a robust conflict-resolution mechanism: through mathematical optimization, it achieves an effective balance between subjective and objective weights. For instance, the high combined weight for “Engineering education accreditation” (0.0620) is a result of this balancing. The AHP weight (0.0314) reflects its strategic importance as recognized by experts, while the high MVM weight (0.1192) captures its significant dispersion in the dataset; the combination method rationally integrates these two perspectives, yielding a weight that respects both expert judgment and objective data characteristics. When confronted with significant disparities or extreme values, this approach more accurately reflects the actual importance of each indicator within undergraduate major evaluation. Consequently, employing the combination weighting method to obtain indicator weights is of paramount significance for enhancing the quality of undergraduate major evaluation.
The results of combined weighting present a clear hierarchical structure of indicator values. At the level of first-level indices, the weight of “Faculty Team Indicators” is 0.2565, that of “Student-Related Indicators” is 0.2509, and that of “Major Construction Indicators” is 0.2886. These weights are relatively high and close to one another, indicating that all these aspects occupy a core position in undergraduate major evaluation; the slightly higher weight of “Major Construction Indicators” reflects its critical leading role in the overall development of the major. Within the second-level indices, “Teaching and Research Capabilities” (0.1338) carries the highest weight among faculty-related metrics, highlighting the critical nexus between enhancing faculty pedagogical and research competencies and major development. Correspondingly, among the second-level indicators concerning student development, the weight for “Comprehensive Competency and Employment” (0.1181) validates the substantial emphasis placed on cultivating students' holistic competencies and tracking graduate employment outcomes. Delving deeper into the third-level indices, metrics such as “Average number of textbooks compiled per person” (0.0386) and “Proportion of teaching projects led by teachers” (0.0351) carry significant weight within the faculty development sub-dimensions, identifying them as key leverage points for elevating faculty quality standards. Among the third-level indicators pertaining to student quality, “First-choice enrollment rate” (0.0267) and “Rate of major transfer” (0.0350) are considered effective measures of a major's attractiveness.
Data analysis indicates that the development of majors in higher education institutions must be grounded in a high-caliber teaching faculty. Efforts should be directed toward introducing and cultivating high-level talents at the national and provincial levels, optimizing the structure of professional titles and academic degrees, and enhancing faculty members’ capabilities in teaching, research, and industry engagement. Concurrently, it is imperative to focus on the entire process of student development: improving the quality of student recruitment by enhancing the attractiveness of academic programs, strengthening academic guidance, practical training, and comprehensive quality cultivation, so as to promote students’ all-around development and high-quality employment.
Regarding major construction, institutions should prioritize the establishment of distinctive features through authoritative accreditation, increase investment in teaching resources, and optimize laboratory/practical training facilities alongside faculty allocation. The enhancement of teaching quality hinges critically on constructing an effective evaluation system, strengthening core curricula and practical teaching components, and stimulating the production of high-quality teaching outcomes. Overall planning must scientifically allocate resources based on assigned indicator weights, promote the coordinated advancement of faculty development, student growth, major development, and teaching improvement, and establish a dynamic adjustment mechanism to accommodate evolving societal demands.
In summary, this research employs a combined weighting method to optimize indicator weight assignment, thereby facilitating institutional efforts towards the continuous refinement of quality assurance systems and the innovation of dynamic regulation mechanisms for the major evaluation. Higher education institutions should adopt a strategic foresight, grounding their endeavors in a high-caliber teaching faculty and prioritizing student development. By strengthening major construction and quality enhancement through scientific planning and dynamic adjustment, they can effectively contribute to the high-quality development of higher education.
Association Test
Self-evaluation serves as a critical instrument for monitoring the quality of major development. Through quantitative analysis of indicators encompassing teaching resources, student development, faculty caliber, and major construction, it systematically delineates the current state of major advancement, enabling the identification of strengths, diagnosis of weaknesses, and provision of evidence-based rationale for resource allocation. However, the validity of self-evaluation outcomes necessitates verification through external authoritative evaluation to ascertain whether they accurately reflect the actual effectiveness of major development. To establish objective benchmarks, this research utilizes two external authoritative certifications: “National First-Class Undergraduate Majors” and “Engineering Education Accreditation.” The composite scores for each major, calculated per the methodology in Section “Dataset and Evaluation Methodology,” are then subjected to association tests and comparative analysis.
The evaluation results (Table 7) indicate that majors such as Chemical Engineering and Technology, Materials Forming and Control Engineering, and Safety Engineering consistently rank highly across various methodologies. Notably, Chemical Engineering and Technology secured the top position under all four evaluation methods. This preeminence aligns well with its dual-accredited status as both a National First-Class Undergraduate Major and an Engineering Education Accreditation recipient. Conversely, majors including Urban and Rural Planning, and Emergency Technology and Management consistently ranked lowest across all four methodologies. These majors generally lack authoritative external accreditation and demonstrate conspicuous deficiencies in critical indicators such as “Faculty Team Indicators” and “Student-Related Indicators.”
Table 7. Comprehensive Evaluation Scores for Selected Majors at a Science- and Technology-Focused University in China.
The data in Table 8 and the subsequent quantitative analysis demonstrate the consistency between the evaluation results obtained via the combined weighting method and external accreditations. To quantify this alignment, we computed Spearman's rank correlation coefficients between the major rankings produced by each evaluation method in this study and the authoritative external certifications, namely National First-Class Undergraduate Majors (NFCM), Engineering Education Accreditation (EEA), and the composite of these two external certifications (labeled “Above”). The corresponding results are presented in Table 9 and support three observations. (1) A consistent trend is evident between the major rankings and external certifications, with higher absolute values of ρ indicating stronger correlation and agreement. (2) All ρ values are negative (ranging from −.82 to −.62), as expected given that a smaller rank number denotes a better-ranked, higher-quality major; with p-values far below .001, these results demonstrate strong concordance with external evaluations, validating that the evaluation results of this study align well with objective reality. (3) Among the external benchmarks, the composite measure incorporating both certifications exhibits the largest absolute ρ value (−.82) and the smallest p-value, indicating that the combined weighting results agree most closely with the joint external benchmark.
Table 8. Comprehensive Evaluation Scores of the Top 10 and Bottom 10 Majors at a Science- and Technology-Focused University in China.
Table 9. Spearman’s Correlation Between Major Rankings and External Accreditations.
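The association test itself is reproducible with standard tooling. Below is a sketch using scipy.stats.spearmanr on hypothetical data (rank 1 = best major; accreditation coded 1/0), illustrating why better-ranked majors yield a negative ρ against accreditation status:

```python
from scipy.stats import spearmanr

# Hypothetical illustration, not the paper's 44-major dataset.
ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]       # composite-score rankings
accredited = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]    # NFCM / EEA status (1 = certified)
rho, p = spearmanr(ranks, accredited)
print(f"rho = {rho:.2f}, p = {p:.3g}")         # negative rho: better rank, more accreditation
```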
To further validate the efficacy of the combination weighting model in evaluating the student development dimension, a supplementary analysis was conducted using two key metrics: “Employment Rate” and “Student Satisfaction.” The results demonstrate a clear positive association between the comprehensive rankings derived from the combination weighting method and these two indicators. Among the top 10 ranked majors, the average employment rate was 95.2% and the average student satisfaction was 92.4%; the bottom 10 ranked majors had corresponding averages of 83.1% and 79.5%. This clear gradient indicates that the combination weighting method effectively captures core information reflecting student achievement and experience. The evaluation results are thus not only highly consistent with external accreditations but also accurately diagnose the intrinsic quality of majors regarding student development, further confirming the model's comprehensive advantage in integrating multi-dimensional information.
Synthesizing the above analysis, the self-evaluation results derived from the combined weighting method employed in this research demonstrate a high degree of concordance with external authoritative accreditations, as rigorously validated by significant Spearman’s rank correlations. Most importantly, these results provide strong empirical support for our research hypothesis (H1). This validates the reliability of the proposed method and offers an efficient and robust solution for calculating indicator weights in major evaluation under multi-source information fusion.
Conclusions and Recommendations
Conclusions
This study addresses the complex and multidimensional challenge of evaluating undergraduate major quality by developing a combined weighting model that integrates subjective expertise with objective data. The proposed methodology tackles the universal difficulty of balancing expert judgment with quantitative evidence, offering a robust solution applicable across diverse international contexts. Empirical evidence indicates that the AHP-EWM-MVM-based evaluation framework effectively synthesizes expert knowledge with data characteristics, and its results demonstrate significant consistency with external authoritative accreditations.
The deeper methodological contribution of this study lies in the distinct transferability embedded in its design. In response to the pervasive tensions in global higher education evaluation—such as those between quality and equity, standardization and contextualization, along with the potential distortive effects of high-stakes evaluation—this framework establishes a flexible and modular foundation for generating solutions. Its modular indicator system allows for the replacement and reprioritization of indicators based on national contexts. The core of its dynamic combined weighting mechanism ensures that evaluation weights are grounded in the specific educational ecology by facilitating a transparent and mathematical negotiation between local value judgments and local data reality.
The findings demonstrate that the combined weighting method effectively mitigates the limitations inherent in single-method evaluations. By synthesizing the domain-knowledge logic of AHP with the data-driven features of EWM and MVM, it provides a scientifically rigorous yet practical approach. Furthermore, the model implements the principles of authentic assessment. Its multidimensional indicator system—encompassing teaching resources, faculty caliber, student development, and societal impact—reflects a comprehensive view of educational quality. The integration of subjective and objective weighting embodies the principle of synthesizing multiple sources of evidence, while its capacity for contextualized evaluation aligns with the situated nature of authentic assessment.
Beyond systematically representing core quality dimensions, the model proves effective in diagnosing major development outcomes. Moreover, the methodological framework is inherently generalizable, allowing for the development of customized evaluation schemes tailored to various institutional and disciplinary settings. Consequently, the value of this research extends beyond providing an evaluation tool for Chinese universities; it contributes to the international higher education community a design rationale and an implementation pathway for constructing contextually adaptive evaluation systems in a scientific and systematic manner, thereby establishing a solid methodological foundation for institutions to optimize their major structures and enhance quality assurance systems.
Recommendations
To advance undergraduate major evaluation and improve program quality, this study proposes the following evidence-based recommendations. First, institutions should establish a classification-based evaluation mechanism, using the combined weighting model to build a closed-loop quality governance system of evaluation, diagnosis, and improvement. For example, longitudinal application of the model at one university identified persistently low-performing majors—such as Textile Engineering and Urban and Rural Planning—which showed clear weaknesses in core areas like Faculty Team and Student Development. These data-informed insights supported the decision to reduce enrollment in those majors, optimizing institutional resource allocation.
Second, the model should be applied dynamically through regular data and weight updates, enabling longitudinal tracking and timely major adjustments. Third, intelligent evaluation platforms that integrate a unified framework with disciplinary adaptability should be developed to embed combined weighting results into dynamic major management. Moreover, strategic resource allocation should prioritize high-weight indicators identified in our analysis. In the studied institution, for instance, resources should focus on supporting majors pursuing authoritative accreditations such as Engineering Education Accreditation, and on enhancing core faculty competencies under Teaching and Research Capabilities.
By implementing these steps, higher education institutions can concentrate resources on building distinctive, competitive major clusters; rationally optimize major structures in response to developmental and societal needs; and obtain evidence-based support for regional differentiation within initiatives such as China’s “Double First-Class” program. Collectively, these advancements facilitate the comprehensive enhancement of the overall quality of higher education, contributing to the global pursuit of excellence in teaching and learning.
Limitations and Ethical Considerations
While this study provides a robust methodological framework for major evaluation, it is essential to acknowledge its limitations and the ethical considerations inherent in implementing such a data-intensive system. Firstly, the empirical validation was conducted within a single science-and-technology-focused university, which, while validating the method's effectiveness in this context, necessitates caution when generalizing the specific indicator weights to other institutional types. More critically, the deployment of any quantitative evaluation system carries the risk of unintended consequences. A primary concern is the potential for “gaming” the indicators, where departments might prioritize improving metrics over genuine educational quality. Furthermore, the system exhibits an inherent bias towards easily quantifiable outputs, potentially marginalizing essential but harder-to-measure educational values such as critical thinking, ethical reasoning, and creativity.
To mitigate these risks, we propose several strategies. The evaluation results should be positioned primarily as a diagnostic tool for continuous quality improvement rather than a final judgment. For programs with anomalous results, a qualitative-quantitative verification mechanism, incorporating methods like peer review and site visits, should be activated. Finally, the indicator system itself must undergo dynamic evolution to better capture a more holistic representation of educational quality and adapt to shifting educational paradigms. Acknowledging these limitations and implementing these safeguards are crucial for the responsible and effective application of the proposed evaluation model.
Footnotes
Acknowledgements
We are very grateful to all those who have helped with the article.
Ethical Considerations
This article does not contain any studies with human participants performed by any of the authors.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was sponsored by the Humanities and Social Science Research Project of the Ministry of Education of China (Grant No. 20YJA880072) and the Special Project on Science and Technology Strategy Research of Shanxi (202404030401047).
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data that support the findings of this study are available on request from the corresponding author.
