Abstract
This paper reviews a set of considerations for evaluating academic units in complex universities and higher education systems. Methods of evaluation should match evaluative goals. Assessment should be sensitive to context and recognize the realities of contemporary higher education. Because a one-size-fits-all evaluative regime may be inappropriate, a capabilities-based approach is advanced. The capabilities-based approach focuses on meeting an overall goal of designing higher education systems that meet social demands.
Introduction
Evaluating academic units within universities is difficult. To evaluate the performance of an academic unit one must identify goals (what should it be doing?), devise some way to measure goal attainment, and, in many cases, establish a meaningful comparison group. Even more critically, the evaluator needs to have a specific aim. There is no point in assessing for assessment’s sake; instead, an assessment should further the ultimate goals of a university or a higher education system.
The purpose of this conceptual article is to analyze how the evaluation of academic units can further the goals of individual universities, support strong higher education systems, and help higher education meet social demands. An underlying assumption of this analysis is that a singular conception of academic excellence predicated on research output and registered by prominent ranking systems is insufficient. National higher education systems can benefit from building world-class universities distinguished primarily by research eminence, but world-class universities are most effective when part of robust and capable higher education systems. As Altbach (2004) explains:
Putting too much stress on attaining world-class status may harm an individual university or an academic system. It may divert energy and resources from more important—and perhaps more realistic—goals. It may focus too much on building a research-oriented and elite university at the expense of expanding access or serving national needs. It may set up unrealistic expectations that harm faculty morale and performance.
With Altbach’s caution in mind, this essay advances the idea of a capabilities-based approach for evaluating academic units. The central tenet of a capabilities-based approach is that academic units can be assessed based on the extent to which they advance the capabilities of their students, the university and higher education system to which they belong, and the broader society. A capabilities-based approach is interested in assuring that academic units contribute to the realization of individual and collective social goals.
The analysis is conducted broadly so that it may be applicable to a wide set of units within different institutional types and various national contexts. For the purposes of this essay, the term academic unit refers to the sub-unit within institutions to which individual academics are primarily affiliated. The name and scale of the unit will vary by country and institution. Examples include Colleges, Faculties, Departments, Centers, and Programs. Academic unit is an intentionally broad term used to be inclusive of different institution types and national variations. The underlying concept is that sub-units within higher education institutions do some defined work, such as deliver academic programs, conduct research in a disciplinary or topical domain, or engage in public service activity.
By adopting a broad method of analysis, a limitation of this essay is that it does not provide precise technical advice, nor does it systematically review existing quality assurance systems. The present analysis is not intended to be a technical document crafted to provide specific advice on how to evaluate academic units. Rather, the aim of this article is to develop the concept of capabilities-based evaluation and to identify some practical considerations for devising such a system.
Distinguishing a Capabilities-Based Approach from a Status-Based Orientation
Higher education institutions notoriously have multiple, sometimes competing, and often ambiguous goals. Given the uncertainty about what higher education organizations ought to do, they often attempt to satisfy social expectations rather than optimize technical performance (Birnbaum, 1989). One common goal of assessment is to achieve excellence because all stakeholders can agree that higher education organizations should be meritorious to earn social support. Excellence itself is an elusive concept, but nonetheless has motivated efforts to reform and improve higher education (Readings, 1996). Excellence is open ended. No individual university or higher education system can achieve the upper limit of excellence because improvements can always be made. Excellence is an absolute concept without a singular definition. All individual universities or higher education systems can hypothetically be excellent. Just so, what it means to be excellent can legitimately differ from one assessor to the next. For these reasons, universities and higher education systems are commonly assessed by the concept of status, or prestige.
Status, although not necessarily zero-sum, is always relative because it is determined by the position of one university or system relative to another. Assessment and status go hand-in-glove. Status, typically determined by a complex of reputation, exclusivity, and resource intensity, has long been the fuel that propels higher education (Marginson, 2006). A review of the organizational dynamics of mature higher education systems found that family aspirations, the drive for recognition among those within higher education, and state policy promoting excellence all drove a seemingly endless quest for social status in the sector (Cantwell & Marginson, 2018). Simply put, families want excellent students who attend elite programs within top-notch universities, professors want to work in the best departments, and academic leaders and policymakers want to develop world-class universities. More recent developments have supercharged academic status-seeking. Since the publication of the first Academic Ranking of World Universities in 2003, global rankings have intensified status competition across universities and national systems.
At the same time as status competition has increased, participation has soared around the world and higher education systems have expanded. One result of high participation and system growth is that systems tend to stratify, with the best-resourced research-focused universities distinguished from their typically newer counterparts that absorb rising demand (Cantwell & Marginson, 2018). The same happens within individual universities. Science and technology units, whose faculty can secure government and corporate research funding and whose outputs are readily measured by bibliometric indices, tend to enjoy growing status relative to non-science disciplines (Cantwell & Taylor, 2013). An unintended consequence can be the propagation of a belief that some academic units—those in science and engineering disciplines, for example—are of higher status at baseline than others and therefore more excellent (Ordorika & Lloyd, 2015).
If the influence of rankings and growing participation together create social and operational pressure to conduct evaluations, they also present challenges to evaluating effectively. The vast majority of universities and academic units within them are not and are unlikely to become leading research producers. An evaluation designed only to benchmark academic units against the most prestigious units in many cases can be counterproductive. Not only do such endeavors often compare unlike units but they also tend to consider a narrow set of criteria, typically research outputs, that fail to capture much of what happens within higher education. The lure of world-class status is strong, but evaluation in pursuit of that status is in most cases an unproductive activity (Cremonini et al., 2014).
Similarly, high levels of participation tend to increase system stratification (status differentials between institutions) as well as organizational differentiation (status differentials within institutions). The relationship between participation and stratification is solidified by the social reality that not all places within the system have equal value. In most systems, socially advantaged students dominate access to the most desirable places within the system, thereby furthering the gap between the most desirable seats and the rest (Cantwell, Pinheiro, & Kwiek, 2018). Growing participation, thus, creates a set of challenges for evaluation. Ensuring quality is of paramount concern among good faith evaluators. System expansion can lead to the erosion of quality provision, and routine evaluation is one guard against poor quality providers (Harvey, 2013). Further, internationalization of higher education creates a need for recognizable forms of quality assurance (Knight, 2005). Meeting the demands of maintaining quality in a growing and internationalizing system likely creates pressure for standardized evaluation.
Large and expanding systems must also necessarily accommodate horizontal differentiation (diversity of provision) because system scale and complexity demand a heterogeneous set of higher education institutions that have varied goals to meet a comprehensive set of social demands (Pinheiro, Charles & Jones, 2016). Under such conditions, a one-size-fits-all evaluation regime based on assessing status might be unsuitable. Just as with the world-class standard, assessing all academic units within a system against the standard established by the most elite units will not provide meaningful information to govern and improve a more extensive and diverse enterprise. Of course, quality assurance regimes consider more than status, and include measures of quality as defined by those within academic systems as well as external stakeholders (Harvey, 2006). The point here is not to claim all existing quality assurance regimes are exclusively status-based but to offer capabilities-based thinking as a rejoinder to the growing tide of status-based thinking that is well-documented in higher education policymaking and academic leadership (Hazelkorn, 2011).
Rather than starting with a singular definition of excellence, a capabilities-based evaluative process may begin by asking several questions. The overall principle of a capabilities-based approach is that evaluating academic performance can—and perhaps should—consider how academic units enhance abilities at the individual, organizational, and system levels to achieve social goals. By taking into consideration the goal of an evaluation, and the local, national, and global context of the unit to be evaluated, one may be able to devise a systematic evaluation regime that is rigorous, appropriate, and useful for informing policy and administrative action. Arguably the most useful evaluations are those that provide tools to enhance the capabilities—broadly defined—of students, academic staff, academic departments, universities, and ultimately national systems.
Considerations
This section identifies and briefly reviews considerations for devising a capabilities-based evaluation system. The topics identified are not intended to be exhaustive, but rather a starting point for analysis. Considerations for a capabilities-based approach include but are not limited to aligning goals and evaluative approaches, social contextual analysis, and unit appraisal within higher education institutions.
Aligning Goals and Approaches
Student learning. Evaluation should have a purpose. What is it that one hopes to evaluate? A direct answer might be student learning. Even so, student learning is a complicated (but not impossible) outcome to measure. For example, evaluating specific skill and concept acquisition is different from evaluating generic skill and capability gains. Moreover, evaluative approaches will vary depending on whether a single course or instructor, an entire program, or even a specific pedagogical approach is the unit of analysis. Because learning occurs in complex organizational and social systems, the unit of analysis question also matters in determining how learning is appraised (Austin, 2011). Evaluation typically involves assessing against a benchmark. This comparative aspect presents further challenges. Students are not assigned randomly to universities; at least in theory, they are roughly sorted by academic performance. All things equal, higher performing students, who also tend to come from better-off families, generally enroll in more selective universities whereas lower performing students are more likely to enroll at less selective universities (Marginson, 2018). The same goes for students in different fields. Students in physics programs are, on average, different from students in arts programs, and therefore their learning may be difficult to compare directly (Webber, 2014). This reality of systematic student sorting suggests that evaluators should be cognizant of identifying an appropriate reference group.
Measurement of learning also presents a set of complications. Course grades are readily available but may not be the best metric for learning. One problem is that grades may not measure growth. A student who has a preexisting mastery of material might earn a high grade but have learned little. Another problem is that grades and learning may not be tightly linked, especially in national systems that have experienced grade inflation, or a systematic shift upward in the marks students earn in courses (Bachan, 2017; Kostal, Kuncel & Sackett, 2016). A solution might be to measure generic skills directly through standardized assessment instruments.
Even when learning and cognitive gains can be measured, such measures may not fully capture all the gains academic units hope to provide their students. When learning is defined strictly in cognitive terms, less attention is given to the psycho-social dimensions of learning. The result may be evaluation that is narrower than evaluative goals. In such cases measuring student educational engagement might be appropriate for broadly assessing the quality of learning environments (Coates, 2005). Academic units may also wish to measure the contributions their students make to society in terms of civic participation, individual and public health outcomes, and earnings and tax contributions relative to public service usage rates (McMahon, 2009). So, while it is possible to measure learning and other related outcomes, it can be difficult to measure these concepts well, and evaluators will want to consider exactly what it is they want to measure and think carefully about how to measure it.
Research performance. Research performance is another outcome commonly evaluated. As discussed, rankings are generally not useful tools for evaluation. Ministries, national funding councils, and university administrators sometimes tie individual researcher and departmental performance evaluations to paper outputs in a pre-defined list of journals, or in journals that appear in specific indices (Geuna & Martin, 2003). When devising such lists, policymakers should carefully consider disciplinary differences and seek to balance excellence with inclusiveness. Similarly, raw bibliometric counts can be blunt tools and should at least be sensitive to field-normalizing metrics (e.g., van Leeuwen & Madina, 2012). Cross-disciplinary bibliometric comparisons can be problematic, as can comparing universities with different missions or those occupying asymmetrical market positions. A potential resolution to these challenges is to use research data services that include appropriate comparison groups. Companies like Academic Analytics, for example, claim to offer tools for comparing research performance across a variety of metrics by comparing disciplinary units within university-peer groups. These services are by no means a panacea. Research performance measurement tools can include incomplete data and are subject to misinterpretation and misuse.
Revenue Generation. The rise of quasi-market resource allocation (Slaughter & Cantwell, 2012) and new public management reforms means that revenue generation is increasingly treated as an evaluative outcome in its own right, raising the question of whether revenue measures reflect a unit’s academic contributions or simply its market position.
Contextual Analysis
We have reviewed some questions to consider when approaching common outcomes subject to evaluation. The list presented above is by no means exhaustive. That is to say, elaborating some considerations related to evaluating a limited set of outcomes is not the same as establishing guidelines for determining what outcomes are appropriate to evaluate. Rather than enumerate the expanse of potential outcomes to evaluate, it may be useful to outline a framework for determining which outcomes are most suitable for evaluation within a specific context. Higher education is a social enterprise, embedded in a web of social relationships that spans geographic scales (Marginson & Rhoades, 2002). Some universities are global brands and international hubs that attract students and researchers from abroad, whereas others are tied almost exclusively to stakeholders who operate locally. Building on Marginson’s (2006) well-known framework that conceptualizes higher education as nested at local, national, and global levels, we can outline a set of guidelines that might help to decide what outcomes are appropriate to evaluate. The local, national, global framework works in two ways. First, the framework may be used to orient evaluation based on whether a university or academic unit’s mission is primarily centered locally, nationally, or globally. Second, it allows one to consider how evaluative questions shift when attending to the local, national, or global mission of a given university or academic unit.
Global. Research is the currency of global status in higher education. World-class universities are synonymous with globally recognized research. Thus, it is reasonable to expect that research will be a primary dimension for evaluating leading research universities or when determining the global standing of a particular academic unit. Here is where the use of rankings and bibliometric aggregates may be appropriate, especially when considering some aspects of the overall performance of national systems. After all, the now widely influential global rankings were initially developed to benchmark research performance against international standards.
Beyond research outputs, global evaluation might include assessment of internationalization. For universities that seek to establish a global educational presence, demand by international students can be one measure of academic program performance. Attracting international students, however, is not by itself sufficient to indicate performance. Retention and positive experiences among students from abroad are indicators of the international relevance and climate of academic programs (Lee, 2010).
Some universities will attract few international students and not every program will host students from abroad, even within universities that draw internationally. For these academic units, globally-minded evaluation might focus on the extent to which the curriculum is internationalized to prepare students to make sense of the increasingly interconnected world. Here, the concept of “comprehensive internationalization” (Hudzik, 2014) can be helpful. Evaluation adopting a comprehensive approach to internationalization will assume that all units have the potential to enhance global and cross-cultural competencies among students. Comprehensive global evaluation efforts will take a holistic approach, moving beyond traditional markers of global status such as research performance and international mobility, and examine curricular and programmatic relevance and the way in which students and their communities interact with the wider world (Soria & Troisi, 2014). For example, locally-oriented institutions situated in regions whose economy relies on tourism will want to prepare students to interact effectively with people from outside the community.
National. Evaluation of academic programs should reflect position within the national higher education landscape. Student selectivity, that is, attracting students with the best academic credentials, is generally a marker of program status within the national sphere (Marginson, 2006). Given this, it is tempting to assess academic programs by the “quality” of students they attract. However, in most cases evaluation based on incoming credentials will not be a particularly fruitful endeavor, especially in systems where participation is high and expanding (Marginson, 2018). Simply put, when a significant share of the age-cohort participates in higher education, it is unrealistic for most universities and programs to enroll the top performing students. Given this reality, evaluation programs that assess the extent to which academic programs meet social demands will likely be more relevant. Devising evaluation standards that assess the extent to which programs meet the needs of students and their families, along with academic, government, and other relevant stakeholders can provide information useful to guide reform and improvement efforts. Key metrics might include the extent to which a program contributes to student success, and helps students and communities meet their economic and social goals.
No academic program stands alone. Instead, all programs are ultimately a part of national systems and have a role to play in supporting system coherence and stability (Cantwell et al., 2018). Program evaluation can support system coherence and stability in two ways. First, evaluators should locate the place of a program within a national system. Considerations include national professional or academic standards that apply to a particular program, professional licensure requirements, and requirements established by accrediting agencies or program consortia. Second, evaluations should calibrate assessment goals to reflect national position. For example, an academic unit that primarily enrolls undergraduate students might be assessed based on how well it prepares students for employment and graduate study, whereas a unit that hosts an extensive graduate program might be assessed based on the contributions its graduates make to the academic field. Framing program evaluation within the context of national systems can yield assessments that benefit individual departments while also strengthening system coherence.
Local. For some universities, it will be most appropriate to evaluate through a locally-oriented framework. High-powered research and scientific papers published in leading journals can have almost no direct impact locally. Institutions whose students and faculty are drawn primarily from the local community and whose research profile is modest may be best suited to embrace rather than shun their regional commitments (Chatterton & Goddard, 2000). When assessing a program with a local orientation, one might consider how the program shapes access, educational capacity, and opportunity in the region. One way to approach a locally-guided orientation is to assess the extent to which the curriculum is relevant to students and communities (Ladson-Billings, 1995). Do programs, for example, support the cultural, social, and economic development of students and communities? This is not to suggest that abstract or global knowledge is irrelevant, but rather that it is reasonable to consider the extent to which such information is integrated into a curriculum that is responsive to the lives of the students it serves. Even nationally prominent and globally active universities might take local considerations into account when determining how to evaluate academic programs. Examples of locally-minded questions include the extent to which links with local stakeholders are established, the economic contributions made to communities, and efforts to engage in outreach and service locally.
Academic Units within the University Context
Just as institutions are situated differently in global, national, and local contexts, academic units are situated differently within institutions. It means something different to be an arts program in a specialized technical institution than in a comprehensive university. Evaluation efforts should attend to department or program integration within an institution.
Mission and scope sensitivity. One consideration in conducting an evaluation is the extent to which a department or program contributes to the mission and scope of a university. Some programs will be central to the institutional mission. Other programs contribute by extending the scope of programmatic offerings. Evaluation should be sensitive to the different ways units contribute to the university mission, and different evaluative standards may be necessary for different units within the same university (Birnbaum, 1989). For example, it would be inappropriate to evaluate graduate and undergraduate programs the same way. Undergraduate program evaluation might wish to focus on student learning and engagement whereas graduate programs might attend more closely to research and global engagement.
Supporting strengths. Another consideration for evaluation is how a department or program supports institutional strengths. In some cases, support will be apparent. Basic mathematics departments, for example, are likely foundations for universities that specialize in science and engineering. Evaluation of the mathematics department, therefore, might consider the extent to which the math curriculum is preparing students for subsequent specialized courses in their discipline of choice. Less obvious is how the arts or social sciences will contribute to the strengths of the same university. However, evaluators may wish to be cautious in determining that such units are irrelevant to institutional mission. Communication, writing, critical thinking, and problem-solving skills developed in the arts and social sciences apply to all aspects of academic work (Delbanco, 2014).
Supporting students. Finally, supporting students is central for all units within an institution. Students are among the most critical institutional constituents, and their success within individual departments and programs is foundational to institutional success. Evaluative efforts should seek to identify the ways units support students, as well as the ways a unit may not be effectively supporting students. By placing student support central to evaluation, the assessment will be sensitive to how units interact with students and how those interactions fit within the overall structure of the university (Kuh et al., 2005).
Towards a Capabilities-Based Approach to Evaluation
Thus far we have reviewed a set of considerations that may be useful for those evaluating academic units within a university. A common theme among these considerations is that evaluation is most useful when outcome goals are specified and when the assessment is appropriately reflective of context. To put it another way, the argument underlying the review of considerations for capabilities-based evaluation requires both specificity and flexibility and should not be subordinated to a single narrative. Perhaps above all, evaluation should strengthen the capability of students, academic units, and universities to contribute in complex and changing societies. In what remains, the broad outlines of a capabilities-based approach to evaluation are developed further.
To some extent, a capabilities-based approach to assessment is ambivalent about standard “excellence” based models that often rely on status metrics for assessment. The question is inevitable: who objects to excellence (Readings, 1996)? The point here is not to strive for mediocrity, but rather to acknowledge that excellence is often defined narrowly, in ways that make top performance unrealistic for most departments and pit universities (and the departments therein) against one another in intractable competition for status and resources.
A capabilities-based approach to evaluation privileges outcomes that enhance individual, organizational, and community capacity. Andrew Delbanco (2014), an American humanist, has argued that a higher education ought to instill in students “a skeptical discontent with the present, informed by a sense of the past;” “the ability to make connections among seemingly disparate phenomena;” “appreciation of the natural world, enhanced by knowledge of science and the arts;” “a willingness to imagine experience from perspectives other than one’s own;” and “a sense of ethical responsibility.”
The point is not that Delbanco’s list is singularly authoritative or exhaustive. What higher education should impart to students will vary from place to place. Delbanco’s list is useful because it identifies the kinds of outcomes that prepare students to be capable and responsible family and community members who also possess attributes for individual success. It is not the list itself but the logic behind the list that is generative for devising capabilities-based evaluative systems.
Extending this idea, a capabilities-based approach to evaluating academic programs may entail two broad goals. The first is to enhance students’ capacity to chart a successful and satisfying life for themselves that also contributes to family and community wellbeing. A capabilities-based approach to evaluation is interested in the ways academic units bolster individual ability to contribute to the common good. The second goal is to strengthen institutions to build strong universities and higher education systems that meet social demands. Higher education systems are both constitutive and reflective of the societies in which they exist. Evaluation can support academic units—the building blocks of higher education systems—to support social goals and reflect social aspirations.
It may be difficult to know if evaluative regimes are consistent with a capabilities-based approach. To provide a frame of reference, three guideposts for a capabilities-based approach are briefly summarized, each of which is found in several of the considerations reviewed above: (a) strengthening systems to meet social demands; (b) preparing students to live full and capable lives; (c) supporting community aspirations.
Strengthening systems to meet social demands. Large and complex systems of higher education are subject to complex multi-level governance and accountability mechanisms in which multiple stakeholders make various demands of the system (Cantwell et al., 2018). A small share of academic units in big systems will be world-leading research powerhouses. Most will contribute to meeting national and local concerns. In such systems demand is underwritten by the social aspirations of students and their families. A capabilities-based approach will focus on how academic units satisfy the demands of system stakeholders and provide an opportunity for students and families to express their aspirations through educational attainment.
Preparing students to live full and capable lives. It is not possible to deny the role higher education plays in workforce development. At the same time, when systems massify not all graduates will join the ranks of elite professional employment in business, the civil service, or traditional professions (Marginson, 2018). Rather than attending exclusively to employment outcomes like graduate salaries, a capabilities-based approach to evaluation calls for assessing academic units based on their capacity to prepare students for full, satisfying lives. Contemporary society is complicated, and the capacity to successfully navigate the social world is itself an important outcome of education.
Supporting community aspirations. All communities, whether defined at the family, regional, or national level, aspire for stability, prosperity, and a say in their own future. Higher education is one common institutional form to which communities turn to realize these aspirations. In the broadest sense, academic evaluation ought to help university leaders design institutions that support community aspirations. Higher education programs that are held to account for the common good are more likely, all else equal, to deliver tangible benefits that extend beyond individual participants.
Conclusion
The purpose of this article is to introduce the idea of capabilities-based evaluation of academic units. A capabilities-based approach stresses the contributions academic units make to enhancing a social actor’s ability to support individual and social aspirations. The central idea is that evaluation programs need not measure the same outcomes for all academic units, but can instead evaluate the extent to which a unit contributes to the broad goals of higher education. In this way, academic units should be evaluated both within their local, national, and global contexts and in congruence with their specific place within higher education institutions. The preceding analysis examined a set of considerations one might weigh when crafting an evaluation using a capabilities-based approach and attempted to provide some framing for determining if an evaluative process is consistent with a capabilities-based approach. Technical procedures are not described nor does the article attempt to review or comment upon all existing forms of evaluations and quality assurance.
As higher education systems grow and become more complex, policymakers and academic leaders will have to grapple with heterogeneity both within and between institutions. Heterogeneity presents challenges to evaluation because it is harder to determine if two units are sufficiently alike to be compared. Evaluations can provide scores—sometimes with precision and reliability—about performance on this metric or that, and the sort of evaluation that offers assessment of relative status will be useful in some instances. There are limits to such approaches, however, and one limitation is that they tend to assess all academic units against a single standard. A capabilities-based approach to evaluation prioritizes assessing the social contributions of academic units, adjusted for the specific contexts in which they operate. The overall purpose of such an approach is to build adaptable higher education systems that support the realization of social aspirations.
