How has healthcare research performance been assessed? A systematic review

Abstract

Objectives

Healthcare research performance is increasingly assessed through research indicators. We performed a systematic review to identify the indicators that have been used to measure healthcare research performance. We evaluated their feasibility, validity, reliability and acceptability; and finally assessed the utility of these indicators in terms of measuring performance in individuals, specialties, institutions and countries.

Design

A systematic review was performed by searching EMBASE, PsycINFO, Ovid MEDLINE and Cochrane Library databases between 1950 and September 2010.

Setting

Studies of healthcare research were appraised. Healthcare was defined as the prevention, treatment and management of illness and the preservation of mental and physical wellbeing through the services offered by the medical and allied health professions.

Participants

All original studies that evaluated research performance indicators in healthcare were included.

Main outcome measures

Healthcare research indicators, data sources, study characteristics, results and limitations for each study were studied.

Results

The most common research performance indicators identified in 50 studies were: number of publications (n = 38), number of citations (n = 27), Impact Factor (n = 15), research funding (n = 10), degree of co-authorship (n = 9), and h index (n = 5). There was limited investigation of feasibility, validity, reliability and acceptability, although the utility of these indicators was adequately described.

Conclusion

Currently, there is only limited evidence to assess the value of healthcare research performance indicators. Further studies are required to define the application of these indicators through a balanced approach for quality and innovation. The ultimate aim of utilizing healthcare research indicators is to create a culture of measuring research performance to support the translation of research into greater societal and economic impact.

Introduction

Academic healthcare is the synergy between studying disease mechanisms, identifying new treatments, improving patient care and training healthcare professionals.^1–3 Although the contribution of research to healthcare over the last century has been remarkable, academic healthcare often endures the inequality and lack of transition between basic and clinical research, fails to drive technology and innovation in clinical practice, underrates the role of education, and disregards social and global accountability.^1,3,4

To tackle these deficits, a system is required for academic healthcare researchers to measure research performance according to an accepted global benchmark, so that innovation and quality of research can be improved and new discoveries can be translated into medical advances.⁵ Currently, systems such as The Research Assessment Exercise (UK) and Institutional Assessment Framework (Australia) have attempted to appraise academic research in general.^6,7 The Research Excellence Framework is a new development to assess the quality of research in UK higher education institutions,⁸ although currently there are no validated systems to accurately measure performance in healthcare research. This has been difficult to implement due to the operational complexity of the discipline (Figure 1).⁵ To design a system that can successfully measure healthcare research performance it is imperative to determine which indicators can measure this more accurately.

Figure 1

Elements of academic healthcare performance⁵ (in colour online)

The objectives of this article are to: (1) identify existing indicators which specifically assess healthcare research performance; (2) assess each indicator to determine its feasibility, validity, reliability and acceptability; and (3) evaluate the utility of each indicator in terms of individual, specialty, institutional and global perspective.

Methods

This study was performed following guidelines from the preferred reporting items for systematic reviews and meta-analyses (PRISMA).⁹

Data sources and searches

Studies to be included in the review were identified by searching the following databases: (1) EMBASE (1980–September 2010), (2) PsycINFO (1967–September 2010), (3) Ovid MEDLINE (1950–September 2010), and (4) Cochrane Library.

All databases were searched using the following free text search: ‘academic OR university OR education OR scientific OR institution’ AND ‘performance OR competence OR quality OR productivity’ AND ‘assessment OR evaluation OR indicator OR peer review’ AND ‘index OR bibliometric OR impact factor OR citation OR benchmark’ AND ‘health care OR medicine OR surgery OR physician OR biomedical OR hospital OR scientist’. The search was expanded by using all possible suffix variations of the keywords. Additional studies were identified by searching the bibliographies of the studies that had been identified through the electronic search. A keyword search was chosen rather than Medical Subject Headings (MeSH), because there was a lack of established MeSH terms in this area of research.
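The Boolean structure of this search (synonyms joined by OR within each concept group, and the groups joined by AND) can be sketched as follows. This is an illustrative sketch only: the helper name `build_query` is hypothetical, and the field tags and truncation operators of the individual database interfaces are not reproduced here.

```python
# Concept groups copied from the free-text search described above.
CONCEPT_GROUPS = [
    ["academic", "university", "education", "scientific", "institution"],
    ["performance", "competence", "quality", "productivity"],
    ["assessment", "evaluation", "indicator", "peer review"],
    ["index", "bibliometric", "impact factor", "citation", "benchmark"],
    ["health care", "medicine", "surgery", "physician", "biomedical",
     "hospital", "scientist"],
]


def build_query(groups):
    """Join terms with OR inside a group and groups with AND (hypothetical helper)."""
    clauses = []
    for terms in groups:
        # Quote multi-word phrases so they are searched as a unit.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


query = build_query(CONCEPT_GROUPS)
print(query)
```

Suffix variations of each keyword would be added using whatever truncation syntax the chosen database supports.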

Study selection

We included all original studies that evaluated research performance indicators which measured performance across individuals, specialties, institutions and countries in healthcare. For this study, healthcare was defined as the prevention, treatment, and management of illness and the preservation of mental and physical wellbeing through the services offered by the medical and allied health professions.¹⁰ There were no language restrictions. We excluded all studies that did not have data relevant to healthcare.

Two authors (VP and HA) independently reviewed the titles and abstracts of the retrieved articles, and selected publications to be included in this review. The full texts of these publications were reviewed by the two authors, who selected the relevant articles for inclusion in the review. When there was disagreement, a third author (SA) was consulted and a decision was made by agreement of all authors.

Data extraction and quality assessment

Two authors (VP and HA) independently extracted data from the full text, which included source of article, study design, study period, type of performance indicator, data source, study population and their sample size, type of statistical analysis, outcomes and methodological limitations. Disagreements in data extraction were resolved by discussion and consensus between all authors. Study quality was assessed using the Oxford Centre for Evidence-based Medicine Levels of Evidence classification.¹¹

Data synthesis and analysis

The methodology of the included studies was heterogeneous; it was therefore not possible to pool data and statistically analyse the results. The indicators that were identified were analysed in terms of their: (1) utility (the usefulness of indicators at individual, specialty, institutional and global levels); (2) feasibility (measure of whether the indicator is capable of being used); (3) validity (measure of the relevance of the indicator: content, convergent and discriminant validity); (4) reliability (measure of the reproducibility or consistency of an indicator); and (5) acceptability (the extent to which the indicator is accepted by researchers).^12,13

Results

Study selection

We retrieved 6705 potentially relevant articles, of which 1185 duplicate articles were identified and excluded. Of the remaining articles, 5385 were excluded after title and abstract review. Review of the full text and bibliography of the remaining 135 articles identified 50 studies for inclusion in the review (Figure 2) (Table 1 – see http://jrsm.rsmjournals.com./cgi/content/full/104/6/251/DC1). The agreement for inclusion of the studies between the authors was satisfactory (κ = 0.86, P < 0.001).
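The screening counts and the agreement statistic reported above can be checked with a short sketch. The flow counts are taken from the text; the 2x2 agreement table passed to `cohen_kappa` is hypothetical (the review reports only κ, not the underlying table), and both function names are illustrative.

```python
def remaining_after_screening(retrieved, duplicates, excluded_on_abstract):
    """Articles left for full-text review after the first two exclusion steps."""
    return retrieved - duplicates - excluded_on_abstract


def cohen_kappa(both_include, only_a, only_b, both_exclude):
    """Cohen's kappa for two reviewers making include/exclude decisions."""
    n = both_include + only_a + only_b + both_exclude
    observed = (both_include + both_exclude) / n
    # Agreement expected by chance, from each reviewer's marginal totals.
    a_inc = both_include + only_a
    b_inc = both_include + only_b
    expected = (a_inc * b_inc + (n - a_inc) * (n - b_inc)) / n**2
    return (observed - expected) / (1 - expected)


# Counts reported in the text.
full_text = remaining_after_screening(6705, 1185, 5385)
print(full_text)  # 135

# Hypothetical agreement table (NOT the review's data).
print(round(cohen_kappa(40, 5, 5, 50), 3))
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, which is why it is preferred to simple percentage agreement when reporting reviewer concordance.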

Figure 2

Selection of articles for the systematic review

Study characteristics

All evidence was level 4 according to the Oxford Centre for Evidence-based Medicine.¹¹

Most studies were performed in North America (n = 20)^14–33 and Western Europe (n = 19).^34–52 Fewer studies were performed in Eastern Europe (n = 5),^53–57 South America (n = 3),^58–60 Asia (n = 2)^61,62 and Australia (n = 1).⁶³ The studies were published from 1973 until 2009, but the majority of the studies were published after the millennium (n = 34).^{17,19,24,27,28,31–36,38–40,42–47,49–51,53–63} The design of each study was retrospective and observational.

Forty-two studies used Thomson Scientific's Institute for Scientific Information database (ISI).^{14–18,20–30,32–34,36,37,39–46,49,51,53–56,58–63} Of these, 10 studies used one additional database: Scopus (n = 2),^40,63 MEDLINE (n = 5),^{24,28,44,52,62} PsycINFO (n = 1),²⁶ National Institutes of Health (NIH) (n = 1)²⁷ and institutional (n = 1);⁵⁸ three studies used two additional databases: MEDLINE and PsycINFO (n = 1),⁵⁴ EMBASE and MEDLINE (n = 1)⁵¹ and PsycINFO and NIH (n = 1).¹⁷ Of the studies that did not use ISI, four studies used one database: MEDLINE (n = 2)^35,47 and institutional (n = 2);^49,57 four studies used two databases: institutional and MEDLINE (n = 1),³⁸ Scopus and Spanish Office of Patents and Trademarks (n = 1),⁵⁰ NIH and MEDLINE (n = 1)¹⁹ and Scopus and Google (n = 1).³¹

Only seven studies assessed research performance over a lifetime,^{31,40,52,54,58–60} in comparison to 24 studies assessing research performance over a 1–5-year period.^{15,16,19–23,27–30,35,36,43,45,47–51,55,57,61,62}

The main methodological limitation was the use of a single bibliometric database as the only information source in 32 studies.^{14–16,18,20–23,25,27,29,30,32–37,39,41–43,45–47,49,51–53,55,56,59–61}

Type of indicators

The types of indicator that were used to measure research performance in each study included number of publications (n = 38),^{14,16–30,32–40,42–44,47,48,51–57,60} number of citations (n = 27),^{14–17,20–23,26–28,30,32–34,36,37,39,41–43,45,51,52,55,57,63} Impact Factor (n = 15),^{20,24,28,35,38,42,44,46,48,49,52,54,56,61,62} research funding (n = 10),^{17–19,27,29,30,32,33,35,56} degree of co-authorship (n = 9),^{20,31,37,38,41,49,52,56,57} population size (n = 6),^{24,33,40,44,49,63} gross domestic product (n = 5),^{24,33,40,44,49} h index (n = 5),^{27,31,58–60} peer review (n = 6),^{32,34,35,43,51,52} g index (n = 1),³¹ age-weighted citation ratio (AWCR) (n = 1),³¹ number of conference presentations (n = 1),²⁸ number of patents (n = 1),⁵⁰ number of doctoral students (n = 1),¹⁷ number of editorial responsibilities (n = 1)¹⁷ and gender (n = 2).^28,52 Twelve studies evaluated one indicator only,^{15,25,41,45–47,50,53,58,59,61,62} whereas 16 studies evaluated two indicators,^{14,18,19,23,29,36,39,48,54,55,60,63} seven studies evaluated three indicators,^{30,34,37,38,40,43,57} nine studies evaluated four indicators^{20,27,31,32,34,35,42,44,56} and four studies evaluated five indicators.^17,28,33,49

Number of publications

The simplest measure of research productivity in healthcare is the number of published articles a researcher or group of researchers produce within a time span.^{14,16–30,32–44,47–49,51–57,60,63} This indicator can be presented by document type so that letters, editorials, reviews and conference papers can be excluded.⁴⁷ It is relatively easy to calculate using bibliometric databases such as ISI, MEDLINE and Scopus, but these databases will ignore non-journal publications. It can be difficult to retrieve all the publications for certain researchers because of the commonality of names.¹⁸ The number of publications does not take into account the size of the research group, the type of research or the quality of the publication. To address this problem, publications per author, population size or publications in top ranked journals can be considered.^47,49 Although the number of publications is commonly used to measure research performance in individuals, specialties, institutions and countries, often as a benchmark, there are no studies formally validating this indicator in healthcare. However, a few studies have shown significant correlation between the number of publications and other measures of research performance, such as citations, peer review and research funding.^{19,30,32,34,35}

Number of citations

The impact of healthcare research can be measured by counting the number of citations received by a researcher or group of researchers from published articles within a time span.^{14–17,20–23,26–28,30,32–34,36,37,39,41–43,45,51,52,55,57,63} Bibliometric databases such as ISI, Scopus and Google Scholar are required to extract citation counts, which are subject to error because the databases are affected by commonality of names, typographical errors, variation of literature sources and geographical bias.^45,63 Citation analysis assumes that there is a positive association between the citing and referenced article, and so does not account for articles that are cited in order to criticize them. Citation counts are typically higher in older articles, can be falsely elevated by self-citations, and can vary between document type and specialty.^36,45 To make comparisons across specialties, a relative citation factor can be used to normalize citation counts.^36,45 As well as specialties, the number of citations has been used to measure research performance in individuals, institutions and countries, but there are no studies formally validating this indicator in healthcare. One study, with a small sample size, has demonstrated a low correlation between number of publications and citations.²² However, the majority of studies have shown significant correlation between the number of citations and other measures of research performance, such as publications, co-authorship, peer review and research funding.

Impact Factor (IF)

The Journal Impact Factor is calculated by dividing the number of current-year citations to the source items published in a journal during the previous two years by the number of source items published in that journal over the same two years.⁵⁵ It is an evaluation tool provided by ISI Thomson Reuters Journal Citation Reports® which is used to measure the scientific impact of journals.⁵⁶ Evaluating research performance using IF can have a marked effect on performance rankings.^{20,24,35,38,42,46,48,54,56,61,62} However, the IF is influenced by publication language, document type, citation patterns, open access journals, fast track publications and co-authorship, and it disregards publications from zero impact journals.⁶¹ More importantly, there is large IF variation between healthcare specialties. For this reason, IF may not reflect quality of research performance, but instead the different publication and citation patterns within specialties.⁶¹ Normalizing the IF can provide a more realistic assessment of research quality, which has been demonstrated at an institutional level.⁶²
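As a worked illustration, the two-year Impact Factor is the number of citations received in the current year to items a journal published in the previous two years, divided by the number of source items it published in those two years. The sketch below uses a hypothetical journal; the function and argument names are illustrative, not part of any citation database's interface.

```python
def impact_factor(citations_in_year, items_prev_two_years):
    """Two-year Impact Factor: citations received this year to items from the
    previous two years, divided by the number of those items."""
    if items_prev_two_years <= 0:
        raise ValueError("journal published no source items in the window")
    return citations_in_year / items_prev_two_years


# Hypothetical journal: 600 citations in 2010 to the 200 source items
# it published in 2008 and 2009.
print(impact_factor(600, 200))  # 3.0
```

Because the result is a journal-level average, it says nothing about how many citations any single article in that journal received.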

H index

The h index of a researcher is the number of ‘h’ publications with at least ‘h’ citations each during a time span.⁶⁴ Initially the h index was introduced in physics to address the limitations of publication number, which does not account for research quality, and citation number, which can be disproportionately influenced by a small number of highly cited papers.⁶⁴ The h index simultaneously evaluates the quality and sustainability of research productivity,⁶⁴ and can be calculated without difficulty by bibliometric databases such as ISI, Scopus and Google Scholar. In healthcare, the h index has been shown to be a useful statistic to evaluate a researcher's contribution within a given specialty and may even be helpful as a promotional tool.^{27,31,58–60} General drawbacks of bibliometrics, such as commonality of names and publication language are shared by the h index, which is also positively biased to senior researchers with older publications.^31,58,59

Indicators such as the g index and Age Weighted Citation Ratio (AWCR) have been proposed to address these limitations, but there is strong correlation between both of these measures and the h index.³¹ In addition, the h index has been shown to overcome the disadvantages of multiple authorship and self-citation.³¹ There is consensus that the h index cannot be used to measure research performance between different specialties because of diverse publication and citation practices.^{27,31,58–60}
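The two indices can be computed directly from a list of per-article citation counts, as in the minimal sketch below. It assumes the standard definitions (h: the largest h such that h articles have at least h citations each; g: the largest g such that the g most cited articles have at least g² citations in total, capped here at the number of articles). The citation counts are hypothetical, and AWCR is not shown.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the g most cited papers have at least g**2
    citations in total (capped at the number of papers)."""
    ranked = sorted(citations, reverse=True)
    g, running_total = 0, 0
    for rank, count in enumerate(ranked, start=1):
        running_total += count
        if running_total >= rank * rank:
            g = rank
    return g


# Hypothetical researcher with one highly cited article.
one_hit = [50, 3, 2, 1, 0, 0, 0, 0]
print(h_index(one_hit), g_index(one_hit))  # 2 7
```

The example shows why the g index rewards highly cited articles: a single article with 50 citations barely moves the h index but raises the g index substantially.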

Research funding

Research funding is a term covering any financial support for scientific research. This indicator poses an analytical problem, because it is an example of circular cause and effect. Based on bibliometrics, it is difficult to differentiate whether more research funding improves a researcher's performance or whether better-performing researchers receive more research funding. Regardless, most studies show significant correlation between research funding and research performance at an individual and institutional level.^{17,19,27,30,32,35,56} Developed countries with higher research spending also have higher research productivity.^18,33

Degree of co-authorship

Co-authorship measures the extent to which a researcher or research group collaborates with others to publish articles. Authors can collaborate at an international, institutional, departmental or individual level. In healthcare, several studies have demonstrated that research performance is improved with international collaboration.^42,49,56 Co-authorship at an organizational level has been shown to have a positive impact on performance and has been considered as a novel evaluation tool.^37,38 The role of co-authorship at an individual level is uncertain, although indicators such as the h index overcome this potential limitation.^20,31,57

GDP and population size

GDP is a measure of a country's overall economic output and population size is the number of individuals in a region. Adjusting research performance indicators for GDP and population size allows fairer comparison of global performance.^24,33,44 However, GDP and population size may also be markers of performance in their own right.^40,49,63
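A simple form of this adjustment divides a country's publication count by its population or by its GDP. The sketch below is one possible normalization, not necessarily the one used in the cited studies; the country figures and function names are hypothetical.

```python
def per_million_population(publications, population):
    """Publications per one million inhabitants."""
    return publications / (population / 1_000_000)


def per_billion_gdp(publications, gdp_usd):
    """Publications per one billion US dollars of GDP."""
    return publications / (gdp_usd / 1_000_000_000)


# Hypothetical countries (illustrative figures only).
countries = {
    "Country A": {"publications": 12_000, "population": 60_000_000,
                  "gdp_usd": 2_400_000_000_000},
    "Country B": {"publications": 3_000, "population": 5_000_000,
                  "gdp_usd": 300_000_000_000},
}

for name, c in countries.items():
    print(name,
          per_million_population(c["publications"], c["population"]),
          per_billion_gdp(c["publications"], c["gdp_usd"]))
```

In this example Country A publishes four times as many articles in absolute terms, yet Country B is more productive on both adjusted measures.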

Uncommon indicators

It is difficult to quantify the value of indicators such as peer review, number of conference presentations, number of patents, number of doctoral students, number of editorial responsibilities, and gender because of limited research in these areas.^{17,28,32,34,35,43,50–52}

Feasibility, validity, reliability and acceptability

Feasibility of using publications, grants, doctoral students and editorial responsibilities to measure research performance was assessed by a survey in one study.¹⁷ The respondents generally agreed with the use of these four indicators. Seven studies measured convergent validity by correlating number of publications with number of citations.^{16,21,23,26,30,37,52} One study demonstrated significant reliability of textbook citations to measure research performance (P < 0.001).²⁶ No other studies assessed research performance indicators in terms of feasibility, validity, reliability and acceptability.
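Convergent validity of this kind is usually examined with a correlation coefficient between two indicators measured on the same researchers or departments. The sketch below implements Pearson and Spearman coefficients using only the standard library. The included studies do not all report which coefficient they used, so the choice here is an assumption, and the publication and citation counts are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical publication and citation counts for five researchers.
publications = [5, 12, 20, 33, 41]
citations = [40, 90, 210, 300, 800]
print(round(pearson(publications, citations), 3),
      round(spearman(publications, citations), 3))
```

A rank-based coefficient is often the safer choice for bibliometric data, because citation counts are typically highly skewed.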

Utility

Twenty-one studies compared research performance between individuals^{14,19–22,25,28,30–32,34,35,38,46,51–55,57,60} and 14 between specialties.^{15,16,20,21,29,31,36,40–42,45,46,59,61} All individuals were researchers in a range of healthcare specialties, and the most common specialties were medicine in general (14 studies)^{15,18–20,27,35,36,38–40,46,47,56,58} and psychology (11 studies).^{16,21–23,25,26,37,41,43,54} Eleven studies compared research performance between institutions, which included universities, national academies and hospitals in the USA, UK, Canada, Australia, New Zealand, France, Germany, Italy, Switzerland, Finland, Serbia, Croatia, Romania, Brazil and Iran.^{23,25,27,37,42,48,53,54,56,59,62} Thirteen studies compared research performance between countries, of which nine studies assessed performance globally and three studies assessed performance of the USA with the UK, Europe and Brazil.^{15,18,24,33,39,40,43,44,47,48,50,59,63}

Discussion

This is the first systematic review which identifies indicators for assessment of research performance in healthcare. The most widely used indicators include bibliometrics such as number of publications, number of citations and IF, h index, g index and AWCR. Less commonly used indicators include degree of co-authorship, number of conference presentations, number of patents, research funding, number of doctoral students, number of editorial responsibilities, peer review, gender, gross domestic product and population size. The utility of these indicators in assessing research performance in individuals, specialties, institutions and countries has been well described, but their feasibility, validity, reliability and acceptability have not been formally evaluated.

Measuring the number of publications and their citations are simple ways to signify influence. Although they are the most commonly used methods, it is hard to compare them among specialties or career stages. However, this shortcoming can be overcome by normalizing these indicators to scientific disciplines and experience at both individual and institutional levels.
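One way to picture such normalization is to express a citation count relative to the average for the same specialty, and an output count per year of research activity. The sketch below is an assumed, simplified scheme for illustration; it is not the specific relative citation factor defined in the cited studies, and all figures and function names are hypothetical.

```python
def field_normalised(citations, field_mean_citations):
    """Citations relative to the specialty average (1.0 = field average)."""
    return citations / field_mean_citations


def per_career_year(count, years_active):
    """Output (publications or citations) per year of research activity."""
    return count / years_active


# Hypothetical researchers in specialties with different citation habits.
surgeon = field_normalised(citations=120, field_mean_citations=60)
psychologist = field_normalised(citations=300, field_mean_citations=200)
print(surgeon, psychologist)  # 2.0 1.5

# Hypothetical junior researcher: 40 publications over 8 years.
print(per_career_year(40, 8))  # 5.0
```

Here the psychologist has more raw citations, but the surgeon performs better relative to the norm for the specialty.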

The h index considers both research productivity and its impact, although its use is limited by variations in individuals' age and their discipline. Several variants of the h index have been developed to address these drawbacks, for instance the g index, which gives more weight to highly cited articles.

The journal IF should be used cautiously, preferably as an adjunct to other methods, because it only considers the impact of journals and does not assess the performance of individual researchers or the impact of their publications.⁶⁵

Research funding and degree of co-authorship can be used in addition to the above mentioned indicators to measure individual, specialty and institutional performance. When measuring performance at global level, GDP and population size should be added to the performance assessment metrics.

Bibliometric research outputs are readily accessible from databases such as ISI, Scopus and Google Scholar. The methods of extracting these outputs should be transparent in all databases so that researchers are able to make an informed decision on the sources of their performance statistics (Table 2). A universally accepted framework needs agreement by the decision-makers in academia to standardize research outputs, so that the economic and societal impact of research can be measured. A recent example includes the STAR METRICS working group in the United States (Science and Technology in America's Reinvestment – Measuring the EffecT of Research on Innovation, Competitiveness and Science) who are developing a common empirical infrastructure.⁶⁶

Table 2

Characteristics and differences between the citation indexing databases^74–78

	Web of Science	Scopus	Google Scholar
Date of inauguration	Since early 1960s, but accessible via internet in 2004	11/2004	11/2004
Journals (n)	10,969	16,500 (>1200 open access journals)	Not revealed (theoretically all electronic resources)
Language	English (plus 45 other languages)	English (plus more than 30 other languages)	English (plus any language)
Subject coverage	Science, social science, and arts and humanities	Science and social science	Not revealed
Period covered	1900–present	1966–present	Not revealed
Updating	Weekly	1–2/week	Monthly
Developer	Thomson Scientific (US)	Elsevier (Netherlands)	Google Inc. (US)
Fee-based	Yes	Yes	No
H index calculation	Yes	Yes	Only using Harzing's Publish or Perish software

There are several limitations at a study and review level. Studies will be biased when authors evaluate their own performance or the performance of their affiliated specialties, institutions or countries. There is different coverage of peer-reviewed publications between bibliometric databases, so a source level bias will exist in studies which use a single data source. This systematic review was limited by the poor quality of the studies. In addition, meta-analysis could not be performed because of the diversity of the studies which did not have homogeneous methods or results.

This study has several implications: (1) further studies are needed to determine the feasibility, validity, reliability and acceptability of current and future research performance indicators; (2) specifically, it is important to assess the value of the h index because it measures the importance, broad impact, consistency and sustainability of a scientist's research; (3) co-authorship networks and changes in collaboration patterns over time should be analysed to establish whether they are important tools to assess and develop research performance; (4) the use of the IF to evaluate a researcher's performance needs to be investigated, since the IF has only been designed to measure journal performance; (5) researchers and policymakers can then debate what role the indicators should play, both in terms of the weighting and the level at which they should be incorporated into the decision-making process; (6) the balanced scorecard is a performance measurement framework that adds strategic non-financial performance measures to traditional financial metrics (Figure 3).⁶⁷ Although designed for business and industry, the balanced scorecard can be modified for non-profit and non-manufacturing research institutions.⁶⁸ This approach needs to be adapted by institutions to present a more unbiased view of research performance. This multifaceted method of research performance evaluation will require a multidimensional model of analysis utilizing a broad range of robust analytical techniques;⁶⁹ (7) enhanced healthcare research indices should be translated into improved healthcare outcomes because the principal aim of healthcare research is to improve patient wellbeing. It is now imperative to consider healthcare outcomes as opposed to research outputs. The use of healthcare outcomes can then determine important factors such as the societal and economic impact of healthcare research, in addition to awards of clinical prestige and quality.^70–73,79

Figure 3

Balanced scorecard showing performance areas of an organization⁶⁷ (in colour online)

Conclusions

Recently, there has been greater awareness of the importance of research performance indicators in healthcare. As a result the prevalence and usage of metrics such as number of publications, number of citations, IF and h index has increased. However, the assessment of feasibility, validity, reliability and acceptability of these indicators has been poorly investigated. Future studies are required to improve the current standards and accuracy of performance evaluation. It is imperative to have a balanced approach when measuring research performance in healthcare, which should consider quality and innovation. There is an increased need to consider the role of healthcare research outcomes in achieving societal and economic impact. The ultimate aim is to accurately quantify the research performance of healthcare individuals and institutions to cultivate an environment which can support translational medicine to improve the quality of patient care.

DECLARATIONS

Competing interests

None declared

Funding

None

Ethical approval

Not applicable

Guarantor

VMP

Contributorship

All authors contributed equally

Acknowledgements

None

References

Ioannidis

. Academic medicine: the evidence base. BMJ 2004;329:789–92

Kanter

. What is academic medicine? Academic Medicine 2008;8:205–6

Wilkinson

. ICRAM (the International Campaign to Revitalise Academic Medicine): agenda setting. BMJ 2004;329:787–9

Awasthi

, Beardmore

, Clark

, The Future of Academic Medicine: Five Scenarios to 2025 . New York, NY: Milbank Memorial Fund, 2005. See http://www.milbank.org/reports/0507FiveFutures/0507FiveFutures.html

Ashrafian

, Rao

, Darzi

, Athanasiou

. Benchmarking in surgical research. Lancet 2009;374:1045–7

RAE. Research Assessment Excercise. See http://www.rae.ac.uk/

IAF. Institutional Assessment Framework. See http://www.deewr.gov.au/HigherEducation/Resources/Pages/InstPerfPortfolios.aspx .

REF. Research Excellence Framework. See http://www.hefce.ac.uk/Research/ref/

Liberati

, Altman

, Tetzlaff

, The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ 2009;339:b2700

10.

AHMD. The American Heritage^® Medical Dictionary. 2010

11.

EBM. Oxford Centre for Evidence-based Medicine. See www.cebm.net/levels_of_evidence

12.

Ahmed

, Ashrafian

, Hanna

, Darzi

, Athanasiou

. Assessment of specialists in cardiovascular practice. Nat Rev Cardiol 2009;6:659–67

13.

Wass

, Van der Vleuten

, Shatzer

, Jones

. Assessment of clinical competence. Lancet 2001;357:945–9

14.

Armstrong

, Caverson

, Adams

, Taylor

, Olley

. Evaluation of the Heart and Stroke Foundation of Canada Research Scholarship Program: research productivity and impact. Can J Cardiol 1997;13:507–16

15.

Brown

, Coward

, Stowe

. Assessing the scientific strength of Chile. Arch Biol Med Exp 1991;24:37–47

16.

Buss

. Evaluation of Canadian psychology departments based upon citation and publication counts. Canadian Psychological Review/Psychologie canadienne 1976;17:143–50

17.

Byrnes

, McNamara

. Evaluating doctoral programs in the developmental sciences. Developmental Review 2001;21:326–54

18.

Divic

. Survey of Croatian science from Science Citation Index between 1985 and 1992 scientometrics – one possible approach. Periodicum Biologorum 1994;96:187–96

19.

Druss

, Marcus

. Tracking publication outcomes of National Institutes of Health grants. Am J Med 2005;118:658–63

20.

Ellwein

, Khachab

, Waldman

. Assessing research productivity: evaluating journal publication across academic departments. Acad Med 1989;64:319–25

21.

Endler

. Research productivity and scholarly impact of Canadian psychology departments. Canadian Psychological Review/Psychologie canadienne 1977;18:152–68

22.

Endler

. Where the “stars” are: The 25 most cited psychologists in Canada (1972–1976). Canadian Psychological Review/Psychologie canadienne 1979;20:12–21

23.

Endler

, Rushton

, Roediger

. Productivity and scholarly impact (citations) of British, Canadian, and U.S. departments of psychology (1975). American Psychologist 1978;33:1064–77

24.

Falagas

, Papastamataki

, Bliziotis

. A bibliometric analysis of research productivity in parasitology by different world regions during a 9-year period (1995–2003). BMC Infectious Diseases 2006;2334:56

25.

Gordon

. Research productivity in master's-level psychology programs. Professional Psychology: Research and Practice 1990;21:33–6

26.

Gordon

, Vicari

. Eminence in social psychology: A comparison of textbook citation, Social Sciences Citation Index, and research productivity rankings. Personality and Social Psychology Bulletin 1992;18:26–38

27.

Hendrix

. An analysis of bibliometric indicators, National Institutes of Health funding, and faculty size at Association of American Medical Colleges medical schools, 1997–2007. J Med Library Association 2008;96:324–34

28.

Housri

, Cheung

, Koniaris

, Zimmers

. Scientific impact of women in academic surgery. J Surg Res 2008;148:13–16

29.

Kaplan

, Mysiw

, Pease

. Academic productivity in physical medicine and rehabilitation. Am J Phys Med Rehabil 1992;71:81–5

30.

Koren

, Barker

, Mitchell

, Abramowitch

, Strofolino

, Buchwald

. Patient-based research in a tertiary pediatric centre: A pilot study of markers of scientific activity and productivity. Clin Invest Med 1997;20:354–8

31.

Lee

, Kraus

, Couldwell

. Use of the h index in neurosurgery. Clinical article. J Neurosurg 2009;111:387–92

32.

Lichtman

, Oakes

. The productivity and impact of The Leukemia & Lymphoma Society scholar program: The apparent positive effect of peer review. Blood Cells Mol Dis 2001;27:1020–7

33. Soteriades, Falagas. Comparison of amount of biomedical research originating from the European Union and the United States. BMJ 2005;331:192–4

34. Bornmann, Wallon, Ledin. Does the committee peer review select the best applicants for funding? An investigation of the selection process for two European molecular biology organization programmes. PLoS ONE 2008;3:e3480

35. Bovier, Guillain, Perneger. Productivity of medical research in Switzerland. J Investig Med 2001;49:77–84

36. Castellano, Radicchi. On the fairness of using relative indicators for comparing citation performance in different disciplines. Arch Immunol Ther Exp (Warsz) 2009;57:85–90

37. Colman, Grant, Henderson. Performance of British university psychology departments as measured by number of publications in BPS journals. Win, 1992–1993. Current Psychology 1992;11:362–71

38. Devos, Lefranc, Dufresne, Beuscart. From bibliometric analysis to research policy: the use of SIGAPS in Lille University Hospital. Stud Health Technol Inform 2006;124:543–8

39. Fava, Guidi, Sonino. How citation analysis can monitor the progress of research in clinical medicine. Psychother Psychosom 2004;73:331–3

40. Groneberg-Kloft, Scutaru, Kreiter, Kolzow, Fischer, Quarcoo. Institutional operating figures in basic and applied sciences: Scientometric analysis of quantitative output benchmarking. Health Res Policy Syst 2008;6:6

41. Innes. The utility of a citation index as a measure of research ability in psychology. Bull Br Psychol Soc 1973;26:227–8

42. Koskinen, Isohanni, Paajala. How to use bibliometric methods in evaluation of scientific research? An example from Finnish schizophrenia research. Nordic J Psychiatry 2008;62:136–43

43. Lewison, Thornicroft, Szmukler, Tansella. Fair assessment of the merits of psychiatric research. Br J Psychiatry 2007;190:314–18

44. Michalopoulos, Falagas. A bibliometric analysis of global research production in respiratory medicine. Chest 2005;128:3993–8

45. Radicchi, Fortunato, Castellano. Universality of citation distributions: Toward an objective measure of scientific impact. Proc Natl Acad Sci U S A 2008;105:17268–72

46. Rostami-Hodjegan, Tucker. Journal impact factors: a 'bioequivalence' issue? Br J Clin Pharmacol 2001;51:111–17

47. Tutarel. Geographical distribution of publications in the field of medical education. BMC Med Educ 2002;2:3

48. Ugolini, Bogliolo, Parodi, Casilli, Santi. Assessing research productivity in an oncology research institute: The role of the documentation center. Bull Med Libr Assoc 1997;85:33–8

49. Ugolini, Casilli, Mela. Assessing oncological productivity: Is one method sufficient? Eur J Cancer 2002;38:1121–5

50. Campos Jimenez, Campos Ferrer. Technological analysis of immunology in Spain, 2004–2005. Inmunologia 2007;26:55–61

51. Bornmann, Daniel. Selecting scientific excellence through committee peer review – A citation analysis of publications previously published to approval or rejection of post-doctoral research fellowship applicants. Scientometrics 2006;68:427–40

52. Wenneras, Wold. Nepotism and sexism in peer-review. Nature 1997;387:341–3

53. Brkic. The impact of biomedical literature published in the province of Vojvodina on researchers in the world and in Yugoslavia. Medicinski Pregled 2001;54:21–33

54. David, Moore, Domuta. Romanian psychology on the international psychological scene: A preliminary critical and empirical appraisal. European Psychologist 2002;7:153–60

55. Jokic. Scientometric evaluation of the projects in biology funded by the Ministry of Science and Technology, Republic of Croatia, in the 1991–1996 period. Periodicum Biologorum 2000;102:129–42

56. Puljak, Vukojevic, Kojundzic, Sapunar. Assessing clinical and life sciences performance of research institutions in Split, Croatia, 2000–2006. Croat Med J 2008;49:164–74

57. Vuckovic-Dekic, Ribaric, Vracar. Implementation of various criteria for evaluating the scientific output of professional scientists and clinicians-scientists. Arch Oncology 2001;9:103–6

58. Kellner, Ponciano. H-index in the Brazilian Academy of Sciences: comments and concerns. An Acad Bras Cienc 2008;80:771–81

59. Mugnaini, Packer, Meneghini. Comparison of scientists of the Brazilian Academy of Sciences and of the National Academy of Sciences of the USA on the basis of the h-index. Braz J Med Biol Res 2008;41:258–62

60. Torro-Alves, Herculano, Tercariol CAS, Filho, Graeff CFO. Hirsch's index: A case study conducted at the Faculdade de Filosofia, Ciencias e Letras de Ribeirao Preto, Universidade de Sao Paulo. Braz J Med Biol Res 2007;40:1529–36

61. Epstein. Journal impact factors do not equitably reflect academic staff performance in different medical subspecialties. J Investig Med 2004;52:531–6

62. Rezaei-Ghaleh, Azizi. The impact factor-based quality assessment of biomedical research institutes in Iran: Effect of impact factor normalization by subject. Arch Iran Med 2007;10:182–9

63. Hickie, Christensen, Davenport, Luscombe. Can we track the impact of Australian mental health research? Aust N Z J Psychiatry 2005;39:591–9

64. Hirsch. An index to quantify an individual's scientific research output. Proc Natl Acad Sci U S A 2005;102:16569–72

65. Seglen. Why the impact factor of journals should not be used for evaluating research. BMJ 1997;314:498–502

66. Macilwain. Science economics: What science is really worth. Nature 2010;465:682–4

67. Kaplan, Norton. The balanced scorecard – measures that drive performance. Harv Bus Rev 1992;70:71–9

68. Zelman, Blazer, Gower, Bumgarner, Cancilla. Issues for academic health centers to consider before implementing a balanced-scorecard effort. Acad Med 1999;74:1269–77

69. Uzoka FME. A fuzzy-enhanced multicriteria decision analysis model for evaluating university academics' research output. Information Knowledge Systems Management 2008;7:273–99

70. Smith. Measuring the social impact of research. BMJ 2001;323:528

71. Weiss. Measuring the impact of medical research: moving from outputs to outcomes. Am J Psychiatry 2007;164:206–14

72. Spaapen, Dijstelbloem, Wamelink. Evaluating Research in Context: A method for comprehensive assessment. 2nd edn. The Hague: Consultative Committee of Sector Councils for Research and Development (COS), 2007

73. Buxton, Hanney. How can payback from health services research be assessed? J Health Serv Res Policy 1996;1:35–43

74. Falagas, Pitsouni, Malietzis, Pappas. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. FASEB J 2008;22:338–42

75. Google. Google Scholar Beta Website. See http://scholar.google.co.uk/intl/en/scholar/about.html.

76. Thomson-Reuters. ISI Web of Knowledge Website. See www.isiknowledge.com/.

77. Elsevier. Scopus Website. See www.scopus.com.

78. Bornmann, Marx, Schier, Rahm, Thor, Daniel. Convergent validity of bibliometric Google Scholar data in the field of chemistry – Citation counts for papers that were accepted by Angewandte Chemie International Edition or rejected but published elsewhere, using Google Scholar, Science Citation Index, Scopus, and Chemical Abstracts. Journal of Informetrics 2009;3:27–35

79. Ashrafian, Patel, Skapinakis, Athanasiou. Nobel Prizes in Medicine: are clinicians out of fashion? J R Soc Med (in press)