Abstract
Background:
Guidance for measuring team effectiveness in dynamic clinical settings is necessary; however, there are no consensus strategies to help health care organizations achieve optimal teamwork. This systematic review aims to identify validated survey instruments of team effectiveness by clinical settings.
Methods:
PubMed, MEDLINE, and ISI Web of Knowledge were searched for team effectiveness surveys deployed from 1990 to 2016. Validity and reliability were evaluated using 4 psychometric properties: interrater agreement, internal consistency, content validity, and structural integrity. Two conceptual frameworks, the Donabedian model and the Command Team Effectiveness model, were used to assess the conceptual dimensions most often measured in each health care setting.
Results:
The 22 articles focused on surgical, primary care, and other health care settings. Few instruments report the required psychometric properties or feature non-self-reported outcomes. The major conceptual dimensions measured in the survey instruments differed across settings. Team cohesion and overall perceived team effectiveness can be found in all the team effectiveness measurement tools regardless of the health care setting. We found that surgical settings have distinctive conditions for measuring team effectiveness relative to primary or ambulatory care.
Discussion:
Further development of setting-specific team effectiveness measurement tools can help enhance continuous quality improvement and clinical outcomes.
Background
Team-based care has become a key component of many transformations in health care delivery and emerging models of value-based care. 1 The complexity of health care services, including a continuing trend toward value-based care and pay for performance, has elevated the importance of team-based care in the delivery of health services.
Previous research indicates that higher team effectiveness is associated with better health outcomes.2–4 The impact of high-functioning teams on quality of care, worker satisfaction, and cost of care can be substantial when it comes to surgical care, 5 intensive care, 6 ambulatory care, 7 and primary care managing patient populations with chronic conditions. 8 Despite this knowledge and growing awareness of the importance of teamwork among health care leaders, there are no consensus strategies to help health care organizations achieve optimal teamwork. 9 One of the first steps in achieving optimal team performance is the ability to measure, track, and influence team effectiveness. Therefore, systematic reviews of survey instruments measuring team effectiveness across health care settings provide an important first step.9–11
Among the previous systematic reviews on survey instruments, Valentine et al 11 inventoried and described a list of teamwork survey tools used in health care settings. Their findings indicate what dimensions of teamwork have been assessed, along with the psychometric validity of each survey. However, this study did not specify which health care setting each measurement tool and article has addressed. Other review studies on team effectiveness did not identify which survey tools should be applied in which type of setting.9,10 We are left with the question of which conceptual dimensions are most relevant to various types of health care settings and what survey instruments are most often deployed in these settings. The complexity and dynamic nature of these settings create different conditions for teamwork; therefore, understanding these specific requirements is key in developing measurement tools. There is an apparent need for guidance on understanding the contextual nature of teamwork skills and performance for different settings, including surgery, intensive care, emergency medicine, and ambulatory care settings.
This study seeks to identify validated survey instruments by clinical settings. This objective renews and complements findings from the study by Valentine et al, 11 building on psychometric properties and concepts used in available survey instruments. Our evaluation of survey instruments is supported by 2 conceptual frameworks: one rooted in outcomes research from the field of health services research 12 and the other in team theory and organizational psychology. 13 By identifying which content domains were assessed and how the domains differ by team environment, the findings of this study can assist in the development of more specialized team member training and operational design interventions directed to the most appropriate team composition, team member tasks, and responsibilities by clinical settings. The findings will also benefit practitioners who wish to ascertain the usefulness and relevance of a particular tool for their care teams in various health care settings.
Methods
Conceptual framework
Teams are defined as “two or more people with different tasks who work together adaptively to achieve specified and shared goals.” 14 (p4) Compared with teams in other industries, health care teams work under more dynamic conditions that change frequently, change membership over short periods, include a variety of specialized members, and have interprofessional and even multidisciplinary cultures. 9 These unique conditions vary across health care settings. Teams in operating rooms and emergency medicine are more likely to be assembled ad hoc and to experience changes in membership, whereas teams in primary care can have more diverse compositions than surgical teams, including physicians, nurse practitioners, medical assistants, and receptionists.9,15 Such different work conditions and diverse team compositions require integrating multiple conceptual frameworks for evaluating team effectiveness.
We employed 2 conceptual models as frameworks for evaluating the contextual nature of team effectiveness across different settings: (1) the Donabedian 12 model on quality of care and (2) the Command Team Effectiveness 16 (CTEF) model. Both conceptual frameworks highlight outcome domains in the streams of care that fit our focus on team effectiveness. Donabedian contends that it is important to identify essential elements constituting quality of care based on structure, process, and outcome. Structure refers to the attributes of material resources, human resources, and organizational structure. Process denotes the actual activities in giving and receiving care, including patients’ activities and practitioners’ activities. Outcome refers to the impact of care on the patients’ health status. Since Donabedian proposed this model, many studies have confirmed that structure, process, and outcome should be evaluated together when considering quality of health care.17–19 Therefore, we assessed the relationship among the 3 subdimensions and investigated how survey instruments were used to measure these concepts across health care settings. This comprehensive framework also aids in determining which conceptual dimensions have been understudied when measuring team effectiveness.
In addition to the Donabedian model, we applied the CTEF model to refine the framework of analysis to be more relevant to surgical teams, which have the unique attributes of action teams.20,21 According to the organizational psychology literature, “action teams” are teams, such as emergency medical teams, surgical teams, air crews, and military command and control teams, that require specialized professionals to collaborate in the context of high-acuity, complex tasks, ad hoc team compositions, and time-pressured conditions.22,23 As our focus is identifying validated survey instruments according to the attributes of teams, the CTEF model allows us to closely investigate key subcomponents of team effectiveness in surgical settings according to their unique team characteristics. 24
In the CTEF model, conditions include dimensions of mission framework, task, organizational characteristics, leadership, and the characteristics of team members. These dimensions are addressed in the generic team effectiveness frameworks in health care as well,20,25 but the CTEF model measures outcomes based on 2 categories—task outcomes and team outcomes. Task-related outcomes include time-error costs, task accomplishment quality, accuracy, timeliness, and error rate, whereas team-related outcomes measure team satisfaction, team norms, roles, communication patterns, motivation, attitudes, emotional tone, and turnover. By differentiating task-related and team-related outcomes, the CTEF model provides a more specific criterion for the evaluation of Donabedian structure, process, and outcomes relevant to action teams in surgical settings.
Data collection
We conducted a systematic review of the literature, searching for survey instruments measuring team effectiveness. We conceptualized 3 relevant dimensions—survey instruments, clinical setting, and team effectiveness—as search points. Compared with Valentine et al, 11 we expanded the scope of search dimensions and database to investigate team effectiveness survey tools across different health care settings. Next, we identified key terms for each dimension. For the survey instrument dimension, we selected “survey,” “evaluation,” “instrument,” “assessment,” and “questionnaire” as the search terms. For the dimension of the clinical setting, we included key terms such as “clinical,” “health care,” and “surgical” to limit our search to only the health care domains. Because our focus is team effectiveness itself, we included the key term “team effectiveness” for the last dimension. The focus of this study was to identify and evaluate survey instruments of team effectiveness related to health outcomes; the general studies on team(s) or teamwork(s) without outcome domains were excluded.
We used 3 databases for this systematic review of the literature: PubMed, MEDLINE via OVID, and ISI Web of Knowledge. We included articles using the selected key terms from January 1990 to October 2016. In every search step, we combined key terms in the 3 dimensions using the AND operator (eg, “survey” AND “clinical” AND “team effectiveness”). We limited our search to titles and abstracts. This search strategy identified articles in the overlapping areas of the 3 dimensions. The first round of searching produced 646 articles of interest. After deleting duplicates (492), a total of 191 articles were included in our final review.
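The combination step described above can be sketched in a few lines of Python. This is an illustration only, not the authors' actual search script: the function name `build_queries` and the constant names are hypothetical, the clinical-setting terms are introduced in the text with "such as" and may therefore be incomplete, and the database-specific syntax for restricting a query to titles and abstracts is not modeled.

```python
from itertools import product

# Search terms as listed in the text. The clinical-setting list is given
# with "such as", so it may not be exhaustive (assumption).
SURVEY_TERMS = ["survey", "evaluation", "instrument", "assessment", "questionnaire"]
SETTING_TERMS = ["clinical", "health care", "surgical"]
EFFECTIVENESS_TERMS = ["team effectiveness"]


def build_queries(*dimensions):
    """Return one AND-joined query string per combination of terms,
    taking exactly one term from each search dimension."""
    return [
        " AND ".join(f'"{term}"' for term in combo)
        for combo in product(*dimensions)
    ]


queries = build_queries(SURVEY_TERMS, SETTING_TERMS, EFFECTIVENESS_TERMS)
for query in queries:
    print(query)
```

Taking one term per dimension means every query targets the overlap of the 3 dimensions; running each query against each database and pooling the results would naturally produce duplicates that need to be removed afterward.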
Next, we conducted an abstract review to select those articles that matched our predefined inclusion criteria: (1) articles should be peer-reviewed, (2) articles should be empirical studies on teams, (3) the studies should demonstrate the use of survey instruments, and (4) the survey instrument should include a team effectiveness measure. In all, 22 articles met the criteria for full-text review. Researchers extracted information from the selected articles focusing on team compositions, types of settings, types of surveys, dimensions of team effectiveness, and significant impact on non-self-reporting outcomes (if applicable). In the final step, researchers evaluated reliability and validity of survey tools for each article. Figure 1 summarizes the research approach and search steps in a flowchart using the PRISMA methodology. 26

Figure 1. PRISMA flowchart for systematic review.
Analysis strategy
Following Donabedian and the CTEF model, we first categorized what type of questions researchers asked health care team members based on structure, process, and outcome. Then, we identified the setting to describe organizational preconditions of team effectiveness. Two of the researchers independently reviewed items in each survey tool identified in the literature to categorize based on these criteria, following terminologies in Mathieu et al. 25 We then qualitatively assessed the subcategories based on survey items that describe consistent terminologies.
To evaluate the reliability and validity of each survey instrument, we used 4 psychometric properties: interrater agreement (IRA), internal consistency, content validity, and structural integrity. These 4 properties have been used by other researchers to assess how accurately a survey instrument captures what it is intended to measure.11,27,28
Interrater agreement and interrater reliability (IRR) measure whether different raters provide similar or identical reports when faced with the same survey instrument. In particular, when assessing survey instruments of teamwork that include multiple groups of professionals, researchers should report both IRA and IRR to justify aggregating scores at the group level. 29 When a team has different groups of professionals, such as physicians, nurses, specialists, or administrative staff, IRA and IRR indicate whether a survey produces similar responses on the conceptual dimensions across the groups of participants. Interrater agreement is measured by the rwg index, 30 which ranges from 0 to 1, where .7 is used as the minimum threshold for an acceptable value.30,31 Interrater reliability is measured by an intraclass correlation coefficient (ICC), which ranges from −1 to 1; ICC values should be greater than 0 to indicate acceptable similarity. 32 Interrater agreement and interrater reliability are key properties of a survey measuring team effectiveness because the survey aims to assess team members’ behaviors and achievement as a group. When IRA and IRR produce satisfactory values, they support the reliability of the survey instrument for measuring a team of individuals.
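As a minimal sketch of how these two statistics can be computed, the Python below implements the single-item form of the rwg index (observed variance compared with the variance expected under a uniform "no agreement" response pattern) and the one-way ANOVA form of the ICC, often written ICC(1). The reviewed articles may have used other variants (eg, multi-item rwg(j) or a different ICC model), so the choice of formulas, the function names, and the example ratings are all assumptions made for illustration.

```python
from statistics import mean, variance


def rwg_single_item(ratings, n_options):
    """Single-item r_wg: 1 - (observed variance / uniform-null variance).

    ratings   -- scores given by the members of one team on one item
    n_options -- number of response options on the scale (eg, 5 for a 5-point scale)
    """
    expected_var = (n_options ** 2 - 1) / 12.0  # variance of a discrete uniform distribution
    observed_var = variance(ratings)            # sample variance (n - 1 denominator)
    # Note: the raw formula can go negative when ratings are more dispersed than
    # the uniform null; such values are commonly reported as 0.
    return 1.0 - observed_var / expected_var


def icc1(groups):
    """ICC(1) from a one-way ANOVA; this sketch assumes equal-sized groups."""
    k = len(groups[0])          # raters per team
    n_groups = len(groups)
    grand_mean = mean(x for g in groups for x in g)
    ms_between = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n_groups - 1)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n_groups * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical ratings from one 5-member team on a 5-point item.
team_ratings = [4, 4, 5, 4, 5]
agreement = rwg_single_item(team_ratings, n_options=5)
print(round(agreement, 2), agreement >= 0.7)

# Hypothetical ratings from three 3-member teams.
teams = [[4, 4, 5], [2, 2, 3], [5, 5, 4]]
print(icc1(teams) > 0)
```

The two thresholds in the example (.7 for rwg and 0 for ICC) are the ones stated in the text; agreement is checked within a team, whereas the ICC compares variation between teams with variation within them.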
Internal consistency indicates the degree to which survey items are correlated with each other. When items are strongly correlated, the survey can be taken to reflect similar concepts across items. Cronbach α, the most commonly used measure of internal consistency, ranges from negative infinity to 1, with values greater than .7 defined as acceptable consistency across items. 33
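For illustration, Cronbach α can be computed directly from its standard definition, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below uses only the Python standard library; the function name and the response data are hypothetical and are not drawn from any of the reviewed instruments.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    item_columns = list(zip(*responses))                  # one tuple of scores per item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents answering 3 Likert-type items.
example = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(example)
print(round(alpha, 3), alpha > 0.7)
```

When every item moves in lockstep with the others, the sum of the item variances is small relative to the variance of the total score and α approaches 1; uncorrelated items push α toward 0 or below.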
Content validity refers to whether a survey accurately measures the substantive meanings of the conceptual dimensions of interest. To test content validity of a survey, triangulation is highly recommended, which requires the use of different methodologies other than a survey, such as interview, qualitative field research, expert-reviewed survey, formal pretest, or pilot survey, to measure the same conceptual dimensions of interest. When developing survey instruments of team effectiveness in different settings, researchers also conduct systematic reviews to develop items and apply previously validated scales. This is the only measure of validity in this study that reflects whether a survey captures the true dimensions that are of real-world interest.
Structural integrity refers to the extent to which survey items are clustered with a high covariance. When a survey aims to measure a single conceptual dimension, all survey items should load on one dimension as expected. Structural integrity reveals the number of conceptual dimensions by assessing covariance among survey items and provides evidence of construct dimensionality. Exploratory and confirmatory factor analyses provide the percentage of variance that can be explained by the constructed factors. When the factor loading value is greater than 0.40 and the eigenvalue is greater than 1.0, the structural integrity of the survey is considered acceptable.34,35
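The decision rule above can be sketched as a simple check applied to the output of a factor analysis. The Python below does not perform the factor analysis itself; it assumes that eigenvalues and item loadings have already been obtained from statistical software, and that the eigenvalues come from the correlation matrix of the items (so that they sum to the number of items). The function name, the dictionary layout, and all numbers are hypothetical.

```python
def assess_structure(eigenvalues, loadings, min_eigenvalue=1.0, min_loading=0.40):
    """Apply the eigenvalue > 1.0 and factor loading > 0.40 thresholds.

    eigenvalues -- one eigenvalue per factor, from the item correlation matrix
    loadings    -- dict mapping item name -> list of loadings, one per factor,
                   in the same order as `eigenvalues`
    """
    n_items = len(loadings)
    retained = [i for i, ev in enumerate(eigenvalues) if ev > min_eigenvalue]
    # With a correlation matrix, total variance equals the number of items.
    variance_explained = sum(eigenvalues[i] for i in retained) / n_items
    # An item is flagged when it does not load above the threshold on any retained factor.
    weak_items = [
        item for item, row in loadings.items()
        if max((abs(row[i]) for i in retained), default=0.0) <= min_loading
    ]
    return {
        "retained_factors": retained,
        "variance_explained": variance_explained,
        "weak_items": weak_items,
        "acceptable": bool(retained) and not weak_items,
    }


# Hypothetical output of a factor analysis on a 5-item survey.
eigenvalues = [2.6, 1.2, 0.6, 0.4, 0.2]
loadings = {
    "item1": [0.81, 0.10],
    "item2": [0.75, 0.22],
    "item3": [0.15, 0.78],
    "item4": [0.30, 0.35],
    "item5": [0.68, -0.12],
}
result = assess_structure(eigenvalues, loadings)
print(result)
```

In this made-up example two factors exceed the eigenvalue threshold, and one item fails to load above 0.40 on either of them, which is the kind of pattern that would prompt revising or dropping the item.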
Ethical considerations
Human subjects were not involved in this systematic review. Ethics review and study registration do not apply. The data sets used during the study are available on reasonable request to the corresponding author.
Results
We analyzed 22 articles with survey instruments measuring team effectiveness. The articles address a variety of clinical settings, types of respondents, variables of interest, and team compositions, as described in Table 1. Of the 22 articles, 19 administered survey questions to team members in formal/informal clinical teams or to potential team members (eg, health care faculty who train clinical professionals), whereas 3 used third-party experts52,53 and patients 46 as respondents who assessed the clinical team effectiveness.
Table 1. Description of clinical setting and team composition.
Abbreviations: D, development of a survey; DV, dependent variable; IV, independent variable; MD, medical doctor or physician.
We investigated how each article used team effectiveness as a variable of interest. Of the 22 articles, 9 articles focused on the development of the instrument and performed validity and reliability testing. As Table 1 indicates, those articles assessed dimensions of team effectiveness and analyzed the relationship among subdimensions of teamwork in terms of team functions, conditions of teamwork, leadership, and team effectiveness. As those articles mainly evaluated reliability and validity of newly developed survey questionnaires, most of the articles reported psychometric properties. Of the 22 articles, 5 articles measured team effectiveness as antecedents of health care outcomes, including task performance, patient trust, patient-centered care improvement, patient discharge rates, and length of stay. Eight articles were interested in measuring team effectiveness as consequences of care management, such as team training program, a simulation-setting training program, team functioning, team attitudes, team communication, and team quality.
Six articles focused on surgical settings, including trauma resuscitations, intensive care, anesthesia care, operating room setting, and general/vascular surgery. Five articles mainly discussed team effectiveness in primary care settings. The other 11 articles had a variety of clinical settings, such as a diabetes training program, a mental health hospital, a geriatric health care setting, chronic illness programs, Veterans Affairs (VA) hospitals, home care, long-term care, and ambulatory care settings. As team dynamics on team composition, functioning, structure, and process can vary depending on the clinical setting, we qualitatively analyzed which types of conceptual dimensions were most addressed in each type of clinical setting.
Team composition is key to measuring team effectiveness in the health care setting. We investigated what types of health care professionals were involved in teamwork. As Table 1 indicates, of the 22 articles, 17 survey instruments included physicians and nurses. Other professionals often included in a team are clinical specialists, social workers, administrative clerks, health care executives, consultants, midwives, occupational therapists, dietitians, anesthesia residents, and receptionists. Such diverse team compositions indicate that it is important to consider each profession’s characteristics, norms, cultures, and functions in a team when measuring team effectiveness.
The psychometric properties (IRA/IRR, internal consistency, content validity, and structural integrity) indicate whether a survey is a valid and reliable measure of team effectiveness. As noted in Table 2, few studies reported all 4 psychometric properties: of the 21 articles, only 4 articles (19%) did so. Interestingly, only 5 articles reported IRA/IRR scores, and 1 of them had an unacceptably low rwg, which indicates the need for more high-quality survey instruments with high validity and reliability. Compared with IRA/IRR, internal consistency was more frequently reported: 16 out of 21 articles reported Cronbach α values. When the studies aimed to develop a survey tool (or measure team effectiveness as a dependent variable), the internal consistency test was often conducted to show how the survey items were constructed. Content validity was reported in most of the studies: 19 studies reported that their survey items were constructed through literature review, an existing survey with validity tests, a 3-phase qualitative study based on a formative evaluative approach, or expert interview. Most studies had adopted existing surveys from the literature or modified survey items to fit the health care domain. In terms of structural integrity, about half of the selected studies reported the factor loading value of their survey items, and 11 articles reported covariance among survey items with factor loading value and eigenvalue.
Table 2. Psychometric properties of survey instruments on team effectiveness.
Abbreviations: ANOVA, analysis of variance; CFA, confirmatory factor analysis; EV, eigenvalues; FL, factor loadings; ICC, intraclass correlation coefficient; IRA, interrater agreement; IRR, interrater reliability; NR, not reported; rwg = James, Demaree, and Wolf’s interrater agreement indices; Var Exp, variance explained.
An article that reports all 4 psychometric properties.
The results indicate that survey instruments are widely used when measuring the dynamics of teamwork and individual team member behaviors; however, when team effectiveness is self-reported using Likert-type questionnaires, team members can report biased answers on their teamwork or unintentionally overestimate their team effectiveness.54,55 To avoid human error and bias in survey responses, it is important to measure non-self-reported outcomes and compare these with the self-reported team effectiveness results from a survey. Only 7 articles (31%) reported non-self-reported outcomes related to their survey measures; these articles revealed that when team members reported high team effectiveness, the objective outcomes (eg, task performance, quality of care, length of stay, and patient discharge) were also improved. Our findings indicate that the adequacy of survey instruments on team effectiveness linked to actual outcomes is still understudied.
Tables 3 to 5 show the conceptual dimensions of team effectiveness measured in surveys across different clinical settings. First, we focused on the surgical setting to investigate what conceptual dimensions were most measured. Applying the CTEF model in addition to the Donabedian model, we identified the specific subdimensions of structure, process, and outcomes, taking into account surgical teams’ high-acuity, complex-task, and time-sensitive conditions. We found that structure and outcome dimensions were most often measured in the surgical settings. In particular, team skills, task specialization, and working conditions were the unique dimensions found in the surgical setting. In terms of outcomes, “task competency” and “would refer others to this team” were measured only in surgical settings. In the process category, the “team coordination” and “value the teamwork” dimensions were most measured across the selected articles in the surgical setting. The prevalence of these dimensions confirms that surgical teams require interprofessional coordination and that team members must value teamwork to achieve high-quality care. Consistent with the literature on action teams, the findings indicate that survey tools in surgical settings more often focus on team skills, specialization, and coordination to assess team effectiveness. The findings also support the view that task performance is considered highly important and is differentiated from team performance in the surgical setting.
Table 3. The conceptual dimensions of team effectiveness in the surgical setting.
In the surgical settings, we used both the Donabedian and CTEF frameworks to identify the contextual nature of teamwork skills and performance.
Concepts shown only in surgical settings.
Table 4. The conceptual dimensions of team effectiveness in the primary care setting.
Table 5. The conceptual dimensions of team effectiveness in the other clinical settings.
Common dimensions across settings.
Unique dimensions found in the other clinical settings (excluding surgical and primary care settings).
Table 4 presents the conceptual dimensions measured by surveys in the primary care setting. Interestingly, primary care setting surveys are more focused on team coordination and value of the teamwork. In the process category, team collaboration, participation in decision making, and support for innovation were mostly found in surveys administered in primary care settings. This indicates that primary care settings recruit both internal and external professions to the host primary care organizations, so how they communicate with each other and how much they share goals/vision is key to enhanced team effectiveness. The right mix of team composition is also an important issue in the primary care setting. Compared with the surgical setting, the primary care setting seems more focused on team performance than task performance. The primary care setting surveys mostly measured outcome as overall perceived team effectiveness, collective team efficacy, and team members’ job satisfaction. Teamwork can be differently perceived and evaluated based on the type of clinical setting, as illustrated by the differential appearance and prevalence in conceptual dimensions surveyed across primary care and surgical settings.
Table 5 shows the conceptual dimensions of other clinical settings. Due to the large variety and range of clinical settings, we listed the type of setting under each article. Overall, we found that there were common dimensions across all these settings. Recognizing leadership, commitment to patients, and clear roles/responsibilities were found in most articles regardless of setting. These dimensions are common structural conditions that health care teams share to achieve or improve team effectiveness. In the outcome category, team cohesion and overall perceived team effectiveness can be found regardless of health care setting. This means that team-based performance is usually measured as part of a survey instrument regardless of health care setting. Patient outcomes were only addressed in the survey tools in the other settings. Patient safety was addressed in 2 surveys, whereas improved patient well-being was measured in 5 surveys. In terms of process, communication among team members and team coordination were commonly found across settings; these dimensions were found in both surgical and primary care settings as well. The findings indicate that communication and coordination can be key to promoting team effectiveness in health care where different types of professionals and specialties are required to work together.
Discussion
Effective teamwork in health care contributes to a positive organizational culture and improves patient safety and outcomes. Developing accurate methods for measuring team effectiveness will be crucial to help drive quality improvement. In addition, these methods may differ depending on the clinical setting in which they are deployed. We found that survey tools have been used to measure team effectiveness as an outcome or as a tool for developing models of team effectiveness. Most survey tools were implemented in primary care or surgical settings; thus, more work is required to develop valid survey tools in other clinical settings, such as ambulatory care, cancer care, rehabilitation service, and long-term care.
Regardless of the clinical setting, studies measuring team effectiveness using surveys should also include measures of the surveys’ psychometric properties. The inclusion of those properties adds credibility to the measurement instruments and helps future researchers study team effectiveness and develop new and improved measurement instruments. Of 22 articles, we found that only 4 articles (18%) reported all 4 psychometric properties and only 7 articles (31%) reported non-self-reported outcomes related to their survey measures. This finding reveals that the adequacy of the survey instruments still needs to be assessed.
Regarding conceptual dimensions in survey instruments, we found that the focus on outcome measures in the survey instruments differs across settings. The surveys administered in the primary care setting are more likely to focus on team performance than those administered in surgical settings. In particular, most survey instruments in surgical settings distinguish task-specific components from team-related components, which supports our use of the CTEF model as a conceptual framework. The distinctiveness of action teams in the surgical setting lies in the need to coordinate different professions over a short period and under time pressure. In this dynamic process, team members need to monitor progress toward goals and provide real-time feedback so that any errors or misunderstandings are recognized and corrected. 20 Additional research on surgical team effectiveness may be particularly useful because of the fast-paced nature of the operating room setting. Decisions in operating rooms are often made rapidly, with limited information, and hold serious consequences for the patient. Surgical teams comprise a variety of health care professionals including surgeon(s), operating room nurse(s), and the anesthesia care teams. Therefore, determining shared characteristics among high-functioning surgical teams would help providers and administrators improve efficiency, effectiveness, and quality across a variety of delivery models and settings. There are likely factors beyond surgical team composition that influence team effectiveness, and these could be captured through qualitative means such as survey assessments. Research is needed to identify those factors, how they can be accurately measured, and how they affect team members and patients.
Interestingly, the findings indicate that existing survey instruments are less likely to address patient outcome as a key subdimension of outcomes. Only 5 survey tools in other health care settings recognize patient safety and improved patient well-being as their subdimensions of outcomes, whereas none of the survey instruments in the surgical or primary care settings explicitly measure patient outcomes as their key conceptual dimension. This is a notable finding because teamwork and team effectiveness are highlighted in the context of value-based payment.56,57 To tie team effectiveness to value that actually improves care for patients, more attention to patient outcomes is needed when developing survey tools of team effectiveness.
Our study selection criteria were limited to the identification and evaluation of survey instruments of team effectiveness related to health outcomes and should not be construed as covering general studies on team(s) or teamwork(s) without outcome domains. Also, it is possible that the choice of conceptual framework can oversimplify or conflate distinct features of different health care settings. Our study mitigates this risk by using multiple models to inventory the instruments under study, supplementing the Donabedian model with CTEF to account for the surgical setting.
Conclusions
The Medicare Access and CHIP Reauthorization Act of 2015 (MACRA) and initiatives in the commercial insurance market have fueled the shift to value-based payments and have driven required changes in care delivery models, including team-based care. New alternative payment models require a renewed emphasis on care coordination and team effectiveness. We report on team effectiveness measurement tools in a variety of health care settings in this article. Our findings indicate that more valid, context-sensitive survey tools need to be developed for health care settings. Our findings also reveal that patient outcomes should be addressed more thoroughly as key dimensions of outcomes when measuring team effectiveness.
In addition, we found that surgical settings have distinctive conditions for measuring team effectiveness relative to primary care or ambulatory care. As evidenced by programs such as Enhanced Recovery After Surgery (ERAS), the Perioperative Surgical Home (PSH), and Medicare’s Comprehensive Care for Joint Replacement Model (CJR), the operating room has become a critical setting for team-based care delivery58–60; thus, more validated survey instruments focused on surgical action teams are needed. Further development of setting-specific team effectiveness evaluation tools, in settings such as chronic illness care, home care, long-term care, and ambulatory care, can enhance continuous quality improvement and patient outcomes in the future.
Footnotes
Acknowledgements
The authors wish to acknowledge our Senior Research Assistant, Jacob Kolman, for technical assistance on manuscript formatting and clarity.
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: This project was funded by the American Society of Anesthesiologists.
Declaration of conflicting interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Author Contributions
BAK framed research design and wrote the conceptual frameworks. OC collected and analyzed the systematic review data regarding the psychometric properties and conceptual dimensions. NMH and TRM interpreted the results and contributed to writing the discussion part. All authors read and approved the final manuscript.
