Abstract
Competing in today’s workforce increasingly requires earning a college degree, yet almost half of all enrolled undergraduates do not graduate. As the costs of dropping out of college continue to rise, instructor-student relationships may be a critical yet underexplored avenue for improving college student outcomes. The present study attempts to replicate and extend, in a college setting, a prior study that improved teacher-student relationships at the high school level. In this registered report, we test whether an intervention that highlights instructor-student commonalities improves perceived similarity, instructor-student relationships, academic achievement, and persistence for undergraduate students at a large, diverse public university. We found that the intervention increased perceptions of similarity but not downstream relational or academic outcomes. Our exploratory analyses provide one of the first investigations suggesting that instructor-student relationships predict an array of consequential student outcomes in college. These findings also reveal a notable relationship gap: instructors perceived less positive relationships with certain student groups, even though, on average, students perceived equally positive relationships with their instructors.
While educators and policymakers have commendably focused on getting more students into college, too little attention has been paid to helping them graduate. As a result, unacceptably many students never complete their studies.
The goalposts for educational attainment have moved. A high school diploma no longer secures occupational and financial stability for graduates. The changing metrics of educational success have led schools and society at large to promote the norm of “college for all” (Goyette, 2008), resulting in an increasing number of U.S. high school graduates enrolling in college (Snyder, de Brey, & Dillow, 2016). At the same time, the proportion of students earning college degrees has not kept pace (Bound, Lovenheim, & Turner, 2007). Almost half of the students who enroll in a 4-year college will not earn a degree within 6 years (Kena et al., 2015). Students of color, especially those who traditionally lack opportunities, and first-generation students face even lower graduation rates than their White and continuing-generation peers (Engle & Tinto, 2008; Shapiro et al., 2017), further exacerbating social and economic disparities. As compared with high school graduates, college graduates earn $32,000 more annually (Trostel, 2017), receive more employment fringe benefits (e.g., health insurance and retirement contributions; Oreopoulos & Salvanes, 2011), and live longer (Meara, Richards, & Cutler, 2008).
So, while the “college for all” movement may have successfully increased postsecondary enrollment, it has failed to account for the barriers facing students upon arrival to campus. Higher education institutions are starting to recognize that students come from diverse backgrounds and therefore matriculate having experienced vastly different opportunities. Now, universities are being challenged to become “student ready” rather than demanding that students be “college ready” (McNair, Albertine, Cooper, McDonald, & Major, 2016). Understanding the various reasons why students struggle to persist in college can help colleges optimize student success. For instance, take research suggesting that students who perceive fewer social relationships are particularly at risk (Pascarella, Pierson, Wolniak, & Terenzini, 2004), and imagine if college campuses intentionally developed ways to cultivate more positive relationships. When deciding which student relationships to focus on in college, many have theorized classroom-based instructor-student relationships (ISRs) to be among the most important (Johnson, 2009; Tinto, 1997, 2006). As educators, policy makers, and academics become progressively more concerned with college persistence (Complete College America, 2011; Reason, 2009), ISRs may be a critical yet underexplored avenue for improving student outcomes.
Classroom relationships between instructors and their students have proven to be a key predictor of student academic and motivational outcomes, but the majority of the research to date has been conducted in K–12 settings (Brinkworth, McIntyre, Juraschek, & Gehlbach, 2018; Cornelius-White, 2007; Midgley, Feldlaufer, & Eccles, 1988; Roorda, Koomen, Spilt, & Oort, 2011; Wentzel, 1998). On one hand, no evidence suggests that the importance of ISRs diminishes for college students. Given that teaching and learning remain fundamentally social acts (Gehlbach, 2010; Goodenow, 1992) in college settings, ISRs may predict desired student outcomes strongly at the postsecondary level as well. On the other hand, college students tend to be more mature (Newcomb & Bentler, 1987), experience more academic and social freedom (Pintrich, 2004), and have stronger peer relationships (Carbery & Buhrmester, 1998; Fraley & Davis, 1997) than K–12 students, all of which could suggest that ISRs may be less integral to college students’ educational success. Perhaps many studies have explored the effects of college ISRs and have found (ostensibly unpublishable) null results, leading to a lack of publications and the appearance of no research. Or perhaps, as research conducted in K–12 settings suggests, ISRs remain an untapped lever for improving college student outcomes. Although one can easily paint pictures of how ISRs ought to matter more, less, or the same at the college level, the point is that we know little about these relationships. Likewise, we know almost nothing about whether college-level ISRs manifest the same strong associations with the constellation of student academic and motivational outcomes that they do in K–12 classrooms.
Assessing whether improving ISRs affects college student outcomes necessitates understanding how to improve relationships between instructors and students. Recent findings indicate that increasing perceptions of similarity between high school instructors and their students can improve ISRs, which correspond to improved grades (Gehlbach et al., 2016). We attempted to replicate and extend this study at a large public university by deploying a preregistered intervention to improve ISRs and measuring a variety of downstream outcomes, including grades and persistence. The intervention had no detectable effect on ISRs, but the data collected during the study provided some of the first empirical evidence of the strong correlation between ISRs and important student outcomes at the postsecondary level.
Barriers to Persistence in College
As postsecondary costs continue rising and the consequences of dropping out become increasingly debilitating, understanding how to improve college students’ academic performance and retention has become more pressing. While some students drop out of college because of academic failure, the majority withdraw voluntarily (Leppel, 2002; Tinto, 1993). The literature on college persistence suggests that other factors, including cultural barriers (Mitchell, 1997), a lack of access to informed others (Thayer, 2000), adjustment issues (Hicks, 2003; Thayer, 2000), and motivational challenges (Hidi & Harackiewicz, 2000; Petty, 2014), all contribute to student dropout. One significant yet largely uncharted aspect of the college experience is students’ relationships with their instructors; positive ISRs may serve as a protective factor against other negative circumstances.
Classroom-Based ISRs in College
Scholars studying higher education recognize the importance of social relationships for students’ success (Kuh, 2001; Pascarella, 1980), but very few studies examine the association between classroom-based ISRs and student outcomes in college. The relationships between instructors and the students whom they formally teach clearly matter in K–12 classrooms (Gehlbach & Robinson, 2016), and some empirical studies found that ISRs become more predictive of academic and motivational outcomes as students grow older (e.g., Roorda et al., 2011), suggesting that ISRs may indeed remain influential into college.
Of course, the many differences between high school and college may cause ISRs to become less influential in college. University learning frequently limits valuable interpersonal interactions between faculty and students (e.g., large lecture classes, classes meeting only once a week, the use of teaching assistants). Instructors who teach many students each semester may reasonably devalue ISRs because they simply cannot fathom building meaningful relationships with each student in a 400-person lecture setting. Conversely, when instructor-student interactions occur only during lectures, students may not feel as though they have the opportunity to develop particularly close relationships with their instructors. Perhaps students would also rather turn to their peers for support than their instructors. Given the structural differences between college and K–12 classrooms, it is plausible that college ISRs hold comparatively less import when it comes to influencing students’ experiences. Our study addresses the currently open question of whether and to what extent ISRs predict college student outcomes.
Leveraging Similarity to Improve Relationships
Our investigation presses beyond correlational findings and explores the causal implications of improving ISRs on college student outcomes. To do so, we draw on the psychological principle of similarity as a vehicle for improving classroom relationships between instructors and their students.
Social psychologists find that social bonds strengthen when people perceive common ground with one another (Cialdini, 2009; Montoya, Horton, & Kirchner, 2008). We tend to like those who are like us—even seemingly trivial similarities can result in better relationships. For instance, groups whose members share the same initials work more effectively together (Polman, Pollmann, & Poehlman, 2013), and people are more likely to do favors for someone who shares their birthday (Burger, Messian, Patel, del Prado, & Anderson, 2004). Of course, more consequential similarities can affect relationships as well. People show a preference for others who share their personality traits (Byrne, Griffitt, & Stefaniak, 1967) and attitudes (Byrne, Bond, & Diamond, 1969).
So, how can similarity be leveraged to improve ISRs? In the original high school study, Gehlbach et al. (2016) focused on influencing instructors’ and students’ perceptions of what they had in common. That is, people often fail to realize certain preferences or opinions that they might share because the topics do not come up. By making some of these (potentially undiscovered) common preferences salient, people’s perceptions of similarity with one another may increase. Research suggests that making people think that they have commonalities with another person augments liking (Ames, 2004).
Therefore, when students and instructors perceive themselves to be similar to one another, that perception may translate into more positive views of their shared relationship. At this point, we know that positive ISRs are associated with important K–12 outcomes, such as student motivation and grades, but we do not know whether or how improvements in the ISR cause improvements in student outcomes.
Original Intervention and Current Study Goals
The original high school study increased perceptions of similarity by having instructors and students take a “get to know you” survey at the beginning of the school year. In this survey, instructors and students responded to parallel questions about their preferences and opinions about school and their personal lives. Instructor-student dyads in the treatment condition learned what they had in common with one another based on similar survey responses, while those in the control condition did not receive feedback about their commonalities. The effect of instructors and students seeing what they shared bolstered both parties’ perceptions of similarity and improved their ISRs (vs. the control group). What’s more, students earned higher grades when instructors learned what they had in common with them.
Given the promise of leveraging similarities to improve ISRs, we adapted a version of the intervention for a large public university. The study had four main goals. First, we wanted to learn whether the effects of this intervention generalize to a new population: college instructors and their undergraduate students.
Second, a successful replication of increasing perceptions of similarity between college instructor-student dyads would enable us to explore the impact on downstream outcomes. Our prespecified hypotheses focused on replicating results for the same outcomes as the original high school study (i.e., ISRs and student course grades) as well as extending the scope of the study to examine outcomes of particular relevance to the college population—college persistence (as measured by continued enrollment at the university) and semester-end exam performance.
Third, the intervention in the original high school study appeared to be more effective for traditionally disadvantaged minority students. First-generation students and students of color face more barriers to earning a college degree (Engle & Tinto, 2008; Shapiro et al., 2017) and therefore may experience greater benefits from an intervention that bolsters perceptions of the ISR and improves student outcomes. Through several exploratory analyses, we examined whether the intervention differentially improved outcomes for certain subgroups.
Finally, our exploratory analyses serve as one of the first investigations into whether ISRs predict an array of consequential student outcomes in college. The dearth of research on the association between ISRs and student outcomes at the college level contributes to classroom-based ISRs being treated as an afterthought rather than as a potential pathway for increasing college students’ success.
Methods
In line with the guidelines of Gehlbach and Robinson (2018), we preregistered our methods and analysis plan by submitting a statement of transparency to the Open Science Framework (see https://osf.io/emnj7/). Next, we conducted the study while updating the methods and analysis plan to reflect study details that changed after the intervention was conducted but before data were analyzed.
Intervention and Study Design
As in the original high school study, we laid the foundation for the intervention in this field experiment by giving a “get to know you” survey to college instructors and their students, using the Qualtrics survey platform. This survey asked multiple-choice questions about preferred instructor-student interactions, personal lives, and community involvement. Table 1 shows sample items from the survey.
Table 1. Sample “Get to Know You” Survey Items
Upon completion of these surveys, the platform automatically matched responses between students and instructors (i.e., their actual similarities). At that point, the survey platform randomly assigned instructor-student dyads to one of two conditions:
Control group: students learned about commonalities that they had with students from another part of the country, and instructors received no feedback about these control students. We explained this absence of feedback by telling instructors that we did not want to overwhelm them with reviewing information on too many students and therefore provided them with only a sample.
Treatment group: both parties within an instructor-student dyad learned that they shared seven commonalities.
The intervention itself consisted of web-based feedback on seven randomly selected similarities that instructors and students shared. For example, imagine that both members of an instructor-student dyad responded to “What do you do to de-stress?” with “Go for a walk/exercise.” If this particular similarity was randomly selected, the instructor and student would receive feedback highlighting this shared response: “No sweat. When you get stressed, you both like to go for a walk or get some exercise.” Finally, to help both parties internalize the similarities, instructors and students completed a few brief questions about their commonalities and received a couple of reminders about how these similarities might be leveraged later in the semester. For instance, a few weeks after receiving the initial feedback, instructors received an email with a reminder of the similarities that they shared with their students in the treatment condition, as well as students’ responses to a question asking which commonality was the most meaningful to share with that instructor.
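The matching-and-feedback step can be sketched in a few lines: given a dyad's multiple-choice answers, find the shared responses and randomly pick up to seven to display. The question keys and function names below are hypothetical (the study implemented this on the Qualtrics platform, not in this code):

```python
import random

def shared_commonalities(instructor_answers, student_answers):
    """Return the questions on which an instructor-student dyad gave
    the same multiple-choice answer (their actual similarities)."""
    return {
        question: answer
        for question, answer in student_answers.items()
        if instructor_answers.get(question) == answer
    }

def select_feedback(instructor_answers, student_answers, k=7, seed=None):
    """Randomly select up to k shared responses to highlight in the
    feedback forms, mirroring the seven commonalities shown to
    treatment dyads."""
    matches = shared_commonalities(instructor_answers, student_answers)
    rng = random.Random(seed)
    chosen = rng.sample(sorted(matches), min(k, len(matches)))
    return {question: matches[question] for question in chosen}
```

When a dyad shares fewer than seven responses, the sketch simply shows all of them; how the actual platform handled that case is not described here.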
Participants
We conducted the study at a large public university in California. We focused on faculty members who taught during the 2017 spring semester and their undergraduate students. The research team offered participating faculty a $150 Amazon gift card for completing all aspects of the study. Faculty members who started participating but did not complete all aspects of the study received a $25 gift card. There were no incentives for students to participate, but we did provide participating students with a pen as a token of our appreciation at the end of the semester.
Prior to the start of the semester, our initial sample consisted of 167 faculty members who requested more information on participating in the study with their enrolled students. The final sample for our main intent-to-treat analyses included 120 instructors and their 2,749 students. Of the participating eligible students, 2,112 entered the final survey, and 2,065 completed the final survey (75.1%). The number of students completing the final survey was balanced across conditions: control, 74.4%; treatment, 75.9%; χ2(1) = 0.882, p = .348. See the online Supplemental Materials for details on how and when attrition occurred to arrive at our final sample.
Instructors
Based on institutional records (and self-report when institutional records were unavailable), the final instructor sample was 78% female and 22% male, as well as 56% White, 16% Asian, 16% Hispanic/Latinx, 5% Black or African American, and 7% selected multiple categories or other. Nine percent of the instructors had been promoted to full professors, while 14% and 53% were associate and assistant professors, respectively. The remaining 23% were hired as lecturers. The average age of instructors was 44.6 years (SD = 10.9). The mean number of years that instructors reported teaching was 12.1 overall (SD = 9.4) and 7.7 at the present university (SD = 8.2).
Students
The final student sample was 60% female and 40% male, as well as 21% White, 53% Hispanic/Latinx, 11% Asian, 5% Black, and 10% selected multiple categories or other. The average age of students was 22.4 years (SD = 5.1). Eight percent were first-year students, followed by 16% sophomores, 18% juniors, 59% seniors. Over 43% of students were the first generation in the family to attend college, and 41% of the sample had transferred from another institution. According to U.S. News & World Report and this university’s promotional materials, our sample is slightly more female than the overall undergraduate composition (54%) and contains more Hispanic/Latinx students (40%).
Measures
In addition to the measures described here, we coordinated with the university’s Office of Institutional Research to collect data on student and faculty demographics, course information, course grades, and other measures of student academic achievement. The online Supplemental Materials provide details on individual measures.
Similarity
We assessed students’ perceptions of their similarity to their instructors through a six-item scale (Gehlbach et al., 2016) immediately after the intervention on the initial survey (α = .90) and then again on the final student survey (α = .91). The scale includes items such as “How similar do you think your personality is compared to [instructor’s name]?” The immediate and end-of-semester student similarity scales were moderately correlated, r = .55.
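The reliabilities reported for these scales (e.g., α = .90) are Cronbach's alpha values, which can be computed directly from item responses. A minimal sketch, using the standard formula (illustrative function, not the study's analysis code):

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a multi-item scale.

    `responses` is a list of rows, one per respondent, each row holding
    that respondent's answers to the k items of the scale.
    alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(totals))
    """
    k = len(responses[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Variance of respondents' total scale scores.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)
```

Perfectly correlated items yield α = 1, and alpha falls as item responses become less consistent with one another.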
To ensure that we did not overburden instructors with too many survey items (since they had to report on individual students), we assessed their perceptions of similarity with their students via a single item on the final survey.
Instructor-student relationships
We borrowed, with minor adaptations, measures from Brinkworth and colleagues (2018). To measure perceptions of ISRs, students evaluated their overall relationship with their instructors using a seven-item scale (e.g., “How much do you enjoy learning from [instructor’s name]?”; α = .88). To have a corresponding premeasure, students completed an anticipated ISR measure that asked parallel items (e.g., “How much do you think you will enjoy learning from [instructor’s name]?”; α = .91). The two scales were moderately correlated, r = .55.
Instructors completed the full seven-item instructor version of the ISR scale on the final survey (α = .93). Again, in our attempt to minimize the burden on instructors, we asked no preliminary measures of them.
Course grade
We obtained final course grades from the university’s Office of Institutional Research. The university provides letter grades, which we transformed to a standard 4-point grade point average (GPA; A = 4.0, A– = 3.67, B+ = 3.33, etc.). Variations of incomplete, withdrawn, or failing marks were considered a 0.
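The letter-to-points conversion described above amounts to a simple lookup. The sketch below follows the standard 4-point scale the text cites (A = 4.0, A– = 3.67, B+ = 3.33, etc.); the exact set of marks this university awards is an assumption on our part:

```python
# Hypothetical letter-grade-to-points table on the standard 4-point scale.
GRADE_POINTS = {
    "A+": 4.0, "A": 4.0, "A-": 3.67,
    "B+": 3.33, "B": 3.0, "B-": 2.67,
    "C+": 2.33, "C": 2.0, "C-": 1.67,
    "D+": 1.33, "D": 1.0, "D-": 0.67,
}

def grade_points(mark):
    """Convert a letter grade to GPA points. Any mark not in the table
    (incomplete, withdrawn, or failing variants) counts as 0, as in
    the study's coding."""
    return GRADE_POINTS.get(mark.strip().upper(), 0.0)
```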
Course final exam
For outcomes pertaining to final exams, our analyses focused on objectively graded final exams to ensure that any improvements in student outcomes could not be attributed to bias in instructors’ grading. For instance, instructors who perceive a more positive relationship with a student may engage with that student more effectively, leading to greater student learning and motivation. However, it is also plausible that liking a student more leads instructors to elevate that student’s grade without any additional learning. To differentiate between these potential mechanisms, we collected objectively graded final exam scores (e.g., graded by Scantron) to evaluate whether improvements in ISRs affect student academic outcomes through increased learning rather than instructor bias, which could artificially inflate end-of-course grades.
Instructors reported on students’ final exam grades in the final teacher survey. Ninety-two instructors reported grading the final exam, paper, or project themselves (n = 1,855 participating students in the instructors’ courses, n = 1,686 participating students for whom a final exam grade was provided); 1 instructor reported that a teaching assistant graded the final (n = 63, n = 62); 19 instructors reported that their finals were graded automatically (n = 674, n = 631); 7 instructors reported giving no final (n = 157); and 1 instructor did not respond to the question (n = 14).
College persistence
The Office of Institutional Research provided us with participating students’ enrollment status for fall 2017 and spring 2018. We do not include students who graduated after the spring 2017 semester in our persistence analysis (n = 559).
Control variables
For analyses exploring student academic outcomes, we controlled for student gender (because females tend to earn higher course grades than do males; Duckworth & Seligman, 2006) and prior academic performance (i.e., their cumulative GPA at the end of the fall 2016 semester).
Exploratory measures
We collected a broad array of additional measures—through the student survey, the instructor survey, and school records—for conducting exploratory analyses. Our exploratory analyses use student demographic information as well as course enrollment. We list all measures in Table 2.
Table 2. Comprehensive List of Variables Collected in the Present Study
Note. The bold variables were used to test the focal a priori hypotheses and key exploratory hypotheses for this study. The nonbold variables were used for exploratory analyses. GPA = grade point average.
This “get to know you” survey asks instructors and students multiple-choice questions about their preferred school-related interactions and personal lives. These items are not measures per se, nor are they involved in testing any hypotheses. Instead, they are the basis for the intervention (the feedback forms).
These four questions ask instructors and students to reflect on their commonalities. The goal of these items is to help participants internalize the commonalities that they share. They are not measures per se and are not involved in testing any primary hypotheses.
Procedures
First, the research team emailed all university instructors an invitation to participate in the study during the fall 2016 semester. Interested instructors took the initial “get to know you” survey. Shortly after students started the spring semester (beginning January 2017), instructors introduced the study to one of their classes and asked those students who were interested in participating to complete the first survey, where their instructors’ responses were preprogrammed into the survey.
During this initial survey and after consenting to participate, students completed the anticipated ISR measures, then the “get to know you” survey. The crux of the intervention occurred after the “get to know you” survey, where students immediately received feedback on their similarities with their instructor (the treatment group) or responses to some of the same items from students at another school (the control group). To internalize these similarities, students responded to a series of brief prompts. Finally, students completed the perceived similarity measures and the demographic items. Students completed these surveys by mid-February.
After the completion of these surveys, instructors received parallel feedback and responded to prompts similar to those completed by students in the treatment group. As with the students’ feedback forms, we presented instructors with seven similarities per student. Instructors who completed these forms did so by mid-March. Afterward, we reinforced the treatment through a series of email reminders to help students and instructors remember their similarities and think about how they could use those commonalities to strengthen their social connections to each other.
During the final weeks of the semester and beyond, students and instructors took a final survey. Students answered questions about their experiences in the target instructor’s class, while instructors answered questions about each student in the target course. The research team collected school record data after the spring 2017 semester and again after the spring 2018 semester.
Prespecified Hypotheses
The immediate goal of the intervention was to bolster perceptions of similarity between instructors and students. In turn, we expected that these shifted perceptions would enhance the relationships between instructors and students and lead to beneficial downstream outcomes for students. In particular, we anticipated that students’ academic performance (as measured by course grades and final exams) would improve when ISRs were strengthened. Furthermore, we anticipated that the more positive social and academic experiences in this course would lead to greater persistence at the university. The preliminary hypotheses correspond to the findings that we attempted to replicate from the original high school study, while the primary hypotheses reflect the new, college-relevant outcomes that we anticipated would be improved by this intervention. We tested the following prespecified hypotheses accordingly (see the statement of transparency and online Supplemental Materials for details).
Preliminary Hypothesis 1: Similarity
1a: In the treatment group, students will report a greater sense of similarity to their instructor as compared with their control group counterparts. This finding will emerge immediately postintervention and in the final survey.
1b: Instructors will report a greater sense of similarity to students in the treatment group as compared with instructors’ reports of students in the control group.
Preliminary Hypothesis 2: ISR
2a: Treatment group students will report perceiving a more positive ISR (controlling for students’ preintervention anticipated ISR) as compared with students in the control group.
2b: Instructors will report perceiving a more positive ISR with students in the treatment group at the end of the semester as compared with those in the control group.
Preliminary Hypothesis 3: Course grades
Students in the treatment group will earn higher grades in the focal course as compared with students in the control group (controlling for prior year GPA and gender).
Primary Hypothesis 1: Academic achievement
Among students who take courses with an objective final exam (e.g., multiple choice), those in the treatment group will earn higher grades on their final exam as compared with students in the control group (controlling for prior year GPA and gender).
Primary Hypothesis 2: Persistence
Students in the treatment group will reenroll at the university at higher rates (in the fall 2017 semester) than students in the control group (controlling for prior year GPA and gender).
Analytic Details
Exclusion criteria
We excluded responses from the participants who showed evidence of speeding through the survey without putting thought into it (e.g., Barge & Gehlbach, 2012; see online Supplemental Materials). This resulted in the exclusion of 84 students from the end-of-semester perceptions of similarity and ISR analyses and 164 instructors’ reports on their perceptions of similarity with students and the ISR.
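Inattentive responding of this kind is commonly screened with simple heuristics such as implausibly short completion times or identical answers to every item of a scale (straight-line responding, which the descriptive-statistics note below also mentions). The thresholds and function names here are illustrative, not the study's actual criteria:

```python
def is_straightliner(scale_responses):
    """Flag a respondent who gave the identical answer to every item
    of a multi-item scale, one common sign of responding without
    thought (cf. Barge & Gehlbach, 2012)."""
    return len(set(scale_responses)) == 1

def flag_speeders(durations_seconds, min_seconds):
    """Flag respondents whose total survey time falls below a minimum
    plausible duration (the threshold is an assumption for
    illustration)."""
    return [t < min_seconds for t in durations_seconds]
```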
Preregistered analysis plan
To examine the differences between our treatment and control groups, we took an intent-to-treat approach by including all remaining participants in our primary analysis. When analyzing course grades, we standardized students’ grades within courses to account for potential between-classroom differences in grade distributions. For our continuous outcomes, we accounted for the nonindependence of residuals within instructors by adjusting standard errors for clustering at the instructor level. In line with Cumming’s (2014) recommendation, we evaluated our hypotheses by presenting and discussing 95% confidence intervals and effect sizes (not by reporting p values).
Our basic model for hypotheses with continuous outcomes is

y_i = β_0 + β_1(treatment_i) + ε_i,

where treatment_i is an indicator that student i was exposed to the similarity intervention (and the instructor was exposed to the intervention for student i), and ε_i is a student-level residual.

Our model for ordinal outcomes (e.g., the instructors’ reports of similarity to their students) is a cumulative logit model,

logit[P(y_i ≥ k)] = β_0k + β_1(treatment_i),

where everything is as before, except that k indexes the levels of the outcome and runs from 2 to the number of categories in the outcome measure. Note that in addition to testing the model with ordinal outcomes, we test a model with binary outcomes (e.g., for the dependent measure of persistence). In this case, we adapt this model to a standard logistic regression model. The online Supplemental Materials provide details on the preregistered analytic plan.
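As a rough illustration of two pieces of this plan (within-course standardization of grades and the intent-to-treat contrast), here is a minimal Python sketch. The data and function names are ours, and cluster-adjusted standard errors and covariates are omitted for brevity; with a single binary treatment indicator and no covariates, the OLS coefficient β_1 reduces to a difference in group means, which is what `itt_effect` computes:

```python
from statistics import mean, stdev

def standardize_within_course(grades_by_course):
    """Z-score grades within each course so that between-classroom
    differences in grade distributions are removed (mean 0, SD 1
    within every course). Assumes each course has at least two
    students and nonconstant grades."""
    standardized = {}
    for course, grades in grades_by_course.items():
        m, s = mean(grades), stdev(grades)
        standardized[course] = [(g - m) / s for g in grades]
    return standardized

def itt_effect(outcomes, treated):
    """Intent-to-treat estimate of the treatment effect: the
    difference in mean outcomes between the treatment and control
    groups (equal to the OLS slope on a binary treatment indicator)."""
    treat = [y for y, d in zip(outcomes, treated) if d]
    control = [y for y, d in zip(outcomes, treated) if not d]
    return mean(treat) - mean(control)
```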
Exploratory analysis
In addition to evaluating the efficacy of our intervention, we conducted several exploratory analyses to provide further insights into the intervention and postsecondary ISRs. First, we conducted heterogeneity analyses to investigate whether the treatment was more effective for first-generation and/or traditionally disadvantaged minority students. We also examined whether the treatment effect differed by class size. Second, we took advantage of our rich data set to examine the association between college-level ISRs and students’ academic outcomes. Third, we explored whether instructor and student perceptions of the ISR differed for traditionally at-risk college students. Finally, we assessed whether the theory of change underlying our intervention was tenable: do greater perceptions of similarity between instructors and students correspond with more positive ISRs, and in turn, do more positive ISRs result in more positive academic outcomes for students?
Results
Baseline Equivalence and Descriptive Statistics
We checked to ensure the treatment and control groups were balanced across student-level covariates (i.e., age, gender, race, grade level, transfer status, parent education level, Pell grant status, first-generation status, and pretreatment GPA). The covariates in the model did not jointly predict treatment assignment, likelihood ratio χ2(8, n = 2,468) = 5.05, p = .75, indicating that random assignment worked. In the final sample, 1,388 students were assigned to the control group and 1,361 to the treatment group, χ2(1) = 0.80, p = .37.
In Table 3, we present descriptive statistics for our main variables of interest: students’ and instructors’ perceptions of similarity and the ISR, as well as student grades. The average class size was 36.4 students (SD = 27.7). The smallest class enrolled six students and the largest class, 224 students. The average number of students participating in the study per class was 28.4 (SD = 19).
Descriptive Statistics and Correlation Matrix for Outcomes
Note. We report the course grade variable in two ways: unstandardized (where we report the grade that students received on a 4-point grade point average scale) and standardized (where we standardized students’ grades within courses to have a mean of 0 and a standard deviation of 1 to account for potential between-classroom differences in grade distributions). We excluded reports where students and/or instructors engaged in straight-line responding. The number of students who completed the beginning-of-semester similarity and anticipated ISR scales differs because the similarity scale came after students were assigned to a condition and some students had already dropped out at that point. ISR = instructor-student relationship.
*p < .05. **p < .01. ***p < .001.
Prespecified hypotheses
Table 4 presents the results for the impact of the treatment on the prespecified outcomes. Note that none of the results meaningfully change when a classroom fixed effect is added and/or class size is controlled for (for details, see Supplemental Tables S1, A–H, online).
Average Treatment Effect on Prespecified Outcomes
Note. Robust confidence intervals in brackets. Model 3 controls for students’ anticipated perceptions of the ISR. Columns 6–8 control for student gender and prior cumulative college GPA. All models use robust standard errors clustered at the classroom level. GPA = grade point average; ISR = instructor-student relationship; OLS = ordinary least squares.
Of the 631 students for whom instructors provided final exam grades, an additional 26 students did not have prior GPA data. The results do not meaningfully change when these 26 students are included (without controlling for prior-year GPA).
†p < .1. **p < .01.
First, we explored whether the treatment caused students and instructors to report feeling more similar to one another (Preliminary Hypothesis 1). Figure 1 illustrates that the treatment made students feel more similar to instructors immediately postintervention, B = 0.16, d = .22, and at the end of the semester, B = 0.11, d = .15. Instructors, however, did not report feeling more similar to students in the treatment group at the end of the semester.

Student perceptions of similarity with instructor by condition. Error bars represent 95% confidence intervals.
Next, we explored the impact of the treatment on student and instructor perceptions of the ISR (Preliminary Hypothesis 2) and course grades (Preliminary Hypothesis 3). We found no evidence that students and instructors in the treatment group perceived more positive ISRs than those in the control group, B = 0.01 and B = −0.01, respectively. Similarly, we failed to find evidence for our hypothesis that students assigned to the treatment group would earn higher grades than those in the control group, B = 0.06. Students assigned to the treatment group scored 0.06 SD higher on course grades than students assigned to the control group, but the confidence interval includes zero, so this difference is consistent with chance.
Given that the treatment had no impact on perceptions of the ISR, we were not surprised to find no evidence for our two primary hypotheses. Students in the treatment group did not score higher on their objectively graded final exams than students in the control group, B = 0.004 (on a 4-point GPA scale), nor did they persist at a greater rate to the fall 2017 semester (a nonsignificant 0.3-percentage-point difference in persistence rates).
Exploratory analyses
Although the intervention did not increase student and instructor perceptions of the ISR, we conducted several exploratory analyses that we hoped would inform future iterations of the intervention and our understanding of ISRs at the college level more generally.
Heterogeneity in the treatment effect
First, given the results from the original study, we investigated whether the intervention differentially affected first-generation and traditionally disadvantaged minority students. For these interaction analyses, we included classroom fixed effects (see Supplemental Tables S2 and S3 online). The treatment appeared to have a larger impact on first-generation students’ perceptions of similarity with their instructor than on their continuing-generation peers’ immediately postintervention, B = 0.15, SE = 0.07, 95% CI [0.01, 0.28], n = 2,653, but the interaction effect attenuated by the end of the semester, B = 0.09, SE = 0.06, 95% CI [−0.03, 0.21], n = 2,019. Beyond this, the treatment did not differentially influence any of our other main outcomes for first-generation students. Traditionally disadvantaged minority students did not feel more similar to their instructors than White students did immediately after the treatment. The difference in treatment effects on end-of-semester perceptions of similarity, B = 0.13, SE = 0.07, 95% CI [−0.003, 0.26], n = 1,586, and ISRs, B = 0.11, SE = 0.06, 95% CI [−0.02, 0.24], n = 1,586, between Black and Hispanic/Latinx students and their White counterparts was marginally significant, but given the numerous tests, these findings are likely due to chance. We found no evidence that the treatment had a greater impact on any other outcomes for Black and Hispanic/Latinx students, including perceptions of similarity.
One other potential source of heterogeneity might stem from college instructors teaching classes of different sizes (i.e., small seminars vs. large lectures), so we examined whether the number of enrolled students in a course had any impact on the treatment effect (see Supplemental Table S4 online). We found some suggestive evidence that the more students enrolled in a course, the lower their perceptions of similarity with their instructor at the end of the semester, B = −0.001, SE = 0.0004, 95% CI [−0.002, 0], n = 2,019, and the worse they perceived their ISRs, B = −0.001, SE = 0.0005, 95% CI [−0.002, 0], n = 2,019. Each additional 10 students enrolled in a course was associated with a 0.01-SD decrease in the treatment effect on students’ perceptions of similarity with their instructor and a 0.02-SD decrease in the treatment effect on students’ perceptions of the ISR. Course size also appeared to moderate the treatment’s effect on instructors’ perceptions of similarity with students, B = −0.002, SE = 0.001, 95% CI [−0.004, 0], n = 2,382, and their relationships with students, B = −0.002, SE = 0.0006, 95% CI [−0.003, −0.0004], n = 2,394. Every additional 10 students enrolled was associated with a 0.02-SD decrease in the treatment effect on instructors’ perceptions of the ISR. In terms of academic outcomes, every additional 10 students enrolled in the course was associated with a 0.03-GPA-point decrease in the treatment effect on students’ objectively graded final exams, B = −0.003, SE = 0.001, 95% CI [−0.005, 0], n = 605.
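A moderation analysis of this form — a treatment-by-class-size interaction with classroom fixed effects — can be sketched as below. Because class size is constant within a classroom, its main effect is absorbed by the fixed effects, leaving the interaction as the quantity of interest. The data and coefficients here are simulated and hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 1500
df = pd.DataFrame({
    "classroom": rng.integers(0, 40, n),
    "treatment": rng.integers(0, 2, n),
})
class_sizes = rng.integers(6, 225, 40)  # hypothetical enrollments, 6 to 224
df["class_size"] = class_sizes[df["classroom"]]

# Simulated outcome: a small treatment effect that shrinks as class size grows.
df["similarity"] = (
    0.15 * df["treatment"]
    - 0.0005 * df["treatment"] * df["class_size"]
    + rng.normal(0, 1, n)
)

# Treatment x class-size interaction with classroom fixed effects (C(classroom));
# the class_size main effect is omitted because it is collinear with the fixed effects.
m = smf.ols(
    "similarity ~ treatment + treatment:class_size + C(classroom)", data=df
).fit(cov_type="cluster", cov_kwds={"groups": df["classroom"]})
print(m.params["treatment:class_size"])
```

A negative interaction coefficient would correspond to the pattern reported above: the treatment effect weakens as enrollment grows.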
Evaluating the importance of ISRs at the college level
The motivation for designing an intervention that aimed to improve ISRs was largely based on the extensive evidence showing that relationships between instructors and students matter in K–12 classrooms (e.g., Roorda et al., 2011). While our intervention did not improve college ISRs, it did provide an opportunity to address the open question of whether student and instructor perceptions of the classroom ISR are associated with student academic performance. In the following analyses, we use ordinary least squares regression models with classroom fixed effects and control for student gender and prior achievement.
We found that students’ and instructors’ perceptions of the ISR both predicted student course grades when controlling for student gender and prior achievement. A student who perceived a “quite” positive ISR earned a grade 0.22 SD higher than a student who reported a “somewhat” positive ISR, B = 0.22, SE = 0.03, 95% CI [0.16, 0.29], n = 1,866. Correspondingly, students with whom instructors perceived a “quite” positive ISR earned a grade half a standard deviation higher than those whose instructors reported a “somewhat” positive ISR, B = 0.54, SE = 0.05, 95% CI [0.44, 0.64], n = 2,190. When student and instructor perceptions of the ISR were accounted for simultaneously, instructor perceptions predicted student grades more robustly, B = 0.48, SE = 0.06, 95% CI [0.37, 0.59], than student perceptions, B = 0.07, SE = 0.03, 95% CI [0.01, 0.14], n = 1,698.
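Entering both parties’ perceptions in one fixed-effects model, as in the final comparison above, can be sketched as follows. The data-generating process is simulated to mirror the reported pattern (instructor perceptions dominating); all names and magnitudes are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 1700
df = pd.DataFrame({
    "classroom": rng.integers(0, 30, n),
    "female": rng.integers(0, 2, n),
    "prior_gpa": rng.normal(3.0, 0.5, n),
    "student_isr": rng.normal(0, 1, n),
})
# Instructor perceptions correlate with student perceptions, as in the paper.
df["instructor_isr"] = 0.3 * df["student_isr"] + rng.normal(0, 1, n)

# Simulated standardized grade: instructor perceptions carry most of the weight.
df["grade_z"] = (
    0.07 * df["student_isr"] + 0.48 * df["instructor_isr"]
    + 0.2 * df["prior_gpa"] + rng.normal(0, 1, n)
)

# Both perceptions in one model, with classroom fixed effects and the
# paper's controls (gender, prior achievement).
m = smf.ols(
    "grade_z ~ student_isr + instructor_isr + female + prior_gpa + C(classroom)",
    data=df,
).fit()
print(m.params[["student_isr", "instructor_isr"]])
```

Because the two perception measures are only moderately correlated, the model can apportion credit between them, which is what allows the instructor coefficient to emerge as the stronger predictor.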
Turning to the objectively graded final exams, we again see that student and instructor perceptions of the ISR both predict student final exam grades. A one-unit increase in student and instructor perceptions of the ISR is associated with a 0.22-GPA-unit, SE = 0.09, 95% CI [0.04, 0.41], n = 437, and a 0.58-GPA-unit, SE = 0.16, 95% CI [0.25, 0.91], n = 556, increase in final exam grades, respectively. Including both parties’ perceptions in the same model shows that instructor perceptions (but not student perceptions) of the ISR were associated with student final exam grades, B = 0.50, SE = 0.16, 95% CI [0.15, 0.84], n = 413.
Finally, the more positive that instructors perceive their relationship with a student, the more likely that the student is to persist in college (as measured by being enrolled in the subsequent semester), Blogit = 0.43, SE = 0.16, 95% CI [0.12, 0.75], n = 1,267. Students’ perceptions of their relationship with the target instructor did not predict reenrollment the next semester.
Exploring how ISRs function between student subgroups
In addition to investigating the association between ISRs and student outcomes, the data from our experiment provided an opportunity to learn more about how different groups of students experience classroom-based ISRs. Specifically, we explored whether there is a gap in instructor and student perceptions of the ISR between first- and continuing-generation students, as well as between traditionally disadvantaged minority students and their White counterparts. In these analyses, we include classroom fixed effects and control for prior achievement (see Supplemental Tables S5, A–C, and S6, A–C, online).
First-generation students did not anticipate having worse ISRs than their continuing-generation peers at the beginning of the semester, B = 0.02, SE = 0.03, 95% CI [−0.03, 0.07], n = 2,571, nor did they perceive lower quality ISRs at the end of the semester, B = 0.02, SE = 0.03, 95% CI [−0.05, 0.08], n = 2,023. Instructors, however, reported having less positive ISRs with first-generation students than students who had at least one parent who attended college, B = −0.11, SE = 0.03, 95% CI [−0.17, −0.05], n = 2,456, d = .13.
Similarly, there was no difference between Black and Hispanic/Latinx students and their White counterparts’ perceptions of the ISR at either time point—anticipated ISR, B = −0.05, SE = 0.03, 95% CI [−0.11, 0.01], n = 2,018; end of semester, B = −0.01, SE = 0.04, 95% CI [−0.10, 0.07], n = 1,592—but instructors perceived less positive ISRs with their Black and Hispanic/Latinx students than their White students, B = −0.14, SE = 0.04, 95% CI [−0.21, −0.07], n = 1,924, d = .22.
To test whether this finding was a function of instructors’ preference for high-achieving students, we included students’ final grade in the target course as a covariate. While students’ perceptions of the ISR remained the same, we found that the difference in instructors’ perceptions of the ISR was halved for first-generation students (B = −0.05, vs. continuing-generation students) and students of color (B = −0.07, vs. White students).
Assessing the intervention theory of change
Finally, we turned our focus to assessing whether the logic for our intervention was sensible. As expected, student and instructor perceptions of similarity at the end of the semester predict their respective perceptions of the ISR, B = 0.75, SE = 0.02, 95% CI [0.70, 0.80], n = 2,022, and B = 0.54, SE = 0.03, 95% CI [0.48, 0.61], n = 2,387. The prior sections demonstrate that perceptions of the ISR are highly predictive of consequential student outcomes in college. The more positive that students and instructors perceived the ISR, the higher their course grades, B = 0.22 and 0.54, respectively, and objectively graded final exams, B = 0.22 and 0.58, respectively. These associations suggest that the premise on which we designed the intervention was sound, even if the intervention did not ultimately affect the targeted outcomes. We now discuss potential reasons why the intervention failed.
Discussion
National initiatives, such as College Signing Day, encourage more and more students to enroll in college. But without a consideration of the factors that support student success during college, students will continue to drop out before earning a postsecondary degree (Thayer, 2000). Given the robust research on teacher-student relationships at the K–12 level, we targeted classroom-based ISRs as a potential buffer against academic failure and attrition in college. Specifically, we attempted to replicate, in a college setting, a similarity intervention that improved high school students’ and their teachers’ perceptions of their relationships as well as students’ grades. While the brief beginning-of-the-semester intervention increased students’ perceptions of similarity with their instructors in the short term, the effect weakened by the end of the semester and failed to produce meaningful long-term changes in their perceptions of the ISR. Likewise, instructors did not report increased perceptions of similarity with students in the treatment condition after a few months. Given that the treatment effect did not persist through the semester to improve perceptions of the ISR, it is unsurprising that there were no downstream effects on student course grades, final exam grades, or persistence to the next semester.
A secondary goal of our study was to examine whether the intervention differentially improved outcomes for first-generation and traditionally disadvantaged students of color, who tend to experience worse postsecondary outcomes (Engle & Tinto, 2008; Shapiro et al., 2017). Some of our findings suggest that the “get to know you” survey was more effective at increasing perceptions of similarity for first-generation students (vs. continuing-generation students) and similarity and ISRs for Black and Hispanic/Latinx students (vs. White students), but the effects were still too weak to differentially influence ensuing student academic outcomes.
While our intervention did not produce the outcomes that we expected, our research design allowed us to provide some of the first empirical evidence that the quality of ISRs in the classroom matters for student academic outcomes at the college level. Until this point, the bulk of the research on classroom ISRs focused on elementary and secondary teachers and their students and demonstrated strong correlations between ISRs and many desirable student outcomes (Roorda et al., 2011; Sabol & Pianta, 2012; Wubbels & Brekelmans, 2005). It was an open question whether ISRs function similarly in college. In our exploratory analyses, we found that student and instructor perceptions of the ISR are robustly associated with student course grades and objectively graded student final exam scores in the instructor’s class. Instructor perceptions of the ISR also predict whether students will reenroll the following semester. Overall, instructor perceptions of the ISR appear to be most consequential when it comes to student academic outcomes: the more positively instructors view their relationship with a student, the more likely that student is to earn higher grades, score better on exams, and stay enrolled in college. This pattern appears consistent across schooling levels, as research at the middle and high school levels also shows that instructor perceptions of the ISR, as opposed to student perceptions, predict student grades (Brinkworth et al., 2018).
So, despite the intervention’s lack of efficacy, the theory behind our pathway for change seems sound. Perceptions of similarity strongly correlate with ISRs, and ISRs correlate with student academic outcomes. But while students did perceive themselves to be more similar to their instructors immediately posttreatment and the effect lingered to the end of the semester, the intervention appeared to be too weak to meaningfully shift beliefs over a semester-long course. There are several possible reasons why asking students and instructors to reflect for a few minutes at the beginning of the semester on what they have in common did not lead to long-lasting changes in the ISR. The randomly selected items on which students and instructors matched in the “get to know you” survey may have been too trivial or not self-relevant enough to influence downstream relationships. It is also possible that the beginning of the semester is too hectic a time for instructors and students to build upon the newfound commonalities toward real-world connections. Or, perhaps, the early timing of the intervention meant that instructors received information on what they had in common with a particular student (e.g., you and Maria both meditate to de-stress) before they even knew who that student was, let alone the names of all their students, making the similarity information easy to forget.
These explanations would suggest that strengthening the treatment may increase its effectiveness, but the concern is that any effort to strengthen the intervention may make it too burdensome for instructors to complete with an entire class. We found suggestive evidence that the current treatment became less effective at improving student and instructor perceptions of the ISR as the number of students enrolled in the class increased, highlighting the difficulty of administering an intervention at the instructor-student level in large courses. Therefore, boosting the intensity of the intervention could dissuade instructors who teach lecture classes with enrollments in the triple digits from participating. The threat of more work may deter instructors of smaller, seminar classes as well. Perhaps more targeted efforts might focus on groups of students who would benefit from a relationship-building intervention like this more than others (e.g., Cohen, Garcia, Apfel, & Master, 2006; Walton & Cohen, 2011).
These data offered insights into which subgroups of students may most warrant focus. Specifically, we were able to explore how first-generation and traditionally disadvantaged minority students experience classroom ISRs. Analogous to the racial achievement gap, we found evidence of an emerging relationship gap, although it manifests in a different way. Overall, there are no meaningful differences between how first-generation and continuing-generation students perceive the quality of their ISRs. The same is true of Black and Hispanic/Latinx students’ perceptions of the ISR as compared with White students’; on average, all groups of students perceived essentially the same quality of relationship with their instructor. Differences in ISR quality did arise, however, when we examined how instructors view their relationships with these subgroups of students. Instructors reported having less positive ISRs with students of color and first-generation students than with their White and continuing-generation peers, even when controlling for prior achievement. This gap between instructors’ and students’ perceptions of the ISR is particularly concerning given that it is principally the instructors’ views that correlate with students’ academic outcomes. Therefore, if a more potent version of the treatment is administered, it may be most beneficial to target instructors and their students from less privileged backgrounds.
While the gap in relationship perceptions persisted even when we accounted for the final course grades that students received from the instructor, the effect was cut in half. One potential explanation may stem from the demands on college instructors’ time and the structure of the courses. In college courses, where ISRs are harder to develop for so many students with limited time, instructors may (consciously or subconsciously) rely on relationship proxies, such as students’ class performance, to report on their relationship with students whom they do not know as well. In other words, when instructors are asked to report on the quality of their relationship with students whom they do not know well—or at all—the students’ grades in the course may be the only information that they have on them, which may result in conflating academic performance with relationship quality.
Of course, our exploratory correlational analyses raise the issue of causality. A major objective of our experiment was to determine whether increasing the quality of ISRs caused changes in student learning outcomes. But because we failed to improve the quality of the ISR, the question of causality remains. We still do not know why perceptions of the ISR positively correlate with student outcomes. The field needs more experimental research that evaluates whether improvements in ISRs result in improvements in downstream student outcomes.
Conclusion
Our exploratory analyses provide some of the first empirical evidence that classroom ISRs remain important in college. While there are practical differences between college and K–12 classrooms, it appears that teaching and learning are social acts no matter the level: the quality of ISRs is positively associated with desirable academic outcomes for students. Amid these findings, a troubling trend emerged: there is a gap between instructors’ and students’ perceptions of their mutual relationships. Students perceive uniformly positive relationships with their instructors, while instructors perceive less positive relationships with first-generation and Black and Hispanic/Latinx students than with continuing-generation and White students, respectively. This gap in perceptions of the ISR is partially, although not completely, explained by instructors reporting more positive relationships with high-achieving students.
We were unable to replicate a similarity intervention to improve classroom ISRs and student academic outcomes at the college level. However, our study does yield important evidence that deepens our understanding of ISRs. For instance, instructors and students who feel more similar to one another report better ISRs, suggesting that interventions that more strongly enhance perceptions of similarity may still be a promising vehicle for improving classroom relationships between instructors and their students.
These findings raise a number of issues for future research to pursue, and we focus on two that are particularly consequential. First, to fully comprehend classroom ISRs, we need to measure both instructor and student perceptions of the relationship (Brinkworth et al., 2018). Second, to understand the underlying mechanisms motivating the relationship gap, we need to determine the causal association between ISRs and outcomes more broadly. To do so, research on ISRs at all levels of schooling must press beyond correlational investigations to experimentally test whether and how ISRs affect student outcomes.
Across the academic life span, learning remains fundamentally a social act between instructors and students. From elementary school through college, the consistent link between ISRs and desirable student outcomes suggests that classroom relationships may fulfill a fundamental social need to connect that inspires learning and engagement. With all the focus on student outcomes, educators, researchers, and policy makers should not ignore the power of positive classroom relationships between instructors and students.
Supplemental Material
Supplemental material, DS_10.1177_2332858419839707 for Taking It to the Next Level: A Field Experiment to Improve Instructor-Student Relationships in College by Carly D. Robinson, Whitney Scott and Michael A. Gottfried in AERA Open
Acknowledgements
This research was made possible by a generous grant from the Laura and John Arnold Foundation. The views expressed are those of the authors and do not necessarily reflect those of the funder. We would also like to thank key personnel at the participating university; we could not have done it without you. Finally, we owe Hunter Gehlbach a debt of gratitude for his guidance and continued support on this paper.