Abstract
The built environment communicates value and belief structures to users. Research on gendered messaging in designed classroom spaces has shown its impact on students; thus, we sought to determine how gendered perceptions of classroom designs differ between those who use the space and those working in the design industry. In two studies, we collected survey data from undergraduates (n = 97) and then from employees at design firms (n = 88) reacting to masculine versus feminine design patterns in classroom renders. The two groups exhibited strong, opposite correlations between their perceptions of femininity and sense of belonging, plus differences within the femininity scale itself. These findings show the importance of closer examinations of masculinity and femininity as gender constructs, and the need to further study how perceptions of designers differ from the perceptions of users.
Users encounter and react to the built environment as it interacts with all the dimensions of who they are, with different people having potentially contrasting experiences depending on the identities they bring into those spaces. Gender represents one such identity. Gender is distinct from sex, and in this work, we used “gender” to reference the sociocultural identities, roles, behaviors, and expressions of women, men, and gender non-binary people which must be understood in historical and cultural context (Heidari et al., 2016). A user’s gender may be more or less salient to them at any given moment, which could increase or decrease its influence on their experience in an interior environment. The environment plays an active role in this interaction, with the potential to embody and express gendered patterns introduced during design (Cheryan et al., 2009; Nosek et al., 2002).
There remains a need for design investigation that incorporates appropriate interpretations of gender constructs in both the study methodology and in the description of the subjects of research. A key practice area calling for this further study is the design of classrooms, as they interact with most young people during important developmental periods (Fischer & Good, 1994; Streitmatter, 1985; Strickland & Hadjiyanni, 2013). Gender messages embedded in these environments can affect student users, impacting their sense of self, place, and belongingness (Master et al., 2016). The processes by which individual design choices lead to gendered messaging are complex, but the examples of masculine defaults in learning environments are already empirically demonstrated (Cheryan & Markus, 2020; El-Hout et al., 2021). An important, and currently understudied, element in perceptions of designed interiors is the alignment/misalignment of perceptions between the professionals who design the space (designers) and the eventual users of the built environment (students). With the popularity of design paradigms like design thinking that emphasize user empathy, scholarship exploring how perception differences could produce such misconceptions (Köppen & Meinel, 2014) is needed. The purpose of this study was to compare the perceptions of designers versus students of gendered classroom spaces to investigate whether mismatches occur and could give rise to unintended design impacts that could affect a sense of belonging, which is cited as a critical component of academic success (Allen et al., 2021; Moallem, 2013; Walton & Cohen, 2011).
Literature Review
Theoretical Framework
Research in education has increasingly emphasized the importance of school context (e.g., classroom design, instructional practice, class and campus culture, etc.) as an essential component of understanding the effect of educational interventions and performance more generally (Kaplan et al., 2020; Walton & Yeager, 2020). Motivational Climate Theory (Robinson, 2023) provides a lens to understand classroom design as a potential influence on motivational climate and microclimate, or the contextual features that shape motivational beliefs for a class of students (climate) or individuals and subgroups within a class of students (microclimate). For this research, we focused on subgroups, that is, the microclimate, defined by student gender.
Gender in Design
Research on the perspective of developmental patterning suggests that children’s awareness of gender status and discrimination is not fully apparent until the later years of elementary school (Martin & Ruble, 2010). As children mature, so does their conceptualization of gender categories (Vasquez et al., 2023). These gender categories then allow for understandings of gender labels like girl, boy, woman, man, lady, guy, etc. (Zosuls et al., 2009). Social Learning Theory describes gender roles and stereotypes as learned through behavioral imitations (Bandura & Walters, 1977). These are rooted in a gender belief system, and what we consider to be masculine and feminine in society (Lorber, 2001). Baig (2014) found that students begin to perceive their identity through a girl/boy dichotomy shaped by gender stereotypes that teachers may bring to the classroom. This had largely to do with the behavioral expectations teachers had of boys versus girls that created a barrier between genders, as boys began to reject girls as classmates or teammates. In fact, Mac an Ghaill (1994) described schools as “active makers of a range of femininity and masculinities” (p. 9).
With older learners, the marginalizing impacts of gendered environments have been documented in conjunction with attempts to move toward more equitable teaching practices. For example, active learning in undergraduate education has been documented for its wide variety of benefits, particularly for women (Clinton & Wilson, 2019; Freeman et al., 2014; Ralph et al., 2022). However, research has also shown how within these active learning environments there can persist gendered patterns of exclusion that continue to marginalize women (Reinholz et al., 2022) and that the environment can play a role in exerting those marginalizing pressures (Casad et al., 2019; Cheryan et al., 2013). Cheryan et al. (2013) specifically call out the potential role of the environment in identity formation for students with less experience in a given setting, as they search for cues to fill information gaps. In another study by Cheryan et al. (2009), objects like Star Wars posters and video games were strongly associated with stereotypical perceptions of computer science. As a result, those cues led women to avoid those spaces while subsequently reporting them as significantly more masculine.
Analyzing the effect of a professional environment has the potential to be instrumental when looking at performance and gender-friendly space design. While advocates for greater gender equity continue to work toward increases of inclusion throughout education design, architectural designers should recognize Agarwal’s (2018) observation that design will target a dominant group so long as a dominant societal group exists. In this context, students learning within these designed spaces may internalize either inclusive or exclusionary attitudes about themselves and others with regard to their participation in classroom activities.
Students’ identification with their environment evokes the construct of belonging. Student belonging has been linked to undergraduate persistence (Hausmann et al., 2007; Moallem, 2013), academic success (Moallem, 2013; Walton & Cohen, 2011), and improved life outcomes like career satisfaction and community involvement over 10 years later (Brady et al., 2020). A review by Allen et al. (2021) highlights the fragmented treatment of the belonging construct across studies, so we apply their framework for understanding belonging in this investigation. Their framework includes four components of belonging: competencies, opportunities, motivations, and perceptions. We summarize these components in sequence to be a person’s ability to engage in belonging, events in time when they can engage in belonging, their internal desire to engage in belonging, and their awareness and feedback on their attempts to belong. We focus on opportunities and perceptions of belonging, as we investigate students’ expectations of opportunities to experience belonging based on their interpretations of hypothetical experiences in rendered spaces.
Designing for Belonging
A review of “sense of belonging” by Allen et al. (2021) illustrates the potential that belonging can have as a target for interventions that affect students across measures of health and psychological well-being. Definitions of “belonging” often trace back to fundamental work like Goodenow (1993), but Allen et al. (2021) point out an essential oversight in most research to date: the connection to place as another component of the complete belonging construct. Belonging can be understood as a relatively stable trait, but also as a more fluid state that changes between situations and contexts (Ma, 2003; Walton & Cohen, 2011). While some situations are more transient throughout an individual’s life, the repeated exposure of students to learning environments over the duration of a semester or school year suggests a recurrent influence on a student’s sense of belonging as a state that could plausibly shape a more stable sense of belonging as a trait. This potential mechanism of impact may represent a force by which members of historically excluded groups experience marginalization pressure, with research showing that a lack of belonging is more intense for people with membership in historically excluded groups (Walton & Brady, 2017).
In this work, we examine belonging’s relationship to gender in classroom design. Previous studies investigating gender and belonging have found significant influences from environmental factors with undergraduate students in fields with systemic problems of gender representation. For example, Cheryan et al. (2009) showed how environmental changes that eschew stereotypical decor can impact students’ sense of ambient belonging in (and interest in) computer science. Similar issues have been described in other science, technology, engineering, and math (STEM) fields (Lewis et al., 2016; Rainey et al., 2018; Steffens et al., 2010). A previous body of research provides some precedent for examining belonging in a variety of contexts (Allen et al., 2018), but more work is needed on operationalizing masculinity and femininity as objects of measurement in order to understand how differences in user perceptions may inform the way the designed interior shapes sense of belonging.
Study Purpose
Our investigation focused on differences in perceptions of gendered classroom spaces between those who learn in the spaces (undergraduate student users) and those who create those spaces (design professionals). With each sample, we measured participants’ perceptions of femininity and masculinity in multiple space designs using newly constructed scales and tested for associations between the gendered constructs and their sense of belonging. This method allowed us to address three research questions with our findings.
How do perceptions of gender and sense of belonging differ between classroom designs intended to express more or less femininity and masculinity?
Is there a correlation between perceptions of gender in space design and sense of belonging, either overall or in specific gendered subgroups?
Are there differences in the perception of these gender constructs between those who use the designed spaces (students) and those who create the designed spaces (professionals)?
Method
Study Spaces
Each participant was asked to respond to four study spaces, represented as a pair of computer-generated renderings for this research (Figure 1). The authors designed the spaces based on an informal preliminary study that included workshops, focus groups, preliminary surveys, review of tools like Building Without Bias (Rozenberg, 2022), and their own intuitions of gender in classroom space. Space 1 represents a classroom intended to strongly evoke masculinity and the absence of femininity and includes design decisions like the use of a black and white color palette overlaid on angular/linear space features. Space 2 includes wood furniture features and an earth tone palette. Space 3 more heavily incorporated “feminine” elements like color on the walls and seating in addition to distinctly curvilinear tables. Finally, Space 4 was intended to heavily emphasize femininity and minimize masculinity with soft colors, additional windows, the exchange of a secondary chalkboard with internal vision panels, and shifts away from dark woods.

Figure 1. Rendered views into each of the four study spaces.
Instrumentation and Construct Definitions
After an initial information statement, the questionnaire presented images from one of the four possible study spaces chosen at random. Then, 14 single-word prompts were given (in random sequence) for which the participant moved a sliding scale to indicate how much or how little they felt each word associated with the shown space. The same study space images were then shown once more, followed by four Likert-type items comprising the “sense of belonging” scale (themselves in random order). This full sequence was then repeated three more times, so each study space was shown with the full block of items. The last items, questions 74 and 75, generated gender subgroup and age data. The full instrument was reviewed and approved by an Institutional Review Board. Details of the presentation and structure of the items themselves are available in the Supplemental Materials.
We operationalize “femininity” as the complex social understanding of traits and ways of being broadly coded to being or enacting womanhood. Femininity as a construct is extensive and variable across time, space, and culture. Within this study, we do not seek to comprehensively define femininity, but to sample it with the following terms: female, feminine, nurturing, inspiring, and motivating. These terms comprise the femininity scale, which is only intended to sample femininity and not measure it comprehensively. As a final clarification, expressions of femininity are not limited to women—gender non-binary people and men can and do enact expressions of femininity.
We similarly operationalize “masculinity” as an understanding of traits and ways of being coded to manhood. We sample masculinity with the terms male, masculine, and dominant. This scale is smaller because of empirical revisions intended to improve scale independence and empirical reliability, reported below. Again, the masculinity scale only samples masculinity, and any participant may enact elements of masculinity regardless of their gender.
Finally, we operationalize “sense of belonging” as the expectation of a participant to feel at ease, comfortable, and socially connected in the learning environment they were shown. These items evoked considerations of both the physical space as shown and the people and social interactions they would expect to encounter within the space. Belonging has a greater precedent of study, and while we created the questionnaire for this project, we did draw upon existing scholarship for construction guidance (Goodenow, 1993; Knekta et al., 2020; Trujillo & Tanner, 2014; Walton & Cohen, 2007; Yorke, 2016).
Population and Sample
Study 1: Undergraduates
In Study 1, we collected data from undergraduate students at five institutions of higher education across the United States (names of institutions are withheld for anonymity purposes). Several locations only contributed a small number of responses, while two locations contributed the majority represented in our sample. Those two locations represent distinct regions of the United States, although neither is located on a coast. Recruitment was conducted in collaboration with instructors at each cooperating institution by distributing invitations to participate through email and learning management systems. There were no other requirements for participation beyond an academic designation of “undergraduate,” and we had a variety of majors and ages in our sample.
Table 1 shows our sample characteristics for Study 1, with 231 usable responses across all treatments. Our sample comprised mostly women (71% based on the use of “she” pronouns), which is consistent with previously documented patterns of gendered bias in willingness to participate in research (Curtin et al., 2000; Moore & Tarnai, 2002; Smith, 2008). We considered this gender imbalance during analysis when making comparisons between gender subgroups. Participation in the various treatment spaces was balanced through randomization, and further sample details are available in the Supplemental Material.
Table 1. Descriptive statistics for samples in Study 1 and Study 2.
Notes. UG = undergraduates; Pros = professionals.
Study 2: Professionals
In Study 2, we collected data from two design firms in North America, each of which includes substantial professional interior design and architectural studios. Both firms comprised multiple studio locations spread across the United States and Canada. These companies were chosen for convenience, given existing relationships with the researchers. Both companies contributed a similar number of participants to the sample (n = 45 and 43), and we did not collect data on the precise geographic location of individual participants. Recruitment involved the distribution of the researcher-created invitation to participate in a volunteer questionnaire through each company’s human resources personnel.
Our participant criteria listed only that the individual be employed at the company, and we did not collect information on each person’s role or job title during the study. Table 1 also shows our sample characteristics for Study 2, with 297 usable responses across all treatments. Two potentially important differences from the sample in Study 1 were (1) the distribution of participant ages is thoroughly spread from young adulthood to retirement age (≈60 years old) and (2) recruitment included roughly equal numbers of women and men but failed to recruit anyone who identified as gender non-binary (by using only “they” pronouns). Most participants completed the study in English, but roughly a quarter of submissions were in French (see Supplemental Material for multilingual implementation).
Analysis
Our study analysis comprises three parts in each study: instrument evaluation, scale analysis, and subgroup analysis. In our instrument evaluation, we used structural equation modeling to evaluate the empirical measures of our questionnaire’s validity and reliability. Our primary tools for these analyses included the R statistical software v 4.1.1 (R Core Team, 2022) and the lavaan package (Rosseel, 2012) for structural equation modeling in R. We also used dynamic model fit indices based on our instrument’s factor structure, which afforded us more precise judgements of model fit (McNeish & Wolf, 2021; Wolf & McNeish, 2022) (see Supplemental Material).
We analyzed results from our scales by first calculating mean scores for each set of items (femininity, masculinity, and belonging) for responses to each of the four study spaces. We next compared the 95% confidence intervals (estimated with 2.00 standard errors) for each space. We supplemented our visual analysis of confidence intervals by conducting an analysis of variance (ANOVA) and post hoc pairwise comparisons which included both independent 2-group t-tests and Tukey’s Honest Significant Differences between relevant combinations of study spaces to examine potentially significant differences between groups on each scale.
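The interval and ANOVA calculations described above can be illustrated with a minimal sketch. It is written in Python using only the standard library, whereas the study's analyses were run in R, so it should be read as an illustration of the arithmetic and not as the authors' code. The scores and the function names `mean_ci` and `one_way_anova_f` are hypothetical.

```python
import math
from statistics import mean, stdev


def mean_ci(scores, multiplier=2.00):
    """Return (mean, lower, upper) as mean +/- multiplier * standard error."""
    m = mean(scores)
    se = stdev(scores) / math.sqrt(len(scores))  # sample SD / sqrt(n)
    return m, m - multiplier * se, m + multiplier * se


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical mean scale scores (one scale) grouped by study space.
scores_by_space = {
    "Space 1": [31.0, 42.0, 28.0, 36.0, 40.0],
    "Space 2": [38.0, 45.0, 33.0, 47.0, 41.0],
    "Space 3": [52.0, 49.0, 58.0, 44.0, 55.0],
    "Space 4": [60.0, 57.0, 66.0, 51.0, 63.0],
}

for name, scores in scores_by_space.items():
    m, lower, upper = mean_ci(scores)
    print(f"{name}: mean = {m:.1f}, approx. 95% CI = [{lower:.1f}, {upper:.1f}]")

f_stat, df_b, df_w = one_way_anova_f(list(scores_by_space.values()))
print(f"F[{df_b}, {df_w}] = {f_stat:.2f}")
```

The sketch stops at the F statistic. Obtaining a p value or Tukey's Honest Significant Differences requires the F and studentized range distributions, which a statistics package would normally supply.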
Next, we calculated correlations between the three measured scales to understand potential connections between femininity, masculinity, and belonging at the individual level. We used Spearman’s rho because it requires fewer assumptions about the data; in our case, an important consideration is that most of the data for each scale are not normally distributed as determined by a Shapiro-Wilk test of normality (detailed in the Supplemental Materials).
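As a sketch of why Spearman's rho needs fewer distributional assumptions, the coefficient can be computed as a Pearson correlation on ranks, with tied values sharing their average rank. The Python code below is illustrative only; the data and the helper names (`average_ranks`, `pearson`, `spearman_rho`) are hypothetical and are not drawn from the study.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-response scale scores.
femininity = [20.0, 35.0, 35.0, 50.0, 62.0, 80.0]
belonging = [2.0, 2.5, 3.0, 3.0, 4.5, 4.0]
print(f"rho = {spearman_rho(femininity, belonging):.3f}")
```

Because only the ordering of scores enters the calculation, any monotonic relationship yields the same coefficient regardless of how the raw scores are distributed.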
Finally, we looked at the same mean scale scores across gender subgroups (as classified by reported pronouns). In this case, our analysis relied first on data visualization and confidence interval comparisons to identify potentially significant differences. We then compared potentially different groups with a single pairwise hypothesis test (e.g., t-test) to describe the difference between the groups more precisely. Considering the large number of potential group comparisons in this setting, we guarded against Type 1 errors by requiring more rigorous alpha thresholds (e.g., p < .01) and the theoretical plausibility of preliminary results. In alignment with our application of QuantCrit (explained below), we have discussed these considerations explicitly in the results below. While we considered typical thresholds for significance (e.g., p < .05), we also heeded calls for more contextualized, nuanced discussions of significance based on both our QuantCrit framework and movements in statistical practice generally (Ho et al., 2019; McShane et al., 2019; Wasserstein & Lazar, 2016; Wasserstein et al., 2019).
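The subgroup comparisons reported in the results use an independent two-group t-test assuming unequal variance (Welch's test), which is why the reported degrees of freedom can be fractional. The following Python sketch shows that calculation under stated assumptions: the scores and the function name `welch_t` are hypothetical, and the study's own tests were run in R.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    va = variance(a) / len(a)  # squared standard error of group a's mean
    vb = variance(b) / len(b)  # squared standard error of group b's mean
    t_stat = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t_stat, df


# Hypothetical femininity scale scores for two unbalanced subgroups.
group_a = [61.0, 55.0, 70.0, 66.0, 58.0, 73.0, 64.0, 69.0]
group_b = [48.0, 59.0, 41.0, 62.0]

t_stat, df = welch_t(group_a, group_b)
print(f"t[{df:.2f}] = {t_stat:.2f}")
# A p value would come from the t distribution with df degrees of freedom,
# e.g., scipy.stats.ttest_ind(group_a, group_b, equal_var=False) if SciPy
# is available (an optional dependency, not used in this sketch).
```

With unbalanced groups such as these, the Welch degrees of freedom fall below the pooled value of n1 + n2 - 2, which makes the test more conservative when the smaller group is also the more variable one.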
Application of QuantCrit
QuantCrit is a theory of quantitative method formalized as an application of Critical Race Theory (Garcia et al., 2018; Ladson-Billings & Tate, 2016). Critical quantitative methods predate QuantCrit’s formalization, but this framework provides explicit language and practical guidelines for using quantitative methods in the pursuit of social justice. A key feature of QuantCrit is its proposal for five tenets of practice, which we have adapted in our work:
Marginalization is Central—maintain a focus on the mechanisms of oppression you seek to interrupt.
Numbers are not Neutral—the act of quantification involves choices made by humans and always involves decisions made by people.
Categories are Human Constructions—categorization is also a human process and does not represent “inherent” or “natural” traits of the subjects.
Voice and Perspective Matter—data do not “speak for themselves” and the act of interpretation should include consideration of the perspective of those affected by the research/phenomenon.
Statistical Analyses have no Inherent Value—discovering statistical results is not inherently valuable, but should be a tool for improving the circumstances of people (Garcia et al., 2018).
Our application of QuantCrit focused on the assertion that categories are human constructions, in this case, the definition of and method of categorization for subgroups of gender. Critiques of psychology research have appropriately criticized problematic approaches to measurement of gender (Cameron & Stinson, 2019; Lindqvist et al., 2021). In our adaptation of the tenets of QuantCrit, we strove to build a measurement approach for gender both useful to our research purpose and consistent with our understanding of gender constructs. We recognize the diversity of gender identities held by potential participants, and consequently reject any measurement approach that assumes gender to be a binary construct (i.e., only men and women).
We also discard the assumption that femininity and masculinity necessarily operate as mutually exclusive, antagonistic constructs. Instead, we have constructed a measurement approach for femininity and masculinity in classroom spaces capable of generating data independently of one another. Through this measurement approach, we can observe correlations where they exist based on participant responses, without assuming or inserting them into the measurement process.
Finally, we constructed our gender categorization item in a way that allows subgroup analysis without invoking inappropriate or unnecessary facets of gender, or distinct but related constructs, from participants (Lindqvist et al., 2021). We asked for participants to identify their pronouns, as it is an increasingly common behavior with relevance to our considerations of gender that reduces description errors (such as misunderstanding gender/sex) while providing interpretable subcategories.
Positionality
We are an author team of multiple gender identities with experience in classrooms as learners, teachers, designers, and architects. We come to this work through a mutual commitment to improving classroom design and design research to make the classroom experience more inclusive for learners of all genders, considered within an intersectional theory of the human experience.
Results
Study 1
Instrumentation Evaluation
We conducted an exploratory factor analysis (EFA) and judged empirical support for three factors. Our prior theoretical structure predicted four factors, but the EFA results indicated two of our expected scales were in fact behaving as one with strong factor loading on a common factor. In response, we revised our model structure to eliminate redundant items in a three-scale model and conducted a confirmatory factor analysis (CFA) to evaluate the model fit of our revised scale. Using Dynamic Fit Indices, we adjudged our full model to be approaching acceptable fit (Table 2). Despite the full model requiring further revision, the individual scales produced indications of good reliability coefficients: ω = .85–.92 (McDonald, 1999; McNeish, 2018). EFA, model fit, and scale reliability details are available in the Supplemental Materials.
Table 2. Measures of the CFA model fit for Study 1 using dynamic fit indices.
Notes. SRMR = standardized root mean square residual; RMSEA = root mean square error of approximation; CFI = comparative fit index.
Meets “acceptable” threshold.
Meets “ideal” threshold.
Scale Analysis
In the undergraduate sample, we found the femininity and masculinity scales behaved as we expected based on our design intention; masculinity gradually fell across spaces while femininity gradually rose (Figure 2). We conducted an analysis of variance (ANOVA) comparison between studied renders that showed significant differences for both gendered scales (femininity: F[3, 226] = 6.00, p < .001, masculinity: F[3, 222] = 11.02, p < .001). Subsequent direct comparisons of treatment confidence intervals (see Figure 2) revealed the empirically significant differences were between non-adjacent spaces (e.g., Space 1 and 3, Space 1 and 4), which included more substantial design differences than adjacent spaces. However, we did not observe differences in sense of belonging among these spaces (F[3, 225] = 1.30, p = .277) (Figure 3).

Figure 2. Scale scores for each study space (4) on each study scale (3) in Study 1.

Figure 3. Pairwise comparison of scales between treatment spaces (Study 1).
These findings show the gendered scales are sensitive to differences between spaces broadly, but they are not currently able to resolve distinctions between the most similar spaces (i.e., adjacent spaces in this design). We also note Space 1 produced the lowest mean score on the belonging scale overall, but this observation requires further empirical study given the ANOVA results above.
We next examined the correlations between these three variables of interest across individual responses (Figure 4). We measured a strong, positive Spearman correlation between the Feminine scale and the Belonging scale (ρ = .471, p < .001). We also measured a relatively small but significant correlation between the Feminine scale and the Masculine scale (ρ = .174, p = .009). However, we did not find any meaningful correlation between masculinity and belonging (ρ = .020, p = .768).

Figure 4. Overall scale Spearman correlations in Study 1.
Subgroup Analysis
We analyzed the Study 1 results by gender subgroup, defined by whether the respondent reported use of she/they/he pronouns. These groups were unbalanced with the majority of respondents being women. Most scale scores did not present significant differences between gender subgroups, with two potential exceptions. In Space 3, women reported a much higher Femininity scale score compared to men and gender non-binary students (who were equivalent to one another). While a post hoc 2-group (men vs. women) independent t-test assuming unequal variance returns a conventionally significant difference for this result (t[15.75] = 2.16, p = .047), we hesitate to interpret the result any further given the repeated comparisons and potential for this finding to be a false positive. The second potentially interesting subgroup finding is the response of gender non-binary students to Space 1. In this case, they reported a potentially lower sense of belonging, lower perception of femininity, and higher perception of masculinity. These could be taken together to suggest a heightened sense of masculinity as a marker of patriarchal power dynamics which could impact a sense of belonging. We share this pattern as an opportunity for future research, not as a definitive statistical finding in this study (the subgroup size is far too small to make confident claims about the stability or potential significance of this observation) (see Supplemental Material for subgroup comparisons).
Study 2: Professionals
Instrumentation Evaluation
We began with a confirmatory factor analysis (CFA) based on the scale structure from Study 1. In the femininity scale, we noted that two of the items have substantial drops in their factor loading (Inspiring: .898 to .370, Motivating: .786 to .416). These drops strongly suggest the items do not connect with the professional participants’ concept of femininity in the way they did for the undergraduate participants. Similarly, we identified that one of the items in the masculinity scale also dropped to a level that suggests it is no longer connected to the same idea of masculinity (Dominant: .610 to .410). As a final note, the model returned a nonsensically high value for one item in the masculinity scale (“masculine”), which should also be considered by researchers who consider adapting and revising these scales for use in professional samples in the future.
The reliability coefficients for each scale showed a similar drop: ω = .77 to .86. While lower, these omega values still surpass the conventional ω > .70 threshold for acceptability. Recognizing the tension between some highly suspect item alignment and acceptable overall reliability, we will interpret results from these scales but consider their need for revision in our analysis.
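The tension between weak individual loadings and acceptable overall reliability can be illustrated with the usual formula for coefficient omega in a single-factor model with standardized loadings and uncorrelated residuals: the squared sum of loadings divided by that quantity plus the summed residual variances. The Python sketch below assumes that simplified model, and the loadings are hypothetical, not the study's estimates; the authors' computation (reported in their Supplemental Materials) may differ.

```python
def omega_total(loadings):
    """Coefficient omega for one factor, assuming standardized loadings
    and uncorrelated residuals (residual variance = 1 - loading**2)."""
    common = sum(loadings) ** 2
    residual = sum(1 - lam ** 2 for lam in loadings)
    return common / (common + residual)


# Hypothetical five-item scale: all items load strongly.
strong_items = [0.90, 0.90, 0.80, 0.85, 0.80]
# Hypothetical version where two items load weakly on the factor.
mixed_items = [0.90, 0.90, 0.80, 0.37, 0.42]

print(f"omega (strong items): {omega_total(strong_items):.2f}")
print(f"omega (mixed items):  {omega_total(mixed_items):.2f}")
```

Under these assumptions, a scale can stay above the conventional .70 threshold even when some items load weakly, because the strongly loading items dominate the sum. This is one reason item-level loadings and scale-level reliability are worth inspecting together.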
Scale Analysis
In Study 2, we saw the scale behavior across treatment groups become less differentiated (Figure 5). Sensitivity to femininity statistically disappeared, with only a suggestive distinction between Spaces 3 and 4 that fails to reach a threshold for significance on its own. Masculinity remained closer to the trend we observed in Study 1, with a shift from a gradual change across spaces to a more definitive drop in masculinity between Spaces 2 and 3. Finally, the sense of belonging scale included more variability between spaces than in Study 1 and was statistically significant (F[3, 293] = 3.79, p = .011). The pairwise comparison (Figure 6) showed the result came primarily from a lower score in Space 2 compared to the other spaces. The inadequacy of theoretical explanation for this result (outside of random variability in responses) and the potential for a Type 1 error (false positives) given the large number of pairwise comparisons in this design lead us to approach this result with caution. We withhold interpretation until further study replicates this finding.
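The concern about false positives across many pairwise comparisons can be made concrete with a short calculation. The Python sketch below counts the space pairs in a four-space design and computes the chance of at least one false positive across all comparisons. It assumes the comparisons are independent and that every pair is tested on each of the three scales; both are simplifying assumptions for illustration, not a description of the authors' procedure.

```python
from itertools import combinations

spaces = [1, 2, 3, 4]
scales = ["femininity", "masculinity", "belonging"]

# Every unordered pair of study spaces, e.g., (1, 2), (1, 3), ...
space_pairs = list(combinations(spaces, 2))
n_comparisons = len(space_pairs) * len(scales)


def familywise_error_rate(alpha, m):
    """Probability of at least one false positive in m independent tests."""
    return 1 - (1 - alpha) ** m


print(f"pairs per scale: {len(space_pairs)}, total comparisons: {n_comparisons}")
for alpha in (0.05, 0.01):
    fwer = familywise_error_rate(alpha, n_comparisons)
    print(f"alpha = {alpha}: familywise error rate = {fwer:.3f}")
```

Under these assumptions, tightening the per-test threshold from .05 to .01 lowers the familywise rate substantially, which is consistent with the more rigorous alpha thresholds described in the Analysis section.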

Figure 5. Scale scores for each of the four study spaces on each of the three study scales in Study 2.

Figure 6. Pairwise comparison of scales between treatment spaces (Study 2).
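The caution about false positives can be made concrete with a short calculation. The sketch below counts the pairwise comparisons implied by four spaces and three scales and estimates the chance of at least one false positive. It assumes every pair of spaces is compared on every scale, that the tests are independent, and that α = .05; none of these details is confirmed by the text, and the Bonferroni threshold is shown only as one common correction, not as the procedure used in the study.

```python
from math import comb


def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that holds the family-wise rate at or below alpha."""
    return alpha / n_tests


n_spaces = 4  # four classroom renders
n_scales = 3  # femininity, masculinity, belonging

pairs_per_scale = comb(n_spaces, 2)       # unordered pairs of spaces
total_pairs = pairs_per_scale * n_scales  # across all three scales

for n_tests in (pairs_per_scale, total_pairs):
    rate = familywise_error_rate(n_tests)
    threshold = bonferroni_alpha(n_tests)
    print(f"{n_tests} tests: family-wise rate {rate:.3f}, "
          f"Bonferroni threshold {threshold:.4f}")
```

Under these assumptions, the probability of at least one spurious pairwise difference grows quickly with the number of comparisons, which is why a single unexpected contrast warrants replication before interpretation.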
When comparing relationships between scales, we discovered the strongest relationship we had observed in the undergraduate sample had both strengthened and reversed direction in the professional sample. The Spearman correlation between the feminine and belonging scales in Study 2 was strongly negative (ρ = −.528, p < .001). The other relationships between variables were comparable to Study 1 (Figure 7). This large shift in the relationship between femininity and belonging is consistent with the issues in the model fit that suggest the feminine scale is sampling a meaningfully different femininity construct with this population compared to Study 1.

Figure 7. Overall scale Spearman correlations in Study 2.
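For readers less familiar with the statistic, a Spearman correlation is a Pearson correlation computed on ranks rather than raw scores. The following is a minimal sketch using only the standard library. The scores are synthetic values invented for illustration and are not study data, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Synthetic scale scores for five imaginary respondents (not study data).
femininity = [1.0, 2.0, 3.0, 4.0, 5.0]
belonging = [4.5, 4.0, 4.2, 2.5, 2.0]

print(round(spearman_rho(femininity, belonging), 3))
```

Because the statistic depends only on rank order, a negative value indicates that respondents who rated a space as more feminine tended to report lower belonging, regardless of the exact spacing of the scores.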
Subgroup Analysis
In Study 2, we again compared scale results between gender subgroups, but only had participants identifying as women and men, so there are no results for gender non-binary professionals. Results uniformly indicated there were no significant differences between women and men on any scale in any study space (see Supplemental Material).
Discussion
Our first research goal was to measure differences in the perception of gender constructs between classroom designs. Our findings were closely consistent with our expectations about the behavior of both the femininity and masculinity scales across spaces intended to evoke either construct. However, the lack of changes in the users’ sense of belonging in these spaces was not consistent with our expectations. The movement on the other scales provides some assurance that the treatment had an impact. Both the strong reliability measures and the alignment with precedent studies (Allen et al., 2018) give us confidence in the belonging scale. This leaves us with a working explanation that the manipulations in classroom design alone may not be adequate to elicit measurable changes in a relatively stable construct like belonging, and that it is only when these designs interact with experiences of instruction that their effects become more significant. This explanation is supported by a recent review of belonging research that similarly reported a multi-process mechanism for producing, and thus impacting, a stable, trait-like sense of belonging (Allen et al., 2021).
Taken broadly across both studies, the differences between women and men were not significant for most scale scores. This indicates that respondent perception is not consistently linked to the gender of the respondent. We identified two potentially interesting individual findings from the undergraduate sample that diverge from this overall trend. The first was a single, particularly large difference in the perception of femininity in Space 3 by undergraduates who identified as women. This finding may have theoretical implications; for example, spaces with stronger representations of femininity and masculinity together may elicit greater differences in perceptions among users of different gender identities than other spaces do. We also noticed a suggestive trend in how gender non-binary students perceived the most masculine space, Space 1. Both findings are preliminary given the large number of comparisons among subgroups, and we advise the reader to interpret them with caution until they are replicated in future work.
Viewing these findings through the lens of Motivational Climate Theory, we believe our results inform an understanding of both the motivational microclimate and the larger motivational climate (Robinson, 2023). Our data suggest the individual spaces are indeed viewed differently by different subgroups (undergraduate users vs. professionals). However, there were few differences in the perceptions of respondents between gender subgroups. The lack of differences in most measures between genders may suggest a substantial role at the group level (classroom or school) for perceptions of the classroom space, which Motivational Climate Theory would call the “Motivational Climate.” The substantial number of findings, like those from Cheryan et al. (2013), indicating measurable impacts of environment on identity formation leads us to believe the influence of environmental design may primarily occur at longer timescales. The longer time for influence would also be consistent with findings from Allen et al. (2021) showing the primary role of the trait-like sense of belonging.
Our second goal was to measure the associations between the femininity, masculinity, and belonging scales. A theoretical aim of the study design was to create a method of measurement that did not assume femininity and masculinity were opposites by default, and our results showed that in both populations the two scales indeed did not exhibit a negative correlation for individuals. If a respondent in either group produced a higher score for femininity, they were more likely to also generate a higher score for masculinity. This finding suggests the perception of any gendered design characteristics increases the salience of gender, which is detectable in each of the two scales.
The third research goal was to examine the relationship between gendered perceptions and an expected sense of belonging between distinct populations: users (undergraduates) and designers (professionals). The key finding from the two studies used together to address this goal was a disparity in the behavior of the femininity scale between undergraduates and design professionals. We observed fundamentally different boundaries around which terms were interpreted with a gender connection (items that loaded to the feminine scale in the undergraduate sample loaded poorly to the scale with professionals). We also observed strongly opposing behavior of the femininity scale between the two samples in association with their expectations of belonging. There are multiple plausible explanations for the difference. We believe the stronger potential explanation is the shared experience in the design industry, through training and practice, of the professional sample compared to the more substantial variation in lived experience for the undergraduates. Architecture has long been a profession dominated by men (Caven et al., 2016; De Graft-Johnson & Manley, 2019; Milwid, 1982; Schnabel, 1993), and socialization in the profession that reproduces a male-as-normal paradigm could explain variations in perceptions of femininity in spaces (Campuzano, 2019; Cheryan & Markus, 2020; Seron et al., 2016). An alternative explanation is the age distribution of our samples and the resulting lived experience attributable to generational trends. These two explanations could be readily distinguished through future studies looking at additional samples among design students or more general samples of professional populations.
Project teams should consider how femininity may be perceived by their team members and future users throughout the design process. Professional interior designers and architects are the people creating the learning environments. Without robust engagement with student users, project schedules that facilitate co-design, and an internal culture of reflexivity (Köppen & Meinel, 2014), it is easy to imagine negative perceptions of feminine design strategies could lead to interior design projects that reproduce masculine defaults in the classroom environment. Viewed through the lens of Motivational Climate Theory, interior design that eschews feminine design may consequently result in an educational microclimate that undermines student motivation (Cheryan et al., 2013; Robinson, 2023). That in turn underscores the importance of user voices—students—in the design process. There are many benefits to having representation of many genders on a design team, but that representation does not replace the need for student voices throughout the design process.
The differences in the internal behavior of the femininity scale between the samples suggest the construct itself is understood differently by the two groups, which may be the result of age, design training, life experience, or any combination of these and other factors. Despite the varied relationship with femininity in each group, there was a functionally identical relationship with masculinity. The presence of significant correlations in every other pairing raises interesting questions about the theoretical mechanism by which femininity has an independent connection with each of the other two constructs measured. This theorizing is beyond the scope of this study but will have implications for future refinement of the scales.
Limitations and Implications
The findings from this study imply that gendered messaging is perceived differently by people at different developmental stages, specifically undergraduates and practicing professionals. Future studies will need to further elucidate the behavior of these gender constructs with different age groups; we would be particularly interested in results that describe a potential window of change for these constructs among graduate students. There is also the question of how the general population scores on these scales, and whether those population parameters are similar to either of our samples. Plausible narratives exist to explain why general scores may be (1) similar to the undergraduate associations (undergraduates are adults and their responses generalize to all adults), (2) similar to the professional associations (the age distribution of the professional sample is more representative of the age of the general population), or (3) different from both our samples (both undergraduates and professionals have relevant selection criteria that produce an experience different from that of most adults). Research from Coppock et al. (2018) showed effects in convenience samples replicate in representative samples, but the theoretical potential for differences, combined with our observed change in directionality for some associations, leaves this an important unanswered question from our study.
We created the questionnaire used to generate data for this work, and it remains in need of further development. Our instrumentation shows promise, with strong internal scale reliability and plausible variability in response to different space designs. Our findings support a theoretical re-examination of femininity, especially regarding the empirically driven addition of “inspiring” and “motivating” to the study scales. This future research may include qualitative work to understand why these terms connect with femininity, which may reveal more precisely aligned terms to replace them that could improve scale validity/reliability. We can imagine these terms may code feminine with undergraduates as they overlap with the importance of mentorship for undergraduates (McKinsey, 2016).
Despite these areas in need of additional work, the current scales’ measures of reliability are encouraging, and the consistency of the empirical findings with theoretical expectations is further evidence of the tool’s viability for research use. However, the measures of model fit fail to reach typical standards for instrumentation that can be used at scale or for comparison between studies. That refinement will require resolution of the feminine construct in undergraduate populations, potentially increasing the number of scale items for both the masculinity and femininity scales to increase reliability, and additional work with a multilingual team to translate the instrumentation strategy to languages other than English. This kind of future measurement work should include the collection of convergent evidence, which may be done relative to constructs like loneliness or social fit.
Finally, the space renderings used for comparison were categorically different. The methods were not created to describe the potential interaction between specific design parameters, and future research may illustrate the role and relative importance of different architectural elements within designed spaces. This work would be especially valuable when combined with study of particular design languages (e.g., biophilic design) to understand how certain design strategies may impact the perceived gender messaging of classroom spaces. It should also be noted that participants most likely used their own devices to complete the online questionnaire, which may have introduced variations in viewing conditions, such as image display and color rendering, that could have influenced perceptions.
Conclusions
We have shared a quantitative study of gendered perceptions of classroom spaces along with data to demonstrate a measurement strategy aligned to our understanding of gender. Our two survey studies show perceptions of gender differed measurably across the classroom designs, and that perceptions of femininity and masculinity were positively correlated with one another in both samples. We also found undergraduates exhibited a strong positive correlation between perceived femininity in the design and their sense of belonging in the designed spaces. Among professionals, that correlation was even stronger but ran in the reverse direction (a negative correlation).
For designers and architects, these results underscore the significance of studying the potentially gendered impact of design elements professionals may not immediately recognize as gendered. These considerations illustrate the importance of user engagement throughout the design process, especially when user input may conflict with the personal perceptions or preferences of the designers themselves.
Footnotes
Acknowledgements
The authors acknowledge first the presence of the COVID-19 pandemic and societal disruptions occurring while we conducted this study. We appreciate the contributions of Tia Madkins, Katrina Rothrock, Dolores Greenawalt, Ricardo Millhouse, Lali DeRosier, and Bobby Nichols throughout our data collection process. We thank Multistudio team members David Reid, Lauren Maass, Anesu Dhliwayo, Sam Church, and NOMA fellow Linh Danh. We thank Andrew King of Fieldwork by Lemay. We thank Bruce Frey from the University of Kansas, with special thanks to Madison Joan Clark of the KU Institutional Review Board. We extend special gratitude to Sam Long, River Suh, and Lewis M-T Steller of Gender Inclusive Biology. Thank you also to Molly McVey and Kaitlin Salanski for their support and input. Finally, we gratefully recognize this work was only possible through the collaboration of Multistudio, Lemay, and the University of Kansas and the contributions of our full author team in their various capacities with the collaborating institutions.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
